Machine Learning Inference on AWS Lambda Functions powered by AWS Graviton2 Processors

  • Use cases for serverless machine learning inference
  • How Graviton2 makes serverless both faster and cheaper for making predictions
  • Benchmarks, plus a link to the repository with code and libraries so you can try them with your own models

Serverless for machine learning

To deploy a model in production, we need to balance multiple requirements: time, cost, and scale. It is usually hard to satisfy all three, so you may need to prioritize some of them based on context. For example, a GPU cluster will provide the fastest predictions, but at the same time it can be expensive, since you pay for idle time, and hard to scale in case you have peak loads.

Serverless inference with AWS Lambda covers many of these cases well, since it scales automatically and you pay only for the compute time you use. It may not be the best fit, however, when:

  • You need to optimize response time as much as possible
  • You have a model which requires a GPU or more RAM than Lambda provides
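For everything else, serving a Scikit-Learn model from Lambda is straightforward: load the model once at cold start and reuse it across invocations. A minimal sketch of such a handler is below; note that in a real deployment you would load a serialized model from the deployment package or S3 rather than training a stand-in model in the module, and the event shape is an assumption.

```python
import json

from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Stand-in for a real model: in practice you would ship a serialized model
# with the deployment package and load it here, e.g. joblib.load("model.joblib").
# Module-level code runs once per cold start, so the model is reused across
# invocations of the same Lambda instance.
X, y = make_classification(n_samples=512, n_features=20, random_state=0)
MODEL = SVC().fit(X, y)


def handler(event, context):
    # Assumed event shape: {"features": [[f1, f2, ...], ...]}
    features = event["features"]
    predictions = MODEL.predict(features).tolist()
    return {"statusCode": 200, "body": json.dumps({"predictions": predictions})}
```

The same handler code runs unchanged on both x86-based and Graviton2 Lambda, as long as the packaged libraries are built for the matching architecture.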

Graviton2 AWS Lambda update

The AWS Graviton processor is custom-built by AWS and uses 64-bit Arm Neoverse cores. AWS Graviton2 provides a significant performance boost compared to the x86 architecture. With the announcement of Graviton2 availability on AWS Lambda, more use cases become serverless-friendly, as jobs can be both faster and cheaper: AWS reports up to 34% better price-performance and a 20% lower price per GB-second.
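Opting into Graviton2 is a matter of setting the function's architecture to arm64. A minimal AWS SAM template fragment might look like the following (the resource name and handler path are placeholders, and the memory and timeout values match the benchmark configuration used below):

```yaml
Resources:
  InferenceFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.9
      MemorySize: 10240   # 10 GB RAM
      Timeout: 300        # 5-minute limit
      Architectures:
        - arm64           # Graviton2; x86_64 is the default
```

Keep in mind that any compiled dependencies (such as NumPy or Scikit-Learn wheels) must be built for arm64 when deploying to a Graviton2 function.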

Benchmarks and links to the repo

We will use the following settings for the Lambda ML benchmark:

  • Framework: Scikit-Learn
  • Data: Binary classification dataset generated synthetically
  • Model: SVM classifier
  • Train: 512 samples, with a varying number of features
  • Test: 1024 samples, with a varying number of features
  • Number of cycles: 40
  • Lambda Configuration: 10GB RAM, 5-minute limit
The results were as follows:

  • Training on average takes the same or less time on Graviton2 as on x86-based Lambda.
  • Inference takes more time on Graviton2 Lambda when the number of features is small, and becomes faster than x86-based Lambda as the number of features grows.
  • For complex operations, Graviton2 beats x86-based Lambda by being both faster and cheaper.
  • Depending on your case, either architecture may come out ahead, so the best option is to run your workflow on both and check which one performs better. To check Lambda performance on your own workload, feel free to use the libraries from the repo: https://github.com/ryfeus/lambda-packs
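The benchmark loop described above can be sketched as follows. This is a simplified stand-in, not the exact code from the repo: the dataset parameters are assumptions, and the cycle count is reduced here so the sketch runs quickly.

```python
import time

import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

N_FEATURES = 64  # varied across benchmark runs in the article's setup
N_CYCLES = 5     # the full benchmark uses 40 cycles

# 512 training samples + 1024 test samples, as in the benchmark settings.
X, y = make_classification(n_samples=512 + 1024, n_features=N_FEATURES,
                           random_state=0)
X_train, y_train = X[:512], y[:512]
X_test = X[512:]

train_times, infer_times = [], []
for _ in range(N_CYCLES):
    # Time one training pass.
    start = time.perf_counter()
    model = SVC().fit(X_train, y_train)
    train_times.append(time.perf_counter() - start)

    # Time one inference pass over the test set.
    start = time.perf_counter()
    model.predict(X_test)
    infer_times.append(time.perf_counter() - start)

print(f"mean train: {np.mean(train_times):.4f}s  "
      f"mean inference: {np.mean(infer_times):.4f}s")
```

Running the same script on an x86-based and an arm64 Lambda function (with matching memory settings) gives a like-for-like comparison of the two architectures for your own model.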

Conclusion

We’ve looked into the use cases for machine learning inference on serverless, the Graviton2 update to AWS Lambda, and how it can be used for machine learning. Finally, we compared x86-based and Graviton2 Lambda for training machine learning models and making predictions.

--
I'm a senior machine learning engineer at Instrumental, where I work on analytical models for the manufacturing industry, and AWS Machine Learning Hero.
