[Pinned · Published in AWS in Plain English · Nov 16, 2023] Serverless compute for LLM — with a step-by-step guide for hosting Mistral 7B on AWS Lambda. One of the challenges with using LLM in production is finding the right way to host the models in the cloud. GPUs are expensive so hosting…
[Pinned · Published in AWS in Plain English · Oct 17, 2023] Guide for running Llama 2 using LLAMA.CPP on AWS Fargate. Step-by-step guide for deploying a Llama 2 model to AWS using LLAMA.CPP as the framework, Fargate for hardware, and Copilot for deployment.
[Pinned · Published in AWS Tip · Nov 29, 2022] How AWS Lambda SnapStart eliminates cold starts for Serverless Machine Learning Inference. How AWS Lambda SnapStart removes cold starts for serverless machine learning inference by loading the model into the snapshot.
[Published in AWS in Plain English · 1d ago] Speeding Up Large Language Model (LLM) Inference with AWS Lambda SnapStart. This week AWS announced AWS Lambda SnapStart for Python and .NET functions. That’s a big deal. Why? One of the main challenges with…
[Published in AWS in Plain English · Sep 25] Top 20 AWS re:Invent 2024 breakout sessions to stay on top of everything related to generative AI. Generative AI continues to transform industries, making it crucial for businesses to stay updated on the latest advancements and best…
[Published in AWS in Plain English · Aug 7, 2023] Making LLMs Scalable: Cloud Inference with AWS Fargate and Copilot. In this blog post, I’ll take you through a step-by-step guide on how to get LLMs up and running in the cloud with AWS Fargate and Copilot.
[Published in AWS Tip · May 22, 2023] Scalable Cloud inference endpoint using ONNX and AWS Fargate. As a Machine Learning Engineer, I often find myself deploying models to the cloud. It’s an essential part of getting our models out of the…
[Nov 14, 2022] ML No/Low Code services and Use Cases @ Re:Invent 2022. Machine Learning is becoming essential for a lot of industries, but setting ML projects up for success is challenging and requires business…
[Nov 8, 2022] 12 MLOps breakout sessions I’m looking forward to at Re:Invent 2022. MLOps is a field of best practices for companies to run ML workflows in production. It encompasses a large stack of tasks from optimizing…
[Nov 26, 2021] 7 things to know before using AWS Panorama. Machine learning is becoming essential for a lot of companies, and they want to use it to optimize their operations and make new services…