The first models in the new Llama 4 herd, Llama 4 Scout 17B and Llama 4 Maverick 17B, are now available on AWS. You can access Llama 4 models in Amazon SageMaker JumpStart. These advanced multimodal models let you build more tailored applications that respond to multiple types of media. Llama 4 offers improved performance at lower cost compared to Llama 3, with expanded language support for global applications. Built on a mixture-of-experts (MoE) architecture, these models deliver efficient multimodal processing of text and image inputs, improved compute efficiency, and enhanced AI safety measures.
According to Meta, the smaller Llama 4 Scout 17B model is the best multimodal model in the world in its class and is more powerful than Meta's Llama 3 models. Scout is a general-purpose model with 17 billion active parameters, 16 experts, and 109 billion total parameters that delivers state-of-the-art performance for its class. Scout significantly increases the context length from 128K in Llama 3 to an industry-leading 10 million tokens. This opens up a world of possibilities, including multi-document summarization, parsing extensive user activity for personalized tasks, and reasoning over vast code bases. Llama 4 Maverick 17B is a general-purpose model that comes in both quantized (FP8) and non-quantized (BF16) versions, featuring 128 experts, 400 billion total parameters, and a 1 million token context length. It excels in image and text understanding across 12 languages, making it suitable for versatile assistant and chat applications.
Meta’s Llama 4 models are available in Amazon SageMaker JumpStart in the US East (N. Virginia) AWS Region. To learn more, read the launch blog and technical blog. These models can be accessed in Amazon SageMaker Studio.
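Beyond the Studio UI, you can also deploy JumpStart models programmatically with the SageMaker Python SDK. The following is a minimal sketch of deploying Llama 4 Scout and sending a text prompt; the model_id and the payload schema shown here are assumptions based on the common JumpStart text-generation format, so confirm the exact values in the SageMaker JumpStart model catalog before running.

```python
# Minimal sketch: deploy a Llama 4 model from SageMaker JumpStart and run a
# text prompt. The model_id below is an assumption -- look up the exact ID in
# the SageMaker JumpStart catalog in SageMaker Studio before running.
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(
    model_id="meta-vlm-llama-4-scout-17b-16e-instruct",  # assumed ID; verify in JumpStart
)

# Deploying Meta models requires accepting the end user license agreement (EULA).
predictor = model.deploy(accept_eula=True)

# Simple text-only request; the payload keys follow the common JumpStart
# text-generation schema and may differ for this model.
payload = {
    "inputs": "Summarize the key differences between Llama 3 and Llama 4.",
    "parameters": {"max_new_tokens": 256, "temperature": 0.6},
}
response = predictor.predict(payload)
print(response)

# Delete the endpoint when finished to avoid ongoing charges.
predictor.delete_endpoint()
```

The same JumpStartModel call deploys to a default instance type defined in the model's JumpStart configuration; pass instance_type explicitly if you need a specific accelerator.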