
Meta Introduces Llama 4 Scout, Llama 4 Maverick AI Models: Everything to Know

Meta has introduced two new Llama 4 AI models: Llama 4 Scout and Llama 4 Maverick.

Meta has introduced its latest Llama 4 AI models to rival the newest offerings from OpenAI and Google. Llama 4 Scout, a model with 17 billion active parameters and 16 experts, is claimed to be "the best multimodal model in the world in its class and is more powerful than all previous generation Llama models".

Meta has introduced Llama 4 Scout and Llama 4 Maverick, its first open-weight natively multimodal models with support for very long context windows, and the company's first built on a mixture-of-experts (MoE) architecture. It is also previewing Llama 4 Behemoth, which Meta describes as one of the smartest LLMs in the world and its most powerful yet, intended to serve as a teacher for the new models.

Meta says that Llama 4 Scout offers an industry-leading context window of 10 million tokens and delivers better results than Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1 across a broad range of widely reported benchmarks. Llama 4 Maverick is a 17 billion active parameter model with 128 experts. Scout fits on a single H100 GPU (with Int4 quantisation), while Maverick fits on a single H100 host.

Llama 4 Behemoth is touted to have outperformed GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM-focused benchmarks such as MATH-500 and GPQA Diamond. However, Meta isn't releasing Behemoth just yet, as the model is still in training.


Llama 4 models are built with native multimodal capabilities, using early fusion to combine text and visual inputs within a single unified architecture. This early fusion approach is a significant step forward, as it allows the model to be pre-trained on massive amounts of unlabelled text, image, and video data all at once. Meta has also upgraded the vision encoder in Llama 4. While it is based on MetaCLIP, the company trained it in conjunction with a frozen Llama model, helping it better align with the language model's needs.
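To make the idea of early fusion concrete, here is a minimal illustrative sketch (not Meta's actual implementation) of how text token embeddings and image patch embeddings can be merged into one sequence that a single transformer then attends over; the function name and shapes are assumptions for illustration:

```python
import numpy as np

def early_fusion_sequence(text_emb, image_patch_emb):
    """Merge image patch embeddings and text token embeddings into one
    sequence of the same model dimension, so a single unified transformer
    can attend across both modalities from the very first layer.

    text_emb:        (n_text_tokens, d_model)
    image_patch_emb: (n_image_patches, d_model)  -- output of a vision encoder
    """
    # Both modalities must already be projected into the same embedding space.
    assert text_emb.shape[1] == image_patch_emb.shape[1]
    # Image patches are placed first here; real systems interleave tokens
    # according to where the image appears in the prompt.
    return np.concatenate([image_patch_emb, text_emb], axis=0)
```

The key point is that fusion happens at the input, before any transformer layers, rather than bolting a vision model onto a finished language model late in the pipeline.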

For Llama 4, Meta adopted a mixture-of-experts (MoE) architecture, a setup that saves computing power by activating only the specific parts of the model needed for each token. Llama 4 Behemoth is also a multimodal mixture-of-experts model, with 288 billion active parameters, 16 experts, and nearly two trillion total parameters.
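The routing idea behind MoE can be sketched in a few lines. The following is a simplified toy example (not Llama 4's actual routing code) in which a learned gate scores each expert per token, and only the top-k experts run, so most parameters stay idle on any given token; all names and shapes here are illustrative assumptions:

```python
import numpy as np

def moe_forward(x, gate_w, experts_w, top_k=1):
    """Toy mixture-of-experts layer: route each token to its top-k experts.

    x:         (n_tokens, d_model) input activations
    gate_w:    (d_model, n_experts) router weights
    experts_w: list of (d_model, d_model) expert weight matrices
    """
    logits = x @ gate_w                    # router score for every expert
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        top = np.argsort(logits[t])[-top_k:]   # indices of chosen experts
        probs = np.exp(logits[t][top])
        probs /= probs.sum()                   # softmax over chosen experts
        for w, e in zip(probs, top):
            # Only the selected experts compute anything for this token --
            # this is the source of the compute savings.
            out[t] += w * (x[t] @ experts_w[e])
    return out
```

This is why a model like Maverick can hold 128 experts' worth of parameters while only 17 billion are active for any single token.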

Developers can download the Llama 4 Scout and Llama 4 Maverick models from llama.com and Hugging Face. Users can try Meta AI built with Llama 4 in WhatsApp, Messenger, Instagram Direct, and on the Meta.AI website.

