Llama 4 Behemoth

Meta • 🧠 The Thinker • Released April 2025

Meta's flagship multimodal model (RESEARCH PREVIEW ONLY - weights not publicly released)

Training Data

Up to August 2024

Parameters

~2 trillion (288B active)

Training Method

Mixture of Experts

Context Window

1,000,000 tokens

Knowledge Cutoff

August 2024

Key Features

Research Preview • Massive MoE Architecture • Multimodal • Not Publicly Released

Capabilities

Reasoning: Outstanding

STEM: Outstanding

Complex Tasks: Outstanding

What's New in This Version

Massive Mixture of Experts (MoE) model with 16 experts. Announced April 2025; weights not yet publicly available.
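
The parameter figures listed above (about 2 trillion total, 288B active) can be turned into a quick back-of-the-envelope calculation. The sketch below is illustrative only: it treats "~2 trillion" as exactly 2e12, and it assumes 16-bit weights (2 bytes per parameter), which this card does not state. The function names are hypothetical helpers, not part of any Meta tooling.

```python
# Figures from the card; "~2 trillion" is rounded to exactly 2e12 for illustration.
TOTAL_PARAMS = 2_000_000_000_000   # total parameters across all experts
ACTIVE_PARAMS = 288_000_000_000    # parameters used per token (288B active)


def active_fraction(total: int, active: int) -> float:
    """Share of the model's parameters that take part in processing one token."""
    return active / total


def weight_memory_gb(params: int, bytes_per_param: int = 2) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    bytes_per_param=2 is an assumption (16-bit weights), not a figure from the card.
    """
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"Active share per token: {frac:.1%}")
    print(f"All weights at 16-bit: {weight_memory_gb(TOTAL_PARAMS):,.0f} GB")
    print(f"Active weights at 16-bit: {weight_memory_gb(ACTIVE_PARAMS):,.0f} GB")
```

Under these assumptions, roughly one seventh of the parameters are exercised per token, while the full set of weights still has to be stored. This is the usual trade-off of a Mixture of Experts design.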

Other Meta Models

Explore more models from Meta

Muse Spark

Meta Superintelligence Labs' first model — a natively multimodal reasoning model with contemplating mode and multi-agent orchestration

April 2026 • Parameters not disclosed

Llama 4 Maverick

Meta's balanced multimodal MoE model with 128 experts for general use

April 2025 • 400 billion parameters (17B active)

Llama 4 Scout

Meta's efficient multimodal model with industry-leading 10M token context

April 2025 • 109 billion parameters (17B active)
