The AI Desk
TUESDAY, 23 JULY 2024 From the desk of Amit Singhal Vol. I · The ChatGPT Era
All news RESEARCH

Meta releases Llama 3.1 405B and the open-weight frontier sits, briefly, at parity

An almost half-trillion-parameter model with downloadable weights, benchmark numbers competitive with GPT-4o and Claude 3.5, and a 92-page paper that left almost nothing about the recipe to the reader's imagination.

Meta releases Llama 3.1 405B and the open-weight frontier sits, briefly, at parity

On 23 July 2024, Meta released Llama 3.1 in three sizes: 8B, 70B and 405B. The 405B variant was the headline. The model had been previewed in April but not made publicly available. With the July release, Meta provided downloadable weights, a near-permissive licence, and a ninety-two-page technical paper detailing data composition, training infrastructure, post-training methodology and evaluation protocols. The paper was, by the standards of the field at that point, a recovery of the open-research norms that had eroded after GPT-4.

The benchmark numbers, reported in the launch post and reviewed in detail by Wired and The Verge, sat at parity with GPT-4o and Claude 3.5 Sonnet on most public evaluations. As MIT Technology Review noted in coverage that week, this was the first time a model with downloadable weights had landed at the publicly visible frontier of capability. Sundry asterisks applied to the comparison, but the directional shift was unambiguous.

Fibre-optic cables in a data centre, glowing
Llama 3.1 405B was trained on more than 16,000 H100 GPUs.Photo: boris misevic / Unsplash

What the paper actually disclosed

The technical paper covered the training-data composition (15.6 trillion tokens, with public-web, code and multilingual mixtures detailed), the training infrastructure (a 16,000-H100 cluster, with concrete numbers on hardware-failure rates), the post-training pipeline (supervised fine-tuning, rejection sampling and direct preference optimisation, with the rejection-sampling loop described in unusual detail), and the evaluation harness. The paper did not disclose the precise dataset URLs, but provided enough information for a sufficiently capitalised lab to attempt to replicate the recipe.

Frontier-class models, MMLU at July 2024
5-shot reported scores
GPT-4o 88.7 % Claude 3.5 Sonnet 88.7 % Llama 3.1 405B 88.6 % Llama 3.1 70B 86 % Llama 3.1 8B 73 %
Open weights at the frontier. Not a near-frontier. The frontier.

The licence asterisk

The Llama 3.1 licence retained the Llama 2-style commercial-use restriction for entities above seven hundred million monthly active users. As The Information's Sahil Patel observed in a sharp piece that week, the practical effect was a permissive licence for everyone except the four or five companies whose competitive pressure most plausibly motivated Meta's openness strategy in the first place. The Open Source Initiative continued to hold that the licence was not technically open-source. Most engineers continued to refer to it as open.

Originally reported by Meta AI (Meta AI) on 23 July 2024. Read the original report →
← Previous
Anthropic ships Claude 3.5 Sonnet and the working model for serious developers changes
Next →
The EU AI Act enters into force, and three years of implementation work begin in earnest

Discussion

Email used only for your avatar. Never shown, never stored in plain text.