Meta releases Llama 3.1 405B and the open-weight frontier sits, briefly, at parity
A 405-billion-parameter model with downloadable weights, benchmark numbers competitive with GPT-4o and Claude 3.5 Sonnet, and a 92-page paper that left little about the recipe to the reader's imagination.

On 23 July 2024, Meta released Llama 3.1 in three sizes: 8B, 70B and 405B. The 405B variant was the headline. It had been previewed in April but not made publicly available. With the July release, Meta provided downloadable weights, a near-permissive licence, and a 92-page technical paper detailing data composition, training infrastructure, post-training methodology and evaluation protocols. The paper was, by the standards of the field at that point, a return to the open-research norms that had eroded after GPT-4.
The benchmark numbers, reported in the launch post and reviewed in detail by Wired and The Verge, sat at parity with GPT-4o and Claude 3.5 Sonnet on most public evaluations. As MIT Technology Review noted in coverage that week, this was the first time a model with downloadable weights had landed at the publicly visible frontier of capability. Sundry asterisks applied to the comparison, but the directional shift was unambiguous.
What the paper actually disclosed
The technical paper covered the training-data composition (15.6 trillion tokens, with public-web, code and multilingual mixtures detailed), the training infrastructure (a 16,000-H100 cluster, with concrete numbers on hardware-failure rates), the post-training pipeline (supervised fine-tuning, rejection sampling and direct preference optimisation, with the rejection-sampling loop described in unusual detail), and the evaluation harness. The paper did not disclose the precise dataset URLs, but provided enough information for a sufficiently capitalised lab to attempt to replicate the recipe.
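For readers unfamiliar with the term, rejection sampling in post-training generally means drawing several candidate responses for a prompt and keeping the one a reward model scores highest, so that it can serve as fine-tuning data. The sketch below illustrates that general idea only; it is not the procedure from Meta's paper. The names `rejection_sample`, `generate`, `score` and `toy_score` are hypothetical placeholders introduced for illustration.

```python
from typing import Callable, List


def rejection_sample(
    prompt: str,
    generate: Callable[[str], str],      # assumed: samples one response from the current model
    score: Callable[[str, str], float],  # assumed: reward model rating a (prompt, response) pair
    num_candidates: int = 8,
) -> str:
    """Draw several candidate responses and keep the highest-scoring one."""
    candidates: List[str] = [generate(prompt) for _ in range(num_candidates)]
    return max(candidates, key=lambda response: score(prompt, response))


def toy_score(prompt: str, response: str) -> float:
    """Toy stand-in for a reward model: favours 'good', penalises 'bad'."""
    return response.count("good") - response.count("bad")


if __name__ == "__main__":
    # Toy generator that replays a fixed list of canned responses.
    canned = iter(["a bad answer", "a good good answer", "a good answer"])
    best = rejection_sample("question", lambda p: next(canned), toy_score, num_candidates=3)
    print(best)
```

In a real pipeline the selected responses would be collected across many prompts and fed into the next round of supervised fine-tuning; the number of candidates per prompt is a tunable cost-versus-quality choice.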
Open weights at the frontier. Not near the frontier. The frontier.
The licence asterisk
The Llama 3.1 licence retained the Llama 2-style commercial-use restriction for entities above seven hundred million monthly active users. As The Information's Sahil Patel observed in a sharp piece that week, the practical effect was a permissive licence for everyone except the four or five companies whose competitive pressure most plausibly motivated Meta's openness strategy in the first place. The Open Source Initiative continued to hold that the licence was not technically open-source. Most engineers continued to refer to it as open.


