The AI Desk
THURSDAY, 18 APRIL 2024 From the desk of Amit Singhal Vol. I · The ChatGPT Era
All news RESEARCH

Meta releases Llama 3 and the open-weight thesis gets its strongest evidence yet

Two model sizes, a permissive licence, and benchmarks competitive with closed-API leaders from a year earlier. Meta's bet on open-weights stops looking eccentric.

Meta releases Llama 3 and the open-weight thesis gets its strongest evidence yet

On 18 April 2024, Meta released the Llama 3 family in two sizes, 8B and 70B parameters, both with open weights and a near-permissive commercial licence. As Mark Zuckerberg wrote in a same-day post, the company's strategic thesis was that the AI infrastructure layer should be open by default and that Meta's commercial position came from products built on top, not from holding the model weights themselves.

Open-source code on a developer's laptop
Llama 3 shipped with weights and a near-permissive licence.Photo: Bernd 📷 Dittrich / Unsplash

What the 70B variant could actually do

The 70B model's benchmark scores were the headline. As reported by The Verge and TechCrunch, Llama 3 70B scored within a few points of GPT-3.5 and Claude 2 on standard evaluations including MMLU and GSM8K, while running on infrastructure available to any organisation that could afford a small H100 cluster. The 8B variant, more importantly for production, was a model that ran comfortably on a single high-end consumer GPU.

Open-weight vs closed-API: MMLU scores at April 2024
5-shot, public reporting
GPT-4 (Mar 2023) 86.4 % Claude 3 Opus 86.8 % Llama 3 70B 79.5 % Claude 3 Sonnet 79 % Llama 2 70B 68.9 % Llama 3 8B 66.6 %

The 405B variant, larger and more capable, was previewed in the launch post but not yet released. The wait would last until July.

The 70B variant in particular was what most enterprises actually deployed. The 8B was studied. The 405B was admired.

The licence question, again

Meta's Llama licence had been a recurring point of contention since the Llama 2 release. The terms permitted commercial use by entities below seven hundred million monthly active users; above that threshold, a separate licence with Meta was required. As Hugging Face's Julien Chaumond pointed out in a post that week, the practical effect was a near-permissive licence for ninety-nine per cent of potential adopters and an explicit competitive carve-out for the four or five companies large enough to matter strategically. The Open Source Initiative's formal position, articulated later that year, was that the licence did not meet the OSI's open-source definition. Most working developers used the term "open" anyway.

Originally reported by Meta (Meta) on 18 April 2024. Read the original report →
← Previous
Anthropic ships the Claude 3 family and reclaims a credible frontier position
Next →
OpenAI launches GPT-4o and demonstrates real-time voice the day before Google I/O

Discussion

Email used only for your avatar. Never shown, never stored in plain text.