The Llama 3 series, released starting in April 2024, represents Meta AI's third generation of open-weight Large Language Models. The series marked significant advances over Llama 2: massively scaled pre-training data, a new tokeniser, architectural refinements, enhanced alignment techniques, much larger models, long context windows, and multimodal capabilities.
Llama 3
Released in April 2024 in 8B & 70B sizes.
- Massively Scaled Pre-training Data:
- Pre-trained on over 15 trillion tokens, a roughly 7x increase over Llama 2's 2 trillion tokens.
- Data sourced from publicly available online sources, with extensive filtering and curation. Included a larger proportion of non-English data (over 5%, covering 30+ languages) and substantially more code than Llama 2, aiming for improved multilingual, reasoning, and coding ability.
- New Tokeniser:
- Introduced a new tokeniser based on Byte Pair Encoding (BPE) with a 128,000-token vocabulary (up from Llama 2's 32,000).
- The larger vocabulary encodes text with substantially fewer tokens per sentence, improving efficiency and potentially model quality, especially in multilingual contexts (a tokenisation comparison appears after this list).
- Architectural Refinements:
- Adopted Grouped-Query Attention (GQA) across both the 8B and 70B models (Llama 2 applied GQA only to its larger variants), improving inference efficiency; a minimal GQA sketch appears after this list.
- Trained on sequences of 8,192 tokens, with masking so self-attention does not cross document boundaries.
- Improved Instruction Following and Alignment:
- Utilised a combination of supervised fine-tuning (SFT), rejection sampling, PPO, and DPO for post-training alignment; Meta's release materials describe refinements to these methods for better instruction following and model behaviour (a sketch of the DPO objective follows this list).
- Showed significantly improved performance on benchmarks measuring reasoning, coding, and instruction following compared to Llama 2 and other contemporary open models.
- Enhanced Safety and Trust Features:
- Released alongside updated safety tools: Llama Guard 2 (an input/output safety classifier), Code Shield (a filter for insecure code suggestions), and CyberSecEval 2 (a benchmark for cybersecurity risks).
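The tokeniser change is easy to observe directly. Below is a minimal comparison via Hugging Face transformers; the model IDs are the published ones, but the repositories are gated, so licence acceptance and an authenticated session are assumed.

```python
from transformers import AutoTokenizer

# Both repos are gated; this assumes licence acceptance and a valid HF token.
tok_llama2 = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")    # 32k SentencePiece BPE vocab
tok_llama3 = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")  # ~128k BPE vocab

text = "Meta's third-generation models tokenise multilingual text more efficiently. 机器学习正在改变世界。"

for name, tok in [("Llama 2", tok_llama2), ("Llama 3", tok_llama3)]:
    ids = tok.encode(text, add_special_tokens=False)
    print(f"{name}: vocab={tok.vocab_size:,} tokens={len(ids)}")
# Expect noticeably fewer tokens from the Llama 3 tokeniser on the same text,
# especially for the non-English span.
```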
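The grouped-query attention mentioned under the architectural refinements can be sketched in a few lines of PyTorch. This is an illustrative implementation of the general GQA technique, not Llama's actual code; the example shapes mirror Llama 3 8B's 32 query heads sharing 8 KV heads.

```python
import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v, n_kv_heads):
    """q: (batch, n_q_heads, seq, head_dim); k, v: (batch, n_kv_heads, seq, head_dim).

    Each contiguous group of n_q_heads // n_kv_heads query heads attends to the
    same key/value head, so only n_kv_heads KV projections need to be cached.
    """
    group = q.shape[1] // n_kv_heads
    # Repeat each KV head `group` times so shapes line up with the query heads.
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v

# Illustrative shapes: 32 query heads sharing 8 KV heads, as in Llama 3 8B.
q = torch.randn(1, 32, 16, 128)
k = torch.randn(1, 8, 16, 128)
v = torch.randn(1, 8, 16, 128)
print(grouped_query_attention(q, k, v, n_kv_heads=8).shape)  # torch.Size([1, 32, 16, 128])
```

The payoff is the KV cache: caching 8 heads instead of 32 cuts its memory by 4x with little quality loss.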
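Of the alignment methods listed above, DPO is the simplest to show in miniature. Below is a sketch of the standard DPO objective from Rafailov et al. (2023), operating on per-sequence log-probabilities; it illustrates the technique generically, not Meta's internal training code.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO objective on summed per-sequence log-probs.

    Pushes the policy to widen the (chosen - rejected) log-prob margin
    relative to a frozen reference model; beta scales the implicit reward.
    """
    chosen_reward = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_reward = beta * (policy_rejected_logps - ref_rejected_logps)
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Toy example: the policy already slightly prefers the chosen responses.
loss = dpo_loss(torch.tensor([-10.0, -12.0]), torch.tensor([-14.0, -13.0]),
                torch.tensor([-11.0, -12.5]), torch.tensor([-13.0, -12.5]))
print(loss.item())
```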
Llama 3.1
Released in July 2024 in 8B, 70B & 405B sizes; the 405B was the headline addition.
- Massive Model Scale:
- Introduced a 405 billion parameter dense model, by far the largest Llama released to date, trained on a dataset comparable in scale (15T+ tokens) to Llama 3; the release was accompanied by the technical report "The Llama 3 Herd of Models".
- Extended Context Length:
- Increased the maximum context length dramatically to 128,000 tokens (from 8k in Llama 3).
- Demonstrated strong performance on long-context tasks and benchmarks; the sketch after this list estimates the KV-cache memory that a 128K context implies.
- State-of-the-Art Performance:
- Achieved leading performance among open models on a wide range of industry benchmarks, becoming competitive with top closed-source models available at the time.
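A 128K context is chiefly a memory problem at inference time. The helper below uses the standard KV-cache size formula; the dimensions in the example (126 layers, 8 KV heads, 128-dim heads for the 405B model) follow the published Llama 3.1 architecture, though treat the resulting figures as approximate.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    # Keys and values each store (n_kv_heads * head_dim) values per layer per token.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Llama 3.1 405B: 126 layers, 8 KV heads (GQA), head_dim 128, fp16 cache.
full_context = kv_cache_bytes(126, 8, 128, seq_len=128_000)
print(f"{full_context / 2**30:.1f} GiB per sequence")  # roughly 62 GiB

# GQA does a lot of work here: with 128 full KV heads instead of 8,
# the same cache would be 16x larger, around 1 TiB per sequence.
```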
Llama 3.2
Released in September 2024 in 11B & 90B multimodal sizes and 1B & 3B text-only sizes for edge devices.
- Introduction of Multimodality:
- Launched Llama 3.2 Vision models, marking the series’ expansion into multimodal models.
- These Vision Language Model (VLM) variants can process and interpret both text and image inputs (see the usage sketch after this list).
- Edge AI Focus:
- Also released smaller, text-only 1B & 3B models, created by pruning and distilling the Llama 3.1 8B and 70B models, designed for on-device and edge applications with efficiency in mind (a loading example follows the vision sketch below).
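As an illustration of the vision models in practice, here is a sketch using the Hugging Face transformers integration added around v4.45 (MllamaForConditionalGeneration plus an interleaved chat format). The model ID is the published gated checkpoint and the image URL is a placeholder; verify the exact API against current documentation.

```python
import requests
import torch
from PIL import Image
from transformers import MllamaForConditionalGeneration, AutoProcessor

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"  # gated; licence acceptance assumed
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open(requests.get("https://example.com/photo.jpg", stream=True).raw)  # placeholder URL

# Llama 3.2 Vision uses an interleaved chat format: an image slot plus text.
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe this image in one sentence."},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))
```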
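The small text-only models, by contrast, drop into the standard text-generation pipeline; again the gated model ID is the published one, but access is assumed.

```python
import torch
from transformers import pipeline

# 1B-parameter instruct model, small enough for laptop/edge-class hardware.
generator = pipeline("text-generation",
                     model="meta-llama/Llama-3.2-1B-Instruct",
                     torch_dtype=torch.bfloat16, device_map="auto")
messages = [{"role": "user", "content": "Give me one sentence on why small models matter."}]
result = generator(messages, max_new_tokens=64)
print(result[0]["generated_text"][-1]["content"])  # last message is the assistant reply
```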
Llama 3.3
Released in December 2024, updating the 70B size.
- Iterative Refinement:
- Released an updated version of the 70B parameter model (Llama 3.3 70B Instruct).
- Meta attributes the gains primarily to refined post-training techniques, including online preference optimisation, rather than architectural changes, building on learnings from the 3.1 and 3.2 releases.
- Delivers performance approaching the much larger Llama 3.1 405B on several text benchmarks, notably coding, reasoning, and instruction following, at a fraction of the inference cost.