ML Notes

DeepSeek

14 Mar 2025, 1 min read

DeepSeek is a Chinese AI lab owned and funded by the Chinese hedge fund High-Flyer.

Models

DeepSeek has released many impressive models, most notably DeepSeek-R1. In chronological order, they are:

DeepSeek (November 2023)
- Explored scaling laws
DeepSeekMoE (January 2024)
- Introduced fine-grained expert segmentation
- Introduced shared expert isolation
DeepSeekMath (April 2024)
- Introduced GRPO (Group Relative Policy Optimization)
DeepSeek-V2 (June 2024)
- Combined DeepSeekMoE and GRPO
- Introduced MLA (Multi-head Latent Attention)
DeepSeek-V3 (December 2024)
- Built upon DeepSeek-V2
- Introduced MTP (Multi-Token Prediction)
DeepSeek-R1 (January 2025)
- Built upon DeepSeek-V3
- Used GRPO to elicit reasoning through reinforcement learning
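
GRPO appears in three of the entries above, so a small illustration of its core idea may help. GRPO samples a group of answers for the same prompt and scores each one relative to the rest of the group, which removes the need for a separate value model. The sketch below covers only that group-relative advantage step, not the full policy update. The function name, the `eps` term, the use of the population standard deviation, and the reward values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group's mean and spread.

    `rewards` holds one scalar reward per sampled answer to the same prompt.
    `eps` (an assumption, not from the notes) avoids division by zero when
    every answer in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; the choice is an assumption
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers (1.0 = correct, 0.0 = wrong).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# A group where every answer scores the same carries no learning signal.
flat = group_relative_advantages([1.0, 1.0, 1.0])
```

Answers that beat the group average get a positive advantage and the rest get a negative one. The baseline comes from the group itself.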

6 items under this folder.

- 14 Mar 2025: DeepSeek (tags: deepseek, llama)
- 14 Mar 2025: DeepSeek-R1
- 14 Mar 2025: DeepSeek-V2 (tags: deepseek, mixture-of-experts)
- 14 Mar 2025: DeepSeek-V3 (tags: deepseek, mixture-of-experts)
- 14 Mar 2025: DeepSeekMoE (tags: deepseek, mixture-of-experts)
- 14 Mar 2025: DeepSeekMath (tags: deepseek)
