DeepSeek is a Chinese AI lab owned and funded by the hedge fund High-Flyer.

Models

DeepSeek has released several notable models, the best known being DeepSeek-R1. In chronological order they are:

  1. DeepSeek LLM (November 2023)
    • Explored scaling laws
  2. DeepSeekMoE (January 2024)
    • Introduced fine-grained expert segmentation
    • Introduced shared expert isolation
  3. DeepSeekMath (February 2024)
    • Introduced Group Relative Policy Optimization (GRPO)
  4. DeepSeek-V2 (May 2024)
    • Combined DeepSeekMoE and GRPO
    • Introduced Multi-head Latent Attention (MLA)
  5. DeepSeek-V3 (December 2024)
    • Built upon DeepSeek-V2
    • Introduced a Multi-Token Prediction (MTP) training objective
  6. DeepSeek-R1 (January 2025)
    • Built upon DeepSeek-V3
    • Used GRPO-based reinforcement learning to elicit reasoning behavior
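
GRPO recurs across several of the models above, so a small illustration may help. GRPO (Group Relative Policy Optimization) samples a group of answers for each prompt and scores every answer relative to the rest of its group, rather than against a separately learned value model. The sketch below covers only that group-relative advantage step, not the full policy update. The function name, the example rewards, the epsilon term, and the use of the population standard deviation are all assumptions made for illustration, not details taken from DeepSeek's code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each reward relative to its own group (hypothetical helper).

    Each reward is shifted by the group mean and scaled by the group
    standard deviation. eps avoids division by zero when every answer
    in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population standard deviation
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one prompt (1.0 = correct).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])

# A group in which every answer earns the same reward carries no signal.
flat = group_relative_advantages([1.0, 1.0, 1.0])
```

In this sketch, answers that beat their group's average receive positive advantages and the rest receive negative ones, so the group itself serves as the baseline.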