ML Notes

DeepSeek

14 Mar 2025

  • deepseek
  • llama

DeepSeek LLM 67B is a well-trained model that follows the LLaMA architecture, trained from scratch rather than fine-tuned from LLaMA weights. The main purpose of the work was to explore and report on scaling laws.
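
As a rough illustration of what "reporting on scaling laws" involves: a scaling law is usually an empirical power-law fit, for example loss as a function of training compute, which shows up as a straight line in log-log space. The sketch below assumes a pure power law `loss = a * compute**b`. The data points are synthetic and `fit_power_law` is a hypothetical helper; none of the numbers come from the DeepSeek paper.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b, done as a line fit in log-log space."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    b = sxy / sxx                        # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)    # intercept in log space = log(a)
    return a, b

# Synthetic (made-up) points that follow loss = 5.0 * compute**-0.05 exactly.
compute = [1e17, 1e18, 1e19, 1e20]
loss = [5.0 * c ** -0.05 for c in compute]

a, b = fit_power_law(compute, loss)

# Extrapolate the fitted curve to a larger, unseen compute budget.
predicted = a * 1e21 ** b
```

In practice the fit is made on small, cheap training runs and then extrapolated to predict the behaviour of a much larger run, which is the reason such fits are worth reporting.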

References

  • DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
