DeepSeek 67B is a well-trained model built on the LLaMA architecture. Its main purpose was to explore and report on scaling laws.