Today’s post is inspired by the impressive performance of DeepSeek’s latest code language models: DeepSeek Coder. It is the best open-source code LLM (large language model) on the market today.
As shown below, the DeepSeek-Coder-Base-33B model significantly outperforms one of the leading open-source models, CodeLlama, across various benchmarks, including HumanEval (Python and Multilingual), MBPP, and DS-1000.
With instruction tuning, the DeepSeek-Coder-Instruct-33B model not only beats GPT-3.5 on HumanEval but also substantially narrows the gap with GPT-4, currently the market leader in code generation. It also shows comparable performance on MBPP. 🚀💻
The details of the model specs are as follows:
- “Pretrained on 2 Trillion tokens over more than 80 programming languages.
- Various model sizes (1.3B, 5.7B, 6.7B and 33B) to support different requirements.
- A window size of 16K, supporting project-level code completion and infilling.
- State-of-the-Art performance among open code models.
- Open source and free for research and commercial use.”
The models were trained on a project-level code corpus with a 16K window size, making them particularly strong at project-level code completion and infilling, as sketched below.
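Infilling works through a fill-in-the-middle (FIM) prompt format on the base models: the model generates the code that belongs at a hole marker, conditioned on both the prefix and the suffix. Here is a minimal sketch of such a prompt; the sentinel token spellings are taken from DeepSeek's GitHub README and should be verified against the released tokenizer before use:

```python
# Fill-in-the-middle (FIM) prompt for a base DeepSeek Coder model.
# The model is expected to generate the loop body that belongs where
# the hole marker sits, using both the prefix and the suffix as context.
# Sentinel token spellings follow DeepSeek's README; verify them against
# the tokenizer of the checkpoint you actually load.
input_text = """<｜fim▁begin｜>def remove_non_ascii(s: str) -> str:
    result = ""
<｜fim▁hole｜>
    return result<｜fim▁end｜>"""
```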
For a prompt like “write a quick sort algorithm in python”, the model outputs the following:
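The model’s response appears as an image in the original post; a representative quicksort along the lines of what such a prompt produces is sketched below (my illustration, not the model’s verbatim output):

```python
def quick_sort(arr):
    # Base case: lists of length 0 or 1 are already sorted.
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    # Partition the remaining elements around the pivot.
    left = [x for x in arr[1:] if x < pivot]
    right = [x for x in arr[1:] if x >= pivot]
    # Recursively sort each partition and stitch the results together.
    return quick_sort(left) + [pivot] + quick_sort(right)


print(quick_sort([3, 6, 1, 8, 2]))  # [1, 2, 3, 6, 8]
```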
The models support numerous programming languages including the following:
Try It Out!
If you would like to try it out, the easiest way is to use the UI directly:
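If you prefer to run a model locally instead, here is a minimal sketch using the Hugging Face transformers library and the deepseek-ai/deepseek-coder-6.7b-instruct checkpoint (checkpoint name per the DeepSeek GitHub; swap in another size to match your hardware):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Checkpoint name as published on DeepSeek's GitHub / Hugging Face;
# smaller variants (e.g. 1.3B) also exist if GPU memory is tight.
model_id = "deepseek-ai/deepseek-coder-6.7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto",  # requires the accelerate package
)

messages = [{"role": "user", "content": "write a quick sort algorithm in python"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```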
🛠️✨ Happy practicing and happy building! 🚀🌟
Source of images/gif: DeepSeek GitHub