Sukonbu

What Is Deepseek?

Deepseek is a Chinese AI company that recently released its newest model, Deepseek-R1, which made headlines around the globe.

Deepseek-R1 is a brand new model that is able to reason: instead of answering a task directly like a regular model, it first thinks through what to do and then takes action.

The only other model capable of doing this with reasonable accuracy was OpenAI’s o1 model, which, unlike Deepseek-R1, is closed-source.

After releasing the model, Deepseek published a substantial number of papers explaining how it achieved this result. OpenAI, however, claims that Deepseek used output from its GPT-4o model to create its cold start data. At the time of writing, there has been no official announcement about these claims, but Deepseek says that the cold start data it used came from its older model, Deepseek V2.

Deepseek has released several versions of the model with fewer parameters, including 1.5B, 7B and 8B parameter models. This allows users who lack the dedicated hardware or memory needed to run bigger models, such as Meta’s Llama, to run Deepseek-R1 with good performance, albeit at reduced accuracy.
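To get a rough feel for why the smaller versions fit on ordinary hardware, here is a back-of-the-envelope sketch. It assumes memory use is dominated by the weights alone (parameter count times bits per weight) and ignores context cache and runtime overhead, so real usage will be somewhat higher. The function name and the 16-bit and 4-bit precisions are illustrative choices, not figures from Deepseek.

```python
def weight_memory_gb(params_billion, bits_per_weight):
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Compare the three sizes mentioned above at two assumed precisions.
for size in (1.5, 7, 8):
    full = weight_memory_gb(size, 16)      # unquantized 16-bit weights
    quantized = weight_memory_gb(size, 4)  # assumed 4-bit quantization
    print(f"{size}B: about {full:.1f} GB at 16-bit, {quantized:.2f} GB at 4-bit")
```

By this estimate a 7B model needs about 14 GB for 16-bit weights but only about 3.5 GB when quantized to 4 bits, which is why the smaller versions are practical on a laptop.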

For people who would like to try the model out, Deepseek has a public instance available. It is also possible to run the model locally using Ollama.
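For the local route, here is a minimal sketch of talking to the model from Python through Ollama's local HTTP API. It assumes Ollama is running on its default port (11434) and that a Deepseek-R1 model has already been pulled; the tag deepseek-r1:7b and the helper names ask and split_reasoning are my own choices for illustration. It also assumes the reply wraps the model's reasoning in <think> tags; if no tags are present, the whole reply is treated as the answer.

```python
import json
import re
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "deepseek-r1:7b"  # assumed tag; use whichever size you pulled


def split_reasoning(text):
    """Separate an optional <think>...</think> block from the final answer."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No reasoning block found: treat everything as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


def ask(prompt, model=MODEL, url=OLLAMA_URL):
    """Send one prompt to the local Ollama server and return (reasoning, answer)."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return split_reasoning(body["response"])


# Example (requires a running Ollama server):
# reasoning, answer = ask("How many prime numbers are there below 20?")
```

Keeping the reasoning separate from the answer makes it easy to see the "think first, then act" behaviour described earlier, or to hide it when only the final answer matters.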
