Please correct me if any of this is not accurate, but I feel like it could help to distill the actual facts. Fwiw, this isn't AI generated, just my own rambling :D
My summary of what is happening:
DeepSeek (supposedly?) cost a fraction of what other large models cost to train
DeepSeek's hosted app in China is basically free with unlimited consumer use
DeepSeek's API pricing is also a fraction of other providers'
Problem:
If you use the hosted interface, there is zero data privacy protection
If you use the API, there is zero data privacy protection (vs US providers, who will sign BAAs)
Local running costs:
- If you want to run the full DeepSeek model (NOT distilled) locally, it would cost a couple hundred thousand dollars in hardware, and realistically that can still only serve maybe a dozen concurrent users.
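Quick back-of-envelope on why the full model needs that class of hardware. This only counts weight memory and treats the widely reported ~671B total parameter count as an assumption; KV cache, batching, and serving overhead all come on top:

```python
# Rough sketch: memory needed just to hold the weights of a ~671B parameter model
# (the commonly reported size for the full DeepSeek model -- assumption, not gospel).

def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """GB required to store the weights alone at a given precision."""
    return params_billion * 1e9 * (bits_per_param / 8) / 1e9

for bits, label in [(16, "FP16/BF16"), (8, "FP8/INT8"), (4, "4-bit quant")]:
    print(f"{label:>10}: ~{weight_memory_gb(671, bits):,.0f} GB of weights")

# Roughly 1,342 GB / 671 GB / 336 GB -- i.e. many 80 GB-class GPUs or a very
# large unified-memory box, before you even budget for KV cache and batching.
```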
Questions:
1) What's the big deal that you can run the DeepSeek distills locally? They're only a few billion parameters, sized for non-high-end hardware, and you can already do this with plenty of decent other offline models.
2) If the hardware cost to run and serve the full model is essentially the same as for the latest comparable GPT model, how are DeepSeek's API costs so low? (Rough arithmetic sketch after these questions.) The only answer I can come up with is that they have a huge amount of government-provided hardware and this is a nation-sponsored loss leader. No big mystery or innovation.
Meaning they're doing nothing special when it comes to inference compute, and the only (but still significant) point of interest that's panicking the major LLM companies is: how did they train the model so cheaply?
3) Couldn't they just have lied about the cost to train it? Is there anything in what they released that would confirm it?
4) Why is this affecting Nvidia? It sounds like we still need the exact same hardware to run this model.
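For question 2, here's the kind of arithmetic I'm trying to sanity-check. Every number in it (node price, throughput, utilization) is a made-up placeholder, not anything DeepSeek has published:

```python
# Toy amortized serving-cost calculation. All inputs are placeholders, purely to see
# what per-token price a given hardware setup could plausibly support.

def cost_per_million_tokens(node_cost_usd: float, amortization_years: float,
                            tokens_per_second: float, utilization: float) -> float:
    """Hardware cost spread over its lifetime, divided by total tokens served."""
    seconds = amortization_years * 365 * 24 * 3600
    total_tokens = tokens_per_second * utilization * seconds
    return node_cost_usd / total_tokens * 1e6

# e.g. a ~$250k node written off over 3 years, pushing an assumed aggregate
# 5,000 output tokens/sec across batched users at 50% utilization:
print(f"~${cost_per_million_tokens(250_000, 3, 5_000, 0.5):.2f} per million output tokens")
```

Under these made-up numbers it comes out around a dollar per million tokens before power, staff, and margin, which is why the aggregate batched throughput (not just the hardware sticker price) is the number I'd really want to know.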
Just want to make sure I'm understanding correctly.