So what’s all the hoopla about DeepSeek, and why is it breaking everybody’s brain in AI right now? I’ve been doing a deep dive for a couple of days and these are the main deets I’ve pulled together. I’ll have a Guy’s Take on it soon, so stay tuned to the @npub1hw4z...lg0q feed.

DeepSeek ELI5:
• The US has been hailed as the leader in AI, while pushing fears that we need to be closed and not share with China cuz evil CCP and they can’t figure it out without us.
• ChatGPT and “Open”AI is the poster child, eating up absurd amounts of capital for training and inference (using) LLMs. Estimates say around $100 million or more for the ChatGPT o1 model.
• In just a couple of weeks China drops numerous open source models with incredible results: Hunyuan for video, Minimax, and now DeepSeek. All open source, all insanely competitive with the premier closed source models in the US.
• DeepSeek actually surpassed ChatGPT o1 on most benchmarks, particularly math, logic, and coding.
• DeepSeek is also totally open about how its thought process works. It explains and shows its work as it runs, while ChatGPT keeps that proprietary. This makes building with, troubleshooting, and understanding DeepSeek much easier.
• DeepSeek is also multimodal, so you can give it PDFs, images, connect it to the internet, etc. It’s a literal full personal assistant with just a few tools to plug into it.
• The API costs 95% LESS than the ChatGPT API per call. They claim that is a profitable price as well, while OpenAI is bleeding money.
• They state that DeepSeek cost only $5.6 million to train and operate.
• Export controls on GPUs and chips went into effect in the past year or two to try to prevent China from “catching up,” and they seem to have failed miserably, as it appears China was able to get 20x the results per dollar with inferior hardware.
• The US model of AI, its costs, its capex structure, and the massive demand for chips have been the basis for assessing the valuation, pricing, and future demand of the entire AI industry. DeepSeek just took a giant dump on all of it by outperforming while spending a tiny fraction to get there, and while dealing with a lack of access to the newest chips.

All of this together is why people are freaking out about a plummet in Nvidia’s price, a reevaluation of OpenAI, and the failure of the US to stay dominant, or even the legitimacy of staying proprietary, as it may just cause us to fall behind rather than lead. All after a $700 billion investment was just announced that now just kinda looks like incompetent corporations wasting horrendous amounts of money on something they won’t even share with people, that you can’t run locally, and that is surpassed by a few lean Chinese startups with barely a few million.
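For anyone wondering where the "20x" comes from, here is a rough back-of-the-envelope check. It only uses the figures quoted in the post ($100 million, $5.6 million, 95% cheaper), which are estimates and claims rather than verified numbers, and the variable names are made up for illustration.

```python
# Figures as quoted in the post above; none of them are independently verified.
o1_training_cost = 100e6        # "around $100 million or more" (estimate)
deepseek_training_cost = 5.6e6  # "$5.6 million" (DeepSeek's own claim)
api_discount = 0.95             # "API costs 95% LESS"

# How many times more the quoted o1 figure is than the quoted DeepSeek figure.
cost_ratio = o1_training_cost / deepseek_training_cost

# A 95% discount means paying 5% of the price, i.e. the other API costs
# 1 / 0.05 times as much per call.
api_price_ratio = 1 / (1 - api_discount)

print(f"Training cost ratio: {cost_ratio:.1f}x")
print(f"API price ratio:     {api_price_ratio:.0f}x")
```

So on the post's own numbers the training gap works out to a bit under 18x and the API gap to about 20x, which is roughly where the "20x per dollar" shorthand lands.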

Replies (61)

Sounds to me like this was specifically created and released to kneecap the US AI industry in response to the chips controls. This may actually be successful too.
Almost 100% agree, but there is no way they did this without Nvidia chips. They've open sourced a groundbreaking LLM scaling paradigm (RL on CoT, i.e. reinforcement learning on chain-of-thought), which is no small thing, believe me, but our closed source reasoning models are likely doing something similar (we just can't see it). This newly open scaling paradigm is a game changer, but you still don't get this performance without massive compute. They have illegal H100s, I'm nearly certain of it. Nvidia was probably due for a correction anyway. But it'll be funny to see what happens when we all find out they did this with Nvidia chips.
To expand on that a bit: they are using secret H100s, therefore their capex claims are complete BS, therefore their API price is complete BS. The CCP smuggled in our chips and is bankrolling a loss to shake the market. Pretty freaking smart tbh.
Sorry I didn’t mean to say they didn’t have Nvidia chips, but more that they likely are paying higher price and it’s slightly harder to get ahold of the same amount of compute. Or at least this was the goal of the govt actions. So either: • it did nothing and they have easy access. Or, • access is slightly more difficult but it didn’t matter. My bullet point was kinda vague and implied what your interpretation was but that’s not exactly what I meant.
Do you have a good write-up on the scaling paradigm? I read that in another post but couldn’t confirm it yet and wasn’t sure what that meant. Any explanation or breakdown link would be appreciatively zapped.
Chris 11 months ago
They claimed to have stockpiled them before the ban, not that they didn’t use them at all. They just used fewer of them. For what it’s worth.
It’s open source. We build on top of it. It should’ve been obvious for a while that open source was going to dominate; things were already trending that way. It has too much in common economically with the early internet, and how easy it is to acquire the weights vs. training them just makes controlling it way more costly than the returns. Nobody will build on closed source if a FOSS alternative is even comparable… which it clearly has been. And we aren’t feeding them if we just move forward, actually open up our tech (like we did with the internet), and build WITH, not against, everyone else. In that case we all get the best of all worlds and the cost of everything drops like a rock. It’s the only path that ever made sense anyway; it’s only the VC money printing lunatics who were chasing the AGI red herring and wanted govt to control it who thought otherwise.
that’s kind of what I was thinking as I was watching the Nasdaq tank. My gut says this should mean cheaper better tech for them as well so was wondering why everyone is panicking
Chris 11 months ago
What they claim….FYI image
JB 11 months ago
hmmm... image
R 11 months ago
Appreciate the summary 🚁👍🏻
npub1we5g...s9hy 11 months ago
“What happens to all these wonderful ChatGPT models if a small Chinese startup builds a superior LLM for $6M❔” “All your LLM models are destroyed, completely devastated, ₿itcoin goes to the Moon❕” ⤴️🌙 - Saylor to Sam Altman
I’ll catch that podcast for sure. Curious if any of their claims about what they spent can be verified? Also, is there any way to check that the search inputs and outputs aren’t stored somewhere? I guess if it’s fully open source that can be verified.
Just shallow market correlation. That's all. If major equities take a short dive, so does Bitcoin in the short term. When you stretch it to a 6 month timeline or longer, Bitcoin simply follows global liquidity (money printing, basically). It's actually got the strongest correlation of any asset. So what you are seeing is nothing but noise due to extremely short term trading that affects all assets.
smalltownrifle 11 months ago
Guy Swann
View quoted note →
BoomTown 11 months ago
When is the next big print gonna be? I remember hearing Fred Thiel talk in November 2023 … he was asked about the upcoming halving and subsequent bull market and he said global liquidity was all that mattered. At the time, I thought it was a bearish (or at best conservative) comment but now - being 9 months post halving with only 40% appreciation above last cycle’s ATH - I’m wondering how accurate that assertion might have been.
I'd add one more, the USD vs BRICS. The narrative has been "yes, BRICS has all the energy and commodities, but US has the AI". We have been asked to believe that massive US-led productivity gains from AI will make the US deficit immaterial again. Deepseek shakes that narrative because if US doesn't have a lead in AI (or energy or commodities), then what does it have?
shaun 11 months ago
Great summary thanks
npub1nfvu...vh4n 11 months ago
DeepSeek is a refutation of the “Scaling Laws” nonsense that has plagued the AI world for the past couple of years. Instead of trying innovative new ways to improve model performance big tech has been content to throw more compute and more data at these models in order to improve performance. They were bound to run into diminishing returns at some point
I think it was cheap to train because it built/ripped from the expensively built models…🤔 what I heard….
Pertaining to "How the heck did they do this?" It was pretty obvious TBH. To use fixed point or smaller mantissa for the models is a simple optimization that is done all the time in embedded systems. This "big shock" is really due to the divide between electrical engineering and computer science.
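To make the "fixed point or smaller mantissa" idea concrete, here is a minimal sketch in plain Python. It is only an illustration of the two tricks named above, not DeepSeek's actual training recipe; the function names and the sample weights are made up for the example.

```python
import struct

def truncate_mantissa(x: float, keep_bits: int = 7) -> float:
    """Store x as float32, then zero all but the top keep_bits mantissa bits.

    float32 has 23 explicit mantissa bits; keeping 7 leaves the same
    sign/exponent/mantissa split as bfloat16 (by truncation, not rounding).
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    drop = 23 - keep_bits
    mask = (0xFFFFFFFF >> drop) << drop
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]

def quantize_int8(values):
    """Symmetric fixed-point quantization: floats -> integers in [-127, 127]."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # avoid a zero scale
    return [round(v / scale) for v in values], scale

def dequantize(quantized, scale):
    """Map the stored integers back to approximate float values."""
    return [q * scale for q in quantized]

# Hypothetical weights, chosen only for illustration.
weights = [0.8125, -0.3141, 0.0042, -1.27]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
short = [truncate_mantissa(w) for w in weights]

print("int8 codes:       ", q)
print("restored weights: ", restored)
print("7-bit mantissa:   ", short)
```

Each weight now fits in one byte (fixed point) or two bytes (short mantissa) instead of four, at the cost of a small rounding error. For the fixed-point path that error is at most half of `scale` per weight.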
Question: I have heard that Deepseek was so cheap to construct because they made use of Meta's open source AI models. So they basically built on top of Meta's work. Does this sound right to you?
Hasn’t a lot of AI work been built on top of the work Meta has done? Pretty sure they open sourced [at least some of] it. I may be mistaken, but I also thought huggingface.ai was something they started…
John Smith 11 months ago
It wouldn't be fair to call it Meta's work. It's open source, so a LOT of ppl worked on it. It was created by Meta, but since they made it open source, Pandora's box cannot be closed again. The US centralized attempt at AI failed; hope they review their approach before it burns them even more.
Silicon Valley expected to evolve their investments into “AI”. The whole ecosystem of startups and banking and financing and VC firms and Angels and equity for early employees and advisors - the whole thing which has been built up for 4-5 decades now just got rug pulled. The stock market side is a legit ponzi on top of the incumbent structures. China wants the world to revolve around real world production. Manufacturing. America wants it to revolve around software protected by IP laws protected by the US military. The winner is already obvious.
Never be the first guy out the gate, you will always get shot. Piggybacking is always cheaper. And there is no such thing as a "hailed leader" when it's software; every second someone is jockeying to take your place. Tough place to be, but only the strong survive.