Source: Overseas Unicorns
We have been discussing the "second half" of AI since Q3 2024. Although OpenAI's o1 introduced the RL narrative, for various reasons it never broke into the mainstream. DeepSeek R1 solved the RL puzzle and pushed the whole industry into the new paradigm, truly entering the second half of intelligence.
The market has already seen plenty of discussion about what DeepSeek is and why it matters. The more valuable next question is: how should the AI race be played? I have summarized my thinking from the past half month, hoping it can serve as a road map for exploring the second half, to be revisited from time to time. I have also listed the questions I am most curious about. You are welcome to fill out the questionnaire and exchange ideas; we will organize a small-scale discussion around the next Aha moment of intelligence breakthrough:
• Where will the next Aha moment of intelligence breakthrough appear?
• If you had ample exploration resources, which directions would you invest them in?
• For example, among a next-generation Transformer architecture, breakthroughs in synthetic data, and more efficient learning methods such as online learning, which bets would you place?
Insight 01 Has DeepSeek surpassed OpenAI?
There is no doubt that DeepSeek has surpassed Meta's Llama, but it is still far from first-tier players such as OpenAI, Anthropic, and Google. For example, Gemini 2.0 Flash is cheaper than DeepSeek, is also very capable, and is fully multimodal. The outside world underestimates the first tier that Gemini 2.0 represents; it simply has not been open-sourced and so has not achieved the same sensational effect.
DeepSeek is very exciting, but it cannot be called a paradigm-level innovation. A more accurate description is that it opened up the paradigm OpenAI had kept half-hidden with o1, pushing it to very high penetration across the whole ecosystem.
From first principles, it is hard to surpass the first-tier model makers while staying within the Transformer-generation architecture; overtaking on the same track is difficult. What we look forward to today is someone exploring the next generation of intelligence architectures and paradigms.
![](https://img.jinse.cn/7347490_image3.png)
DeepSeek caught up with OpenAI and Anthropic in one year.
Insight 02 Has DeepSeek opened a new paradigm?
As mentioned earlier, strictly speaking, DeepSeek did not invent a new paradigm.
But DeepSeek's significance is that it helped the new paradigm of RL and test-time compute truly go viral. If OpenAI's initial release of o1 posed a riddle to the industry, DeepSeek was the first to solve it in public.
Before DeepSeek released R1 and R1-Zero, only a small number of people in the industry were working on RL and reasoning models. DeepSeek laid out a roadmap for everyone, convincing the industry that this approach really can improve intelligence. That is a great boost to confidence and will attract more AI researchers to switch to the new paradigm.
Only with an influx of talent can there be algorithmic innovation, and only with open source in close pursuit will more compute be invested. After DeepSeek, OpenAI, which had originally planned not to release new models, shipped o3-mini, plans to continue with o3, and is even considering open-sourcing models. Anthropic and Google will also accelerate their RL research. DeepSeek has accelerated the whole industry's push into the new paradigm, and small and medium-sized teams can now try RL in their own domains.
In addition, better reasoning models will further help agents land in practice, and AI researchers are now more confident in agent research. In that sense, DeepSeek's open-source reasoning model has pushed the industry's exploration of agents forward.
So although DeepSeek did not invent a new paradigm, it has pushed the entire industry into one.
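To make the test-time compute idea concrete, here is a minimal sketch of best-of-N sampling against a verifiable reward, the loop the RL/reasoning paradigm builds on. The "generator" below is a toy stand-in (noisy guesses at an arithmetic answer), not a real model; all names and numbers are illustrative.

```python
import random

def generate_candidates(question, n):
    """Toy stand-in for sampling n chain-of-thought answers from a model:
    noisy guesses around the true sum of two numbers."""
    a, b = question
    return [a + b + random.choice([-2, -1, 0, 0, 0, 1, 2]) for _ in range(n)]

def verifier(question, answer):
    """A verifiable reward: 1.0 if the answer checks out, else 0.0."""
    a, b = question
    return 1.0 if answer == a + b else 0.0

def best_of_n(question, n):
    """Spend more test-time compute: sample n candidates, keep the best-scoring one."""
    candidates = generate_candidates(question, n)
    return max(candidates, key=lambda c: verifier(question, c))

random.seed(0)
question = (17, 25)
for n in (1, 4, 32):
    answer = best_of_n(question, n)
    print(n, answer, verifier(question, answer))
```

Spending more samples (more test-time compute) raises the chance that at least one candidate passes the verifier; RL training closes the loop by reinforcing the generator on verified traces.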
Insight 03 What is the difference between Anthropic's technical route and R1?
Dario's interviews suggest that Anthropic's understanding of R1 and reasoning models differs somewhat from OpenAI's o-series. Dario believes the base model and the reasoning model should form a continuous spectrum, not an independent model series as at OpenAI. If you only build an o-series, you will soon hit a ceiling.
I have always wondered why Sonnet 3.5's coding, reasoning, and agentic capabilities improved so much while 4o has not caught up.
The likely answer is that Anthropic did a lot of RL work at the pre-training, base-model stage. The core is to improve the base model itself; otherwise the gains from improving only the reasoning model via RL are quickly exhausted.
Insight 04 DeepSeek's sensation is both inevitable and accidental
"Why Greatness Cannot Be Planned", written by two researchers who were early at OpenAI, describes DeepSeek well too.
From a technical perspective, DeepSeek has the following highlights:
• Open source: Open source matters enormously. After OpenAI turned closed-source starting with GPT-3, the top three giants stopped disclosing technical details, leaving the open-source ecological niche vacant, and Meta and Mistral never held that position firmly. DeepSeek's surprise attack rode smoothly down the open-source track.
If the sensation scores 100 points, the intelligence improvement contributes 30 and open source contributes 70. Llama was open source too but never caused such a sensation, which suggests its level of intelligence was not high enough.
• Cheap: The value of the phrase "your margin is my opportunity" keeps rising.
• Web search + public CoT: These two features give users a great experience. DeepSeek played both cards at once, a killer combination that gave consumer users an experience completely different from other chatbots. The transparency of the CoT in particular, which exposes the model's thinking process, builds user trust in AI and helped it break into the mainstream. In theory Perplexity should have felt a similar impact; in practice DeepSeek's servers were unstable, and the Perplexity team responded quickly by launching R1, absorbing a large share of the overflowing DeepSeek R1 users.
• RL generalization: Although the RL paradigm was first proposed with OpenAI o1, various choices kept it half-hidden and its penetration low. DeepSeek R1 has greatly advanced the reasoning-model paradigm, and the ecosystem's acceptance of it has risen sharply.
DeepSeek's investment in technical exploration is the deterministic factor that makes this achievement in intelligence worth attention and discussion, but the timing of R1's launch made the sensation accidental:
• The United States has long claimed to be far ahead in fundamental technology research, yet DeepSeek is native to China, a highlight in itself. Many American tech giants began promoting the narrative that DeepSeek challenges US technological hegemony, and DeepSeek was passively drawn into a public-opinion war;
• Just before R1's release, OpenAI's $500B Stargate announcement had begun to ferment. The contrast between that enormous investment and the DeepSeek team's efficiency at producing intelligence was too stark not to attract attention and discussion;
• DeepSeek caused Nvidia's stock price to plummet, which fanned public opinion further. They surely did not expect to become the first black swan of the US stock market in early 2025;
• The Spring Festival is a proving ground for products. Many super apps of the mobile-Internet era exploded during the Spring Festival, and the AI era is no exception; R1 was released just before it. What surprised the public was its writing ability, rather than the coding and math skills emphasized during training. Cultural and creative output is easier for the general public to feel, and easier to go viral.
Insight 05 Who is hurt? Who benefits?
The players in this arena fall into three categories: to-consumer (ToC), to-developer, and to-enterprise (and to-government):
1. ToC: Chatbots are hit hardest; their mindshare and brand attention have been snatched by DeepSeek, and ChatGPT is no exception;
2. The impact on developers is very limited. Users have commented that R1 is not as good as Sonnet after trying it, and Cursor has also said Sonnet still outperforms. Perhaps surprisingly, a high proportion of users continue to choose Sonnet, and there has been no large-scale migration;
3. To-enterprise and to-government business is built on trust and understanding of customer needs. Large organizations weigh very complex interests when making decisions and will not migrate as easily as consumer users.
From another angle, consider the question from the perspectives of closed source, open source, and compute:
In the short term, the impression is that the closed-source players OpenAI/Anthropic/Google are hit harder:
• The mystique of the technology has been open-sourced, breaking the mystery premium that mattered most in the AI hype;
• More practically, the market believes some of these closed-source companies' potential customers and market share have been taken, lengthening the payback period on their GPU investments;
• As the leader, OpenAI suffers most. Its earlier dream of keeping the technology secret and closed, hoping to earn a larger technology premium, has now failed.
But in the medium to long term, companies with abundant GPU resources will still benefit. On one hand, second-tier players like Meta can quickly follow the new methods with more efficient capex; Meta may be a big beneficiary. On the other hand, improving intelligence further requires more exploration: DeepSeek's open source has brought everyone to the same level, and entering the next round of exploration will require 10x or even more GPU investment.
From first principles, whether developing intelligence or applying it, the AI industry inevitably consumes massive compute. This is physical in nature, determined by basic laws, and cannot be fully optimized away by engineering.
So whether for exploring intelligence or applying it, even if there are doubts in the short term, compute demand will explode in the medium to long term. This also explains why Musk, reasoning from first principles, has xAI insisting on expanding its cluster; the deep logic behind xAI and Stargate may be the same. Amazon and other cloud vendors have likewise announced higher capex guidance.
Suppose the level of AI research talent and understanding around the world were on par. Then with more GPUs you can run more experimental exploration. In the end, it may still come down to a competition of compute.
As the saying goes, the barefoot are not afraid of those wearing shoes: DeepSeek has no commercial agenda and focuses on exploring AGI. Its open-source move is highly significant for advancing AGI, intensifying competition, and promoting openness, a catfish effect.
Insight 06 Can distillation surpass SOTA?
One detail remains uncertain. If DeepSeek used a large amount of distilled CoT data from the pre-training stage onward, today's result is less amazing: basic intelligence obtained on the shoulders of the first-tier giants, then open-sourced. But if there was no large-scale distilled data in pre-training and DeepSeek trained from scratch to reach today's level, that is amazing.
In addition, distillation is unlikely to surpass SOTA at the base-model level. But DeepSeek R1 is very strong, presumably because its reward model is very good. If the R1-Zero path proves reliable, there is a chance to surpass SOTA.
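As a toy illustration of what distillation means mechanically, here is a generic logit-matching sketch: a "student" fits a "teacher's" temperature-softened output distribution by gradient descent. All numbers, the learning rate, and the tiny 4-token vocabulary are made up; DeepSeek's reported approach centers on distilled CoT text for fine-tuning, not this exact procedure.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax over one list of logits."""
    scaled = [z / T for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    s = sum(exps)
    return [e / s for e in exps]

# Toy "teacher": fixed logits over 4 answer tokens for 3 prompts.
teacher = [[4.0, 1.0, 0.5, 0.2],
           [0.3, 3.5, 0.8, 0.1],
           [0.2, 0.4, 0.3, 3.0]]

# Student starts from arbitrary logits.
student = [[0.0, 0.1, -0.2, 0.3] for _ in teacher]

T = 2.0   # distillation temperature: softens the teacher's distribution
lr = 0.5
for _ in range(500):
    for i, t_logits in enumerate(teacher):
        p_t = softmax(t_logits, T)
        p_s = softmax(student[i], T)
        # Gradient of cross-entropy(teacher, student) wrt student logits is (p_s - p_t) / T.
        student[i] = [z - lr * (ps - pt) / T
                      for z, ps, pt in zip(student[i], p_s, p_t)]

# After fitting, the student's top answer matches the teacher's on every prompt.
for s_logits, t_logits in zip(student, teacher):
    assert max(range(4), key=lambda j: s_logits[j]) == max(range(4), key=lambda j: t_logits[j])
```

The intuition behind the "cannot surpass SOTA" argument is visible here: the student's target is the teacher's own distribution, so in the limit it can only reproduce the teacher, not exceed it.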
Insight 07 No Moat!
Google's earlier internal assessment of OpenAI, "no moat!", fits equally well here.
![](https://img.jinse.cn/7347491_image3.png)
The large-scale migration of chatbot users to DeepSeek has given the market an important lesson: progress in intelligence is so steep that stage-specific products struggle to form an absolute barrier.
Whether it is ChatGPT/Sonnet/Perplexity, which had only just built mindshare and reputation, or developer tools such as Cursor and Windsurf, once a smarter product appears, users show no loyalty to the "previous generation" of smart products. Today it is hard to build a moat at either the model layer or the application layer.
DeepSeek also verified one thing this time: the model is the application. DeepSeek made no innovation in product form; the core is intelligence plus open source. It makes one wonder: in the AI era, is any product or business-model innovation inferior to innovation in intelligence?
Insight 08 Should DeepSeek take on this wave of chatbot traffic and expand it?
From the chatbot's explosion to today, the DeepSeek team's reactions make it clear that DeepSeek has not figured out what to do with this wave of traffic.
The essence of whether to accept and actively operate this traffic is: can a great business company and a great research lab coexist in one organization?
This severely tests energy and resource allocation, organizational capability, and strategic choice. A large company like ByteDance or Meta would instinctively accept it and has the organizational foundation to do so, but as a research-lab organization, DeepSeek must be under enormous pressure in taking on this flood of traffic.
At the same time, we should ask whether this chatbot wave is only a phase. Is the chatbot on the main line of future exploration of intelligence? Each stage of intelligence seems to unlock a corresponding product form, and the chatbot is merely one of the early ones.
For DeepSeek, looking ahead 3-5 years: if it declines the chatbot traffic today, is that a miss? What if a scale effect emerges one day? And if AGI is finally realized, what carrier will it take?
Insight 09 Where will the next Aha moment of intelligent breakthrough come from?
On one hand, the first echelon's next-generation models are critical, but we are at the limits of the Transformer today, and it is uncertain whether the first echelon can deliver a generational leap. If OpenAI, Anthropic, and Google only ship models 30-50% better while consuming 10-30x the resources, that may not be enough to save the situation.
On the other hand, landing agents is even more critical, because agents must perform long-horizon, multi-step reasoning; a model that is 5-10% better has its lead magnified many times over. So OpenAI, Anthropic, and Google must first ship agent products, full-stack integrated model plus agent, like Windows plus Office, and second, they must show more powerful models, next-generation releases such as the full version of o3 or Sonnet 4 / 3.5 Opus.
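The claim that a small per-step edge gets magnified on long-horizon agent tasks can be checked with simple arithmetic: if a task requires every one of k sequential steps to succeed, task-level success is roughly the per-step reliability raised to the k-th power. The reliability numbers below are illustrative, not measurements:

```python
def task_success(per_step: float, steps: int) -> float:
    """Probability that all `steps` sequential steps succeed, assuming independence."""
    return per_step ** steps

base, better = 0.90, 0.945  # hypothetical: a "5% better" model per step

# The absolute gap is small at 1 step, but the ratio widens with horizon length.
for k in (1, 5, 10, 20):
    print(k, round(task_success(base, k), 3), round(task_success(better, k), 3))
```

At 20 steps the better model completes the whole task roughly 2.7x as often (0.945^20 is about 0.32 versus 0.9^20 at about 0.12), which is why modest model gains matter disproportionately for agents.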
Under technological uncertainty, talented AI researchers are the most valuable asset. Any organization that wants to explore AGI must bet resources aggressively on the next paradigm. Especially now that the pre-training stage has been pulled level, it takes good talent plus sufficient resources to explore the next Aha moment of emergent intelligence.
Insight 10 This wave of DeepSeek developments has made me more confident in China's AI talent. It is very encouraging
Finally, I hope technology will be borderless.