In "The Cathedral and the Bazaar", Eric Raymond described two kinds of buildings: the bazaar, open every day and growing bit by bit from small to large, and the cathedral, which takes generations of painstaking work and decades to complete.
The story of Linux is closer to building a cathedral the bazaar way. Today, in the field of generative artificial intelligence, more and more open source models are adding new cases to this mode of construction.
Alibaba Cloud is a firm believer in open source models. The number of derivative models of its Tongyi Qianwen (Qwen) family has already exceeded 100,000, surpassing the American Llama family and making Tongyi the world's No. 1 open source AI model family.
In the early morning of April 29, Alibaba released Qwen3 (hereafter Qianwen 3), the new generation of its Tongyi Qianwen models. With 235B total parameters, it is only about one-third the size of DeepSeek-R1, and its cost has dropped sharply.
Qianwen 3 is reportedly the first "hybrid reasoning model" in China, integrating "fast thinking" and "slow thinking" into a single model. For simple requests it returns answers within seconds at low compute cost; for complex problems it can "think deeply" through multiple steps, greatly reducing overall compute consumption.
Since 2023, the Alibaba Tongyi team has open-sourced more than 200 models, including two major foundation model series: the large language model Qianwen (Qwen) and the visual generation model Wanxiang (Wan). The releases span all modalities (text generation, visual understanding and generation, speech understanding and generation, and text-to-video models) and cover parameter sizes from small to large to meet the needs of different devices.
Qianwen 3 has 235B total parameters, of which only 22B are activated per inference. Its pre-training corpus reaches 36T tokens, and multiple rounds of reinforcement learning in the post-training stage seamlessly integrate the non-thinking mode into the thinking model.
The deployment cost of Qianwen 3 has also dropped significantly. The full version can be deployed on just four H20 GPUs, and its memory footprint is only one-third that of models with comparable performance.
What does Alibaba's open source release mean for the industry? How capable are open source models? And where will the competition among large models go next?
01 The capabilities of open source large models are catching up
The capabilities of open source large models are catching up with closed source models.
This is the consensus the author heard after speaking with many AI entrepreneurs, large model developers, and investors.
They agree that closed source models still hold the lead, but the gap between open source and closed source models is narrowing, and faster than the industry expected.
"The closed source model first achieved 90 points, but now the open source model can also achieve 90 points." A large model developer said. Scaling Law always has bottlenecks. This bottleneck reflects that the larger the model, the more capabilities and costs will increase exponentially, so it gives open source models time to catch up.
What exactly does an open source model open up? How does it differ from open source software, and from a closed source model?
Open source software usually discloses its entire source code, which developers can inspect and modify, making it easy to reproduce the corresponding functionality. Open source models, by contrast, generally release only the model weights; what data was used, how the model was fine-tuned, and how it was aligned are hard to know. A closed source model, meanwhile, ships as a complete solution. Think of it this way: an open source model hands you the raw ingredients, and the chef must bring his own tools, recipes, and methods, so whether the dish turns out well depends entirely on the chef's skill; a closed source model is a pre-made dish that only needs reheating.
The advantage of an open source model is that it lets more developers take part in its development, helping improve performance and build out the ecosystem, and it offers strong flexibility. This saves model companies a great deal of manpower and time, and for the parties using the model it is also a way to cut costs.
That said, the cost advantage of open source models shows up mainly in the early stage. By one estimate, the closed source GPT-4 costs about $10 per million input tokens and about $30 per million output tokens, while the open source Llama-3-70B costs about 60 cents per million input tokens and about 70 cents per million output tokens, more than an order of magnitude cheaper for a small difference in performance. Subsequent deployment, however, demands serious technical strength and investment.
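To make the price gap concrete, here is a minimal back-of-the-envelope sketch using the per-million-token prices cited above; the workload mix is a made-up assumption and real prices vary by provider.

```python
# Back-of-the-envelope cost comparison using the per-million-token prices
# cited in the text. Prices vary by provider; the workload mix is hypothetical.

PRICES_USD_PER_M_TOKENS = {
    # model: (input price, output price), both per 1M tokens
    "GPT-4": (10.00, 30.00),
    "Llama-3-70B": (0.60, 0.70),
}

def workload_cost(model: str, input_m_tokens: float, output_m_tokens: float) -> float:
    """Cost in USD for a workload measured in millions of input/output tokens."""
    price_in, price_out = PRICES_USD_PER_M_TOKENS[model]
    return input_m_tokens * price_in + output_m_tokens * price_out

# Hypothetical monthly workload: 500M input tokens, 100M output tokens.
for name in PRICES_USD_PER_M_TOKENS:
    print(f"{name}: ${workload_cost(name, 500, 100):,.2f}")
# GPT-4:       500*10  + 100*30  = $8,000.00
# Llama-3-70B: 500*0.6 + 100*0.7 = $370.00   -> roughly 20x cheaper on this mix
```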
However, Alibaba's newly released Qianwen 3 is also chipping away at this cost problem. In terms of deployment cost, Qianwen 3 runs at roughly 25% to 35% of that of the full version of DeepSeek-R1, a cost reduction of about 60% to 70%. The flagship Qianwen 3 model has 235B total parameters with 22B activated, which requires roughly four H20 GPUs or cards of equivalent performance. By comparison, the full version of DeepSeek-R1 has 671B total parameters with 37B activated; an 8-card H20 machine (about 1 million yuan) can run it but is tight, so a 16-card H20 setup, around 2 million yuan in total, is generally recommended.
From the perspective of inference, Qianwen 3's hybrid reasoning design lets developers set their own "thinking budget", enabling finer-grained control over how much the model thinks while still meeting performance requirements, which naturally lowers overall inference cost. For reference, the price gap between the reasoning and non-reasoning modes of the comparable Gemini-2.5-Flash is about 6x, so using the non-reasoning mode cuts compute cost to roughly one-sixth.
A large model developer at a major company told Silicon Rabbit that open source models suit teams with strong technical skills but limited budgets, such as academic institutions, while closed source models suit companies with small teams and more money. Even so, as open source models improve, 41% of surveyed companies plan to increase their use of open source models, and another 41% say they would switch to open source if its performance were comparable to closed source; only 18% of companies in the survey have no plan to increase their use of open source LLMs.
a16z co-founder Marc Andreessen has said that open source lets universities back into the race. Researchers worry, first, that no single university has the funding to compete in AI and stay relevant, and second, that even all universities combined cannot match the fundraising power of the large companies. As open source models multiply and their capabilities improve, universities can build their research on them, and the same logic applies to small companies without deep pockets.
Graphic by Silicon Rabbit
02 The Eastern inspiration of large models
The emergence of DeepSeek has shown many people what Chinese companies can do with open source models.
"Deep Seek represents lightweight, low-cost AI products." A Chinese and American AI investor said that, for example, the adjustment of the mixed expert model (MoE) requires extremely high craftsmanship. In the past, not many mainstream models used MoE because it was difficult, but "the child did not believe in evil" and made it happen.
But what matters most for an open source model is its ecosystem, that is, how many people use it, because switching models is costly for users. Yet when DeepSeek came out, some Silicon Valley users of Meta's models switched to it anyway. "The latecomer must offer a big enough advantage over the first mover," a large model developer said; that is what persuades users to write off their initial investment and move to a new open source model.

Silicon Rabbit reviewed the open source and closed source status of the world's well-known models and found that, with the exception of Amazon, companies including Microsoft, Google, Meta, and OpenAI all have open source models. Some take a purely open source route; others pursue open source and closed source in parallel. In China, Alibaba is the most determined company on the open source path: it had bet on and laid out open source models well before DeepSeek released R1.
Open source status of world-renowned models
Source: Public information compiled by Silicon Rabbit
Qianwen 3, released on the 29th, is the latest generation of large language models in the Tongyi Qianwen series, offering both dense and mixture-of-experts (MoE) models. It makes breakthrough progress in reasoning, instruction following, agent capabilities, and multilingual support, with the following features:
1) Unique hybrid reasoning: supports seamless switching between thinking mode (for complex logical reasoning, mathematics and coding) and non-thinking mode (for efficient general conversations), ensuring optimal performance in various scenarios.
2) Significantly enhanced reasoning capabilities: It surpasses the previous QwQ (in thinking mode) and the Qwen2.5-Instruct model (in non-thinking mode) in mathematics, code generation, and common sense logical reasoning.
3) Better alignment with human preferences: It performs well in creative writing, role-playing, multi-round dialogue, and instruction following, providing a more natural, engaging, and immersive conversation experience.
4) Outstanding agent capabilities: It can accurately integrate external tools in both thinking and non-thinking modes, and leads among open source models on complex agent-based tasks.
5) Powerful multilingual capabilities: It supports 119 languages and dialects, with strong multilingual instruction following and translation.
The "hybrid reasoning" mentioned here is equivalent to integrating the top reasoning models and non-reasoning models into the same model, which requires extremely sophisticated and innovative design and training. At present, among the popular models, only Qianwen 3, Claude3.7 and Gemini 2.5 Flash can do this.
Specifically, in the "reasoning mode", the model will perform more intermediate steps, such as decomposing the problem, deducing step by step, verifying the answer, etc., to give a more thoughtful answer; while in the "non-reasoning mode", the model will directly generate the answer. The same model can complete "fast thinking" and "slow thinking", which is similar to humans answering simple questions quickly based on experience or intuition, and then thinking carefully and giving answers when facing complex problems. Qianwen 3 can also set the "thinking budget" (i.e. the expected maximum number of thinking tokens) through the API, and perform different levels of thinking, so that the model can achieve a better balance between performance and cost to meet the diverse needs of developers and institutions.
Performance of Qwen3
For China, open source models can also attract more customers than closed source ones: a closed source model would be largely confined to the domestic market, while open source lets foreign companies adopt it as well. Perplexity, for example, is an American company, yet DeepSeek R1 is available on Perplexity, hosted entirely in the United States in US data centers.
03 The second half of large models
In March 2023, at an open source AI event at the Exploratorium in San Francisco, alpacas strolled around the venue as a nod to Meta's open source large language model LLaMA.
In the time since 2023, generative AI has kept evolving, and public attention has shifted from foundation models to AI-native applications. At YC's W25 Demo Day, 80% of the projects were AI applications.
"Open source models will promote the implementation of more agents."Many industry insiders expressed this view to Silicon Rabbit. On the one hand, open source will reduce the cost and threshold of use.
For example, Qianwen 3 has strong tool-calling ability, setting a new record of 70.76 on the Berkeley Function-Calling Leaderboard (BFCL), which will substantially lower the barrier for agents to call tools. It can also be paired with the open source Qwen-Agent framework to fully exploit Qwen3's agent capabilities. Qwen-Agent is a framework for building LLM applications on top of Qwen's instruction following, tool use, planning, and memory abilities. It encapsulates tool-call templates and tool-call parsers and ships with sample applications such as a browser assistant, a code interpreter, and custom assistants, greatly reducing the coding effort. Qianwen 3 natively supports the MCP protocol: developers who want to define the available tools can use Qwen-Agent's built-in tools or plug in others via an MCP configuration file, and quickly build an agent with configuration, knowledge-base RAG, and tool-use capabilities, as sketched below.
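The following is a minimal sketch of that pattern, adapted from the published Qwen-Agent examples; the model name, endpoint, MCP server entries, and configuration keys shown here are assumptions to verify against the Qwen-Agent documentation.

```python
# Minimal Qwen-Agent sketch adapted from the project's published examples.
# The model name, endpoint, and MCP server entries below are illustrative
# assumptions; check them against the current Qwen-Agent documentation.
from qwen_agent.agents import Assistant

llm_cfg = {
    "model": "qwen3-235b-a22b",  # assumed model id on the hosted endpoint
    "model_server": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "api_key": "YOUR_API_KEY",
}

tools = [
    # Tools exposed through MCP servers (launched on demand via uvx)...
    {"mcpServers": {
        "time": {"command": "uvx", "args": ["mcp-server-time"]},
        "fetch": {"command": "uvx", "args": ["mcp-server-fetch"]},
    }},
    # ...plus one of Qwen-Agent's built-in tools.
    "code_interpreter",
]

bot = Assistant(llm=llm_cfg, function_list=tools)

messages = [{"role": "user", "content": "Fetch https://qwenlm.github.io/blog/ and summarize it."}]
responses = []
for responses in bot.run(messages=messages):  # streams intermediate agent steps
    pass
print(responses)  # final list of assistant/tool messages
```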
Moreover, the Qianwen 3 series comes in models of different sizes, which makes it friendlier to deployment on smart devices and in scenarios such as mobile phones, smart glasses, intelligent driving, and humanoid robots. Any company can download and commercialize the Qianwen 3 models for free, which will greatly accelerate the adoption of large AI models on end devices.
In addition, some practitioners point out that closed source models have not solved the trust problem on the enterprise (To B) side. Many large companies are reluctant to connect their business to a third-party large model's API, because the underlying question is whether their core data will end up in that third party's training data. This, too, is an opportunity for open source models.
There is a saying that open source is a marketing strategy for an early product before its beta: when you don't know what tomorrow will look like, open it up first to attract developers; once people use it, best practices emerge, and an ecosystem builds up around you.
However, because the business chain of an open source model is longer, it does not monetize as quickly or as clearly as a closed source one. Industry insiders therefore say the open source model is a game better suited to the "rich second generation", players with money and resources behind them.
For Meta, the open source model is mainly about building an ecosystem and supporting its other business lines. Alibaba's open source logic is tied to its cloud business: it has strong cloud infrastructure on which it trains large models, it can deploy those models on its own cloud, and it can even customize dedicated models for customers' deployments, closing the business loop that way.
"My model is to let large companies, small companies and open source compete with each other. This is what happened in the computer industry." Mark Andreessen once said. As large models gradually become standardized products like water, electricity and coal, open source may be more suitable for the future direction.