First, the system tells the AI the current time, its account information, and its position status, then attaches a large amount of real-time data such as prices and indicators (MACD, RSI, and so on). The model is then asked to make a decision: if it holds a position, should it keep holding or close it? If it holds no position, should it buy or wait and see?

It has to be said that financial markets change very quickly. DeepSeek is truly strong at trading, as befits its quantitative background.
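The prompt-and-decision loop described above can be sketched roughly as follows. This is a minimal illustration, not Alpha Arena's actual code: the names `Position`, `build_prompt`, `allowed_actions`, and `parse_decision` are hypothetical, the prompt layout is invented, and the sketch assumes the second case in the text means the model currently holds no position.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Position:
    """Hypothetical record of one open position."""
    symbol: str
    quantity: float
    entry_price: float


def allowed_actions(has_position: bool) -> list[str]:
    """Holding a position: keep it or close it. No position: buy or wait."""
    return ["hold", "close"] if has_position else ["buy", "wait"]


def build_prompt(now: datetime, cash: float,
                 positions: dict[str, Position],
                 market: dict[str, dict[str, float]]) -> str:
    """Assemble one text prompt: time, account, then per-symbol status and data."""
    lines = [f"Current time: {now.isoformat()}",
             f"Available cash: ${cash:,.2f}"]
    for symbol, data in market.items():
        pos = positions.get(symbol)
        status = (f"holding {pos.quantity} @ {pos.entry_price}"
                  if pos else "no position")
        indicators = ", ".join(f"{name}={value}" for name, value in data.items())
        options = " or ".join(allowed_actions(pos is not None))
        lines.append(f"{symbol}: {status}; {indicators}; decide: {options}")
    return "\n".join(lines)


def parse_decision(reply: str, has_position: bool) -> str:
    """Accept the model's reply only if it is one of the allowed actions."""
    action = reply.strip().lower()
    if action not in allowed_actions(has_position):
        raise ValueError(f"unexpected action: {reply!r}")
    return action


if __name__ == "__main__":
    # All values below are made up for illustration.
    positions = {"BTC": Position("BTC", 0.1, 107000.0)}
    market = {"BTC": {"price": 108000.0, "RSI": 55.0},
              "ETH": {"price": 3900.0, "MACD": -1.2}}
    print(build_prompt(datetime(2025, 10, 20, 7, 30), 5000.0, positions, market))
    print(parse_decision(" Hold ", has_position=True))
```

Restricting the reply to a small set of actions per symbol keeps the model's free-form output from being executed as an arbitrary trade.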

At 7:30 am on October 20th, the situation still looked like the left side of the picture below:
DeepSeek V3.1 ranked first with a profit of $2,264, and Grok 4 ranked second with $2,071. Claude Sonnet 4.5 made a small profit of $649, and Qwen3 Max lost a small amount of $416.
Gemini 2.5 Pro lost $3,542 and ranked last, while GPT-5 lost $2,419 and ranked second from the bottom.
Then, by 10:00, just two and a half hours later, it looked like the right side of the picture below:
DeepSeek V3.1 and Grok 4 had plummeted, and Claude Sonnet 4.5 was about to lose all of its profits.
Qwen3 Max and GPT-5 both showed an upward trend.
Gemini 2.5 Pro performed steadily, but lost nearly $800 more than before.
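The 7:30 standings quoted above can be reproduced by sorting the reported profit and loss figures. The sketch below also converts them to percentage returns under the assumption that each model started with $10,000; that figure is not stated in this section, though it is consistent with Claude's 5.2% return on a $10,520.34 account in the appendix. The function name `leaderboard` is hypothetical.

```python
# Assumption: every model started with the same capital of $10,000.
STARTING_CAPITAL = 10_000.0

# Profit and loss in USD at 7:30 am on October 20th, as reported in the text.
pnl_0730 = {
    "DeepSeek V3.1": 2264.0,
    "Grok 4": 2071.0,
    "Claude Sonnet 4.5": 649.0,
    "Qwen3 Max": -416.0,
    "GPT-5": -2419.0,
    "Gemini 2.5 Pro": -3542.0,
}


def leaderboard(pnl, capital=STARTING_CAPITAL):
    """Return (rank, model, pnl_usd, return_pct) tuples, best performer first."""
    ranked = sorted(pnl.items(), key=lambda item: item[1], reverse=True)
    return [(rank, name, value, round(value / capital * 100, 2))
            for rank, (name, value) in enumerate(ranked, start=1)]


for rank, name, value, pct in leaderboard(pnl_0730):
    print(f"{rank}. {name:<18} {value:>+9,.0f} USD  {pct:+.2f}%")
```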


DeepSeek V3.1 tops the list; Google and OpenAI are at the bottom
Model holdings

At 11:45, all models except GPT-5 saw a surge.
Yes, Gemini 2.5 Pro finally made money! (Compared with a few minutes earlier, that is.)

Trend Review
DeepSeek V3.1 Chat and Grok 4 have similar curves, which suggests they hold similar positions. After losing money in the first few hours, both quickly recovered and kept climbing. Claude Sonnet 4.5 was very stable over the first two days, with only small profits. It peaked on the evening of the 19th, then fell back in the early morning of the 20th.

Qwen3 Max lost the most at the beginning, but then stabilized. Even on the afternoon of the 19th, it showed no fluctuation.

The curves of GPT-5 and Gemini 2.5 Pro were also very similar in their early stages. Unlike DeepSeek, however, these two surged at first before falling back and oscillating between losses and break-even.

A turning point came on the afternoon of the 19th, when DeepSeek and Grok 4 began to surge while GPT-5 and Gemini 2.5 Pro began to decline. In the early morning hours of the 20th, GPT-5 adjusted in time and stabilized, while Gemini 2.5 Pro continued its plunge.

Notably, near noon on the 20th, all models except GPT-5 surged in value. DeepSeek V3.1 Chat and Grok 4 quickly set new all-time highs, Qwen3 Max capitalized on the momentum to achieve sustained gains for the first time, and Gemini 2.5 Pro also began to recover.

Trading History
As of 12:20 PM on the 20th, the number of trades by model was as follows: Gemini 45, GPT 10, Qwen 6, DeepSeek 5, Claude 3, Grok 1.
DeepSeek's trade count was neither high nor low, yet, as expected given its quantitative trading background, its returns remained the highest.
Grok 4 made the fewest trades, just one, yet it consistently ranked a close second behind DeepSeek.
And Gemini 2.5 Pro, the "micromanagement master" with a record 45 trades, also suffered the largest losses.

No games, no benchmarks, just open the market!
For years, AI has been measured by static benchmarks.
ImageNet, MMLU, and countless leaderboards tell us which model can better "understand" images, logic, or language.
But all of these tests share a common flaw: they take place in sterile, predictable environments.
The market is just the opposite.
Financial markets are the ultimate world-modeling engine and the only benchmark whose difficulty increases in tandem with smarter AI.
They fluctuate, react, punish, and reward.

They are living systems composed of information and emotion.
Ten years ago, DeepMind revolutionized artificial intelligence research.
Their core insight was that games would become the environment driving the rapid development of cutting-edge AI.

As mentioned earlier, Nof1 believes that financial markets are the optimal training environment for the next era of AI. After all, if AI is to function in the real world, it must operate in environments that do not pause for backpropagation. Here, models can use techniques such as open-ended learning and large-scale reinforcement learning to obtain nearly unlimited training data, and so learn to navigate the complexity of the market, the "ultimate boss" of this field.

In Alpha Arena, there are no correct labels, only ever-changing probabilities. A model's success depends on how quickly it interprets volatility, how accurately it weighs risk, and how humbly it admits mistakes. This turns trading into a new kind of Turing test: the question is no longer whether a machine can think, but whether it can survive uncertainty.

Appendix: Model Conversation Excerpt

DeepSeek Chat V3.1
11:15:31
I continue to hold all of my ETH, SOL, XRP, BTC, DOGE, and BNB positions, since none of the invalidation conditions have been triggered and they are all currently profitable. I've achieved a respectable overall return of approximately 17.38%, leaving me with $2,840.11 in cash. Given that I already hold all available tradable currencies, there's no need to open new positions at this time.
Qwen3 Max
11:14:42
The total return is a slight profit of 0.12%, leaving me with $116.99 in cash. I'm holding onto my current 5x leveraged BTC long position. My profit target is 110,800, with a stop-loss at 105,500. I'll close the position if the 4-hour chart closes below 10,000.
Grok 4
11:14:39
My account is up 15.98% overall, with $3,180 in cash. I'm holding onto my current positions in ETH, SOL, XRP, BTC, DOGE, and BNB as the data suggests a continued hold.
GPT-5
11:14:24
Losing 23.59%, with $3,061 in cash, I'm confirming my exit plan, ensuring DOGE exposure is at a float of $91.76, and double-checking my stop-loss at $106,493.55 and profit target at $114,129.91 on BTC.
Gemini 2.5 Pro
Despite a 43.15% overall loss and $3,072.48 in cash remaining, I continue to hold all of my current BTC, ETH, SOL, XRP, and DOGE positions according to my strategy, even if DOGE moves against me. I will only exit based on pre-defined conditions, trusting the plan rather than my intuition.
Claude Sonnet 4.5
11:14:00
My XRP and BTC positions are currently profitable, giving me an overall return of 5.2%, with $10,520.34 in my account and $4,957.78 in cash, but the market is showing very weak momentum, so I am choosing to hold off and wait for a clearer trend before making any new moves.
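
Several of the excerpts above describe an exit plan built from a profit target and a stop-loss. As a rough sketch of how such a plan might be checked against an incoming price, the example below uses the two levels Qwen3 Max quotes for its BTC long (110,800 and 105,500). The function `check_exit` and the sample prices are hypothetical, and the sketch covers long positions only.

```python
def check_exit(price: float, profit_target: float, stop_loss: float) -> str:
    """For a long position, report which exit (if any) the price triggers."""
    if stop_loss >= profit_target:
        raise ValueError("for a long position the stop-loss must be below the target")
    if price >= profit_target:
        return "take_profit"
    if price <= stop_loss:
        return "stop_loss"
    return "hold"


# Levels quoted in the Qwen3 Max excerpt for its BTC long position.
QWEN_PROFIT_TARGET = 110_800.0
QWEN_STOP_LOSS = 105_500.0

# Hypothetical prices, chosen only to exercise each branch.
for price in (104_900.0, 108_000.0, 111_000.0):
    print(price, check_exit(price, QWEN_PROFIT_TARGET, QWEN_STOP_LOSS))
```

Keeping the exit levels fixed in advance mirrors the "trust the plan rather than intuition" stance voiced in the excerpts: the check depends only on the price and the pre-defined levels.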