Musk releases Grok-3: Surpasses DeepSeek in multiple tests and demonstrates competitiveness

2025/02/18 13:24

Source: AI Faner

xAI today released its new-generation large language model, Grok-3, along with a streamlined version, Grok-3 mini. The latest benchmark results show Grok-3 holding significant advantages in direct comparisons with DeepSeek's models.

On the mathematics benchmark (AIME'24), Grok-3 scored 52 points, significantly higher than DeepSeek-V3's 39. On the scientific knowledge benchmark (GPQA), Grok-3 leads with 75 points to DeepSeek-V3's 65. On the programming benchmark (LCB Oct-Feb), Grok-3 again outscored DeepSeek-V3, 57 points to 36.

In the latest AIME 2025 test, the Grok-3 Reasoning Beta version scored 93 points on the composite measure of reasoning and computing time, and the streamlined Grok-3 mini reached 90 points. By comparison, DeepSeek-R1 scored 75 points and Gemini-2 Flash Thinking only 54. The result further underlines Grok-3's advantages in complex mathematical reasoning and computational efficiency.

It is particularly noteworthy that DeepSeek's recently released DeepSeek-R1 also failed to surpass Grok-3 in the other reasoning tests. In mathematical reasoning, Grok-3 scored 93 points to DeepSeek-R1's 73; in scientific reasoning, 85 points to 74; and in programming reasoning, 79 points to 65.
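To put the reported gaps on a common footing, the short Python sketch below tabulates the scores quoted above and computes each absolute and relative margin. The figures are copied from this article. The dictionary labels and the `margins` helper are illustrative names chosen here, not part of any benchmark tooling, and the sketch assumes all scores are on comparable point scales.

```python
# Scores as quoted in the article: (Grok-3, DeepSeek model).
# Labels are illustrative; they follow the article's wording.
REPORTED = {
    "AIME'24 vs DeepSeek-V3": (52, 39),
    "GPQA vs DeepSeek-V3": (75, 65),
    "LCB Oct-Feb vs DeepSeek-V3": (57, 36),
    "Math reasoning vs DeepSeek-R1": (93, 73),
    "Science reasoning vs DeepSeek-R1": (85, 74),
    "Programming reasoning vs DeepSeek-R1": (79, 65),
}


def margins(scores):
    """Return {label: (absolute_gap, relative_gap_percent)} for each pair."""
    result = {}
    for label, (grok, deepseek) in scores.items():
        gap = grok - deepseek
        # Relative gap is measured against the DeepSeek score.
        result[label] = (gap, round(100 * gap / deepseek, 1))
    return result


if __name__ == "__main__":
    for label, (gap, pct) in margins(REPORTED).items():
        print(f"{label}: +{gap} points ({pct}% higher)")
```

Read this way, the widest reported gap is on the programming benchmark (LCB Oct-Feb) and the narrowest is in scientific reasoning.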

In addition, in the LMSYS Chatbot Arena evaluation, Grok-3 scored about 1,400 points, ahead not only of the DeepSeek series but also of other mainstream large models, including GPT-4 and Claude.

These figures show that although DeepSeek has gained strong momentum over the past few months, Grok-3 still holds the overall lead. Its advantages in mathematical reasoning and computing efficiency are especially clear, reflecting both xAI's technical strength in model research and development and the fierce competition in the AI field.



