Grok 3 shatters AI benchmarks as Musk's xAI takes aim at OpenAI

Days after OpenAI revealed plans to ship GPT-4.5 and GPT-5, Musk’s xAI launched an early version of Grok 3, dubbed ‘Chocolate,’ which instantly took the top spot on the lmarena (formerly LMSYS) Chatbot Arena leaderboard — widely accepted as the leading ranking system for AI benchmarking.
In addition to Grok 3, xAI is looking to rival OpenAI’s attempt to usurp Google as the search king with its own AI-powered search tool, dubbed Deep Search.
xAI’s latest Grok model will be made available to US-based premium X subscribers and will also be accessible via a separate subscription for its web and app versions.
xAI used its mammoth Colossus supercomputer cluster in Memphis, Tennessee to power Grok 3’s training efforts — with the startup working on doubling the facility’s 100,000 Nvidia GPUs, a number that could rise to one million, according to local business leaders.
Musk said the model was the “smartest” AI model on Earth and that “if all goes well” it will be integrated into SpaceX Starship rockets going to Mars “in two years”.
Results for the initial Chocolate version of Grok 3 saw it become the first-ever AI model to achieve a score of 1,400 on the leaderboard.
For comparison, OpenAI’s GPT-4o and o1 scored only 1,377 and 1,353, respectively. Grok 3 toppled previous lmarena leader Gemini 2.0 Flash, which held a score of 1,385.
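To put those gaps in perspective, the short sketch below converts a score difference into an expected head-to-head preference rate. It assumes the leaderboard scores sit on a conventional Elo-style scale (base 10, 400-point divisor); the article does not state this. The function name `expected_win_rate` is illustrative, not part of any lmarena tooling.

```python
def expected_win_rate(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Chance that model A is preferred over model B.

    Assumes an Elo-style logistic curve (base 10, 400-point scale),
    which is an assumption about how the leaderboard scores behave.
    """
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / scale))


# Scores as reported in the article.
GROK_3 = 1400
rivals = {"Gemini 2.0 Flash": 1385, "GPT-4o": 1377, "o1": 1353}

for name, score in rivals.items():
    rate = expected_win_rate(GROK_3, score)
    print(f"Grok 3 vs {name}: {rate:.1%} expected preference rate")
```

Under that assumption, each of the reported leads works out to a preference rate only a few percentage points above an even split.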

Responding to Grok’s performance results, Musk said: “All you need to know to understand which company will win a technology competition is look at the first and second derivatives of the rate of innovation.”
He previously said Grok 3 was due to launch in December 2024, claiming it would be on par with or even surpass the highly anticipated capabilities of OpenAI's GPT-5.
The new model was trained on a trove of data, which Musk said included court cases and legal doctrines, suggesting the model could “render extremely compelling legal verdicts”.
“Grok 3 is an order of magnitude more capable than Grok 2,” Musk said during a livestream reveal. “[The model] is maximally truth-seeking AI, even if that truth is sometimes at odds with what is politically correct.”
Musk has been gunning for xAI and its Grok models to rival OpenAI, a startup he co-founded but left after supposedly becoming disillusioned with its shift away from its non-profit roots.
Musk is currently spearheading a consortium attempting to buy OpenAI. While its initial offer of $97.4 billion was rejected, Musk’s lawyer, Marc Toberoff, said the group would raise its bid to match any others.
Days before Grok 3’s launch, OpenAI’s Sam Altman revealed that the company’s next release will be GPT-4.5, a model known internally as ‘Orion’.
He revealed that OpenAI is also working to “unify” its o-series models and GPT-series models to create “systems that can use all our tools, know when to think for a long time or not, and generally be useful for a very wide range of tasks”.
“In both ChatGPT and our API, we will release GPT-5 as a system that integrates a lot of our technology, including o3,” Altman said. “We will no longer ship o3 as a standalone model.”
He also confirmed that the long-awaited GPT-5 will be made available on the free version of ChatGPT, while Plus and Pro subscribers will be able to run it at a higher level of intelligence.