On 18 July, United States (US)-based artificial intelligence (AI) firm OpenAI announced in a blog post the launch of a new, cost-effective generative AI model called "GPT-4o mini."
This scaled-down version is designed to enable more companies and programs to leverage its advanced AI capabilities.
Introducing GPT-4o mini
The GPT-4o mini model has a knowledge cutoff of October 2023, matches GPT-4o in the languages it supports, and features a context window of 128k tokens.
This new model supports many of the same functions as GPT-4o, currently offering text and vision modalities through the API, with plans to add video and audio input/output in the future.
Although the exact parameter scale was not disclosed, OpenAI's official blog post states:
“This is our most economical and cost-effective small model at the moment, and fine-tuning will be available soon.”
Remarkably, GPT-4o mini outperforms GPT-4 in chat preference on the LMSYS rankings and is comparable to GPT-4 Turbo in the overall rankings.
Before its release, the early version, "upcoming-gpt-mini," drew more than 6,000 user votes, but those results have since been removed.
LMSYS has announced on X that it is collecting votes again and will soon release the results for the official model.
The release of GPT-4o mini is set to significantly broaden the scope of AI applications.
It is not only low-cost and low-latency but also supports a wide array of tasks, including applications that chain or parallelise multiple models (calling multiple APIs), pass extensive context to models (such as a full code base or conversation history), or interact with customers through rapid, real-time text responses (supporting chatbots).
Additionally, it processes non-English texts more cost-effectively, thanks to its improved tokeniser shared with GPT-4o.
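The chaining and parallelising pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not OpenAI's SDK: `call_model` is a hypothetical stub standing in for a real API request, so the sketch stays self-contained and runnable.

```python
from concurrent.futures import ThreadPoolExecutor

def call_model(prompt: str) -> str:
    # Hypothetical stub for a real API call to a model such as GPT-4o mini;
    # it simply echoes the prompt so the example runs without credentials.
    return f"response to: {prompt}"

# Parallelise independent calls, then chain their outputs into a final call.
with ThreadPoolExecutor() as pool:
    summaries = list(pool.map(call_model, ["summarise doc A", "summarise doc B"]))

final = call_model("combine: " + " | ".join(summaries))
print(final)
```

In a real application, each stubbed call would be a request to the chat completions endpoint, with the cheap model handling the high-volume parallel steps.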
In terms of text intelligence and multimodal reasoning, GPT-4o mini surpasses GPT-3.5 Turbo and other small models, supporting all languages that GPT-4o does.
It also demonstrates improved long context processing performance compared to GPT-3.5 Turbo and performs well in function calls, making it more convenient for developers to build applications.
It remains unclear whether the mini model offers any environmental benefits over other models.
OpenAI has not provided information on the methodology used to reduce running costs, suggesting that the benefits may not extend to actual energy savings but could instead apply to end-user cost savings.
According to OpenAI, the tradeoff between power and performance is minimal.
Despite its smaller energy footprint, GPT-4o mini does not appear to lack performance.
OpenAI's blog post states that the new model is "an order of magnitude more affordable than previous frontier models" and "more than 60% cheaper than GPT-3.5 Turbo."
The company writes:
“GPT-4o mini surpasses GPT-3.5 Turbo and other small models on academic benchmarks across both textual intelligence and multimodal reasoning and supports the same range of languages as GPT-4o.”
GPT-4o mini is priced at 15 cents per million input tokens and 60 cents per million output tokens.
One million tokens is roughly equivalent to a 2,500-page book.
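At those published prices, per-request costs are easy to estimate. The following sketch assumes the rates quoted above; the token counts in the example are illustrative, not from OpenAI.

```python
# GPT-4o mini API prices quoted above:
# $0.15 per 1M input tokens, $0.60 per 1M output tokens.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.60

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one API call."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# e.g. a request with 10k input tokens and 1k output tokens:
cost = estimate_cost(10_000, 1_000)
print(f"${cost:.4f}")  # → $0.0021
```

Even a fairly large request, at these rates, costs a fraction of a cent, which is the point of positioning the model for high-volume applications.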
This model is positioned as a low-cost, high-performance option, second in price only to Llama 3 8B.
As seen in the table below, among all the small models currently released by leading manufacturers, GPT-4o mini surpasses many competitors, such as Gemini 1.5 Flash, Llama 3 8B, and Mistral 7B, making it the most cost-effective option.
Olivier Godement, a product manager at OpenAI responsible for the new model, said:
“The whole point of OpenAI is to build and distribute AI safely and make it broadly accessible. Making intelligence available at a lower cost is one of the most efficient ways for us to do that.”
Godement explained that OpenAI developed a cheaper offering by improving the model architecture and refining the training data and regimen.
He stated that GPT-4o mini outperforms other "small" models on the market in several common benchmarks.
He added:
“I think GPT-4o mini truly realizes OpenAI’s mission – to make AI more widely accessible to people. If we want AI to benefit every corner of the world, every industry, and every application, we must make AI cheaper.”
He conceded that customers' needs are evolving:
“What we see more and more from the market is developers and businesses combining small and large models to build the best product experience at the price and the latency that makes sense for them.”
Godement said OpenAI’s cloud offerings provide customers with models that have gone through more security testing than competitors'. He added that OpenAI could eventually develop models that customers can run on their own devices.
He concluded:
“If we see massive demand, we may open that door.”
Is GPT-4o mini like Apple's iPhone SE?
GPT-4o mini is the cost-effective iteration of OpenAI's flagship product, ChatGPT.
Drawing a parallel to Apple's frequent releases of the iPhone—ranging from the iPhone 3G to the latest iPhone 15 Pro Max—OpenAI appears to be adopting a similar strategy with ChatGPT.
This raises a pertinent question: will OpenAI's approach lead to substantial price increases while providing minimal or subpar upgrades, akin to some criticisms of the iPhone's incremental updates?
OpenAI Steadily Introducing New Features Amidst Competition
The launch of GPT-4o mini coincides with a flurry of activity, both from OpenAI and directed at the company.
OpenAI is reportedly developing an AI model named "Strawberry," which is expected to exhibit advanced reasoning capabilities beyond GPT-4o, delivering more human-like responses.
This new model is rumoured to be an extension of the company's enigmatic Q* project.
On a different note, OpenAI may find itself under scrutiny by the US Securities and Exchange Commission (SEC) following whistleblower calls for an investigation into potential misconduct related to the company's use of non-disclosure agreements.
OpenAI has framed the GPT-4o mini release as part of its endeavour to make AI "as broadly accessible as possible," but the launch also underscores the intensifying competition among AI cloud providers and the burgeoning interest in small, free, open-source AI models.
Several sources have confirmed that Meta is planning to unveil the largest version of Llama 3, boasting 400 billion parameters, on 23 July, although the release date is subject to change.
The capabilities of this version of Llama 3 remain unclear, but some companies are gravitating toward open-source AI models due to their cost-effectiveness, customisability, and the greater control they offer over both the model and the data it processes.