AI Model Claude 3.7 Sonnet Takes on Pokémon Red in Live Experiment
Artificial intelligence is proving its ability to navigate the digital world in unexpected ways.
The latest version of Anthropic’s Claude, known as Claude 3.7 Sonnet, is playing “Pokémon Red”, and not just pressing random buttons.
It’s actively strategising, making decisions, and adapting to challenges in real time.
Anthropic’s AI model, streaming live on Twitch under the channel “ClaudePlaysPokemon,” has already defeated three Gym Leaders, a feat that previous Claude models struggled to achieve.
This experiment showcases how AI is advancing beyond simple task execution and into more complex problem-solving scenarios.
AI Learns to Battle, Adapt, and Overcome Obstacles
Unlike its predecessor, Claude 3.5 Sonnet, which failed to exit the player’s home in “Pokémon Red”, Claude 3.7 Sonnet has shown noticeable progress.
Within hours, it defeated Brock, the first Gym Leader, and just days later, overcame Misty.
Anthropic explained that this success comes from the model’s ability to keep notes, observe the game screen, and use function calls to interact with the game.
Rather than following behaviours trained specifically for this game, Claude 3.7 Sonnet processes each situation, plans ahead, and adjusts when necessary, though it is not without its struggles.
At one point, Claude became stuck in front of a rock wall, continuously attempting to move through it.
It took time before the AI recognised an alternative route.
A user on Twitch gave a humorous take on the situation:
“Who would win, a computer AI with thousands of hours put into programming it, or 1 rock wall?”
Eventually, Claude figured out a way around the obstacle, demonstrating its capacity to learn from mistakes rather than simply repeating failed actions indefinitely.
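The three ingredients Anthropic describes (keeping notes, observing the screen, and issuing function calls) can be pictured as a simple loop. The Python sketch below is purely illustrative and is not Anthropic's implementation: the `Agent` class, the `press_button` and `write_note` call names, and the `toy_policy` function standing in for the model are all hypothetical, and the game screen is reduced to a short text description.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical button set for a Game Boy style game.
BUTTONS = {"up", "down", "left", "right", "a", "b", "start", "select"}


@dataclass
class Agent:
    # choose_action stands in for the model: it sees the screen and the
    # notes so far, and returns one function call as a dictionary.
    choose_action: Callable[[str, List[str]], Dict[str, str]]
    notes: List[str] = field(default_factory=list)

    def step(self, screen: str, press: Callable[[str], None]) -> None:
        call = self.choose_action(screen, self.notes)
        if call["name"] == "press_button" and call["button"] in BUTTONS:
            press(call["button"])            # act on the game
        elif call["name"] == "write_note":
            self.notes.append(call["text"])  # remember something for later
        # Any other call is ignored in this sketch.


def toy_policy(screen: str, notes: List[str]) -> Dict[str, str]:
    """Toy stand-in for the model: avoid walking into the same wall twice."""
    blocked = "wall ahead" in screen
    if blocked and "up is blocked" not in notes:
        return {"name": "write_note", "text": "up is blocked"}
    if "up is blocked" in notes:
        return {"name": "press_button", "button": "left"}
    return {"name": "press_button", "button": "up"}


def run(screens: List[str], policy) -> Tuple[List[str], List[str]]:
    """Feed a sequence of screens to the agent; return presses and notes."""
    agent = Agent(policy)
    pressed: List[str] = []
    for screen in screens:
        agent.step(screen, pressed.append)
    return pressed, agent.notes


pressed, notes = run(["open path", "wall ahead", "wall ahead"], toy_policy)
print(pressed)  # ['up', 'left']
print(notes)    # ['up is blocked']
```

In this toy run the agent walks up, records that the way is blocked, and then tries a different direction instead of repeating the failed move. A real system would replace `toy_policy` with calls to the model and `press` with an emulator interface.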
AI Playing Video Games Is Becoming a Research Benchmark
AI models playing video games isn’t a new concept, but it remains a valuable way to test their reasoning abilities.
In March 2024, researchers used OpenAI’s ChatGPT to play the classic first-person shooter “Doom”, successfully navigating to the final room of the game.
Around the same time, Google DeepMind introduced its Scalable Instructable Multiworld Agent (SIMA), capable of playing games like “No Man’s Sky”, “Teardown”, and “Valheim” using only on-screen images and natural-language instructions, with no access to source code or special APIs required.
Unlike simple rule-based automation, these AI models demonstrate a level of general reasoning.
Anthropic noted that “Pokémon Red” was a particularly useful test for Claude 3.7 Sonnet, as it required the model to solve puzzles and make strategic decisions rather than just responding to direct commands.
A Throwback to Twitch Plays Pokémon, but With an AI Player
For many, watching Claude play “Pokémon Red” brings back memories of “Twitch Plays Pokémon”, the 2014 online social experiment where thousands of players collectively controlled the game through chat commands.
The chaotic, collaborative nature of that event turned it into a cultural phenomenon.
Now, instead of a community working together, viewers watch an AI struggle through a solo adventure.
The experience has a different feel—more observational than interactive.
Claude’s careful, step-by-step approach contrasts sharply with the erratic, crowd-driven gameplay of the original Twitch Plays Pokémon.
One particularly amusing moment occurred when Claude, while searching for Professor Oak, repeatedly interacted with the wrong NPC despite having spoken to them several times before.
Some viewers grew impatient, while others were more understanding:
“Guys chill. Before we exited and entered Oak’s lab like 10 times before understanding how to move on.”
This isn’t the first attempt to have an AI play Pokémon.
Back in October 2023, Seattle software engineer Peter Whidden shared a YouTube video showing how he taught a reinforcement learning AI to play Pokémon.
The AI spent more than 50,000 hours figuring out the game, but along the way, it got a little distracted—sometimes stopping just to admire the pixelated scenery instead of actually playing.
Even though Claude 3.7 Sonnet takes a slow and steady approach, its progress in “Pokémon Red” offers a glimpse into the future of AI.
It shows how models might evolve to tackle new challenges by thinking through problems one step at a time, rather than just being trained for specific tasks.