We asked four AI coding agents to rebuild Minesweeper

https://arstechnica.com/feed/ Hits: 20

Summary

Overall: 7/10 The lack of chording is a big omission, but the strong presentation and Power Mode options give this effort a passable final score. Agent 4: Google Gemini CLI Play it for yourself So… where’s the game? Credit: Benj Edwards Implementation, presentation, etc. Gemini CLI did give us a few grey boxes you can click, but the playfields are missing. While interactive troubleshooting with the agent may have fixed the issue, as a “one-shot” test, the model completely failed. Coding experience Of the four coding agents we tested, Gemini CLI gave Benj the most trouble. After developing a plan, it was very, very slow at generating any usable code (about an hour per attempt). The model seemed to get hung up attempting to manually create WAV file sound effects and insisted on requiring React external libraries and a few other overcomplicated dependencies. The result simply did not work. Benj actually bent the rules and gave Gemini a second chance, specifying that the game should use HTML5. When the model started writing code again, it also got hung up trying to make sound effects. Benj suggested using the WebAudio framework (which the other AI coding agents seemed to be able to use), but the result didn’t work, which you can see at the link above. Unlike the other models tested, Gemini CLI apparently uses a hybrid system of three different LLMs for different tasks (Gemini 2.5 Flash Lite, 2.5 Flash, and 2.5 Pro were available at the level of the Google account Benj paid for). When you’ve completed your coding session and quit the CLI interface, it gives you a readout of which model did what. In this case, it didn’t matter because the results didn’t work. But it’s worth noting that Gemini 3 coding models are available for other subscription plans that were not tested here. For that reason, this portion of the test could be considered “incomplete” for Google CLI. Overall: 0/10 (Incomplete) Final verdict OpenAI Codex wins this one on points, in no small part because it ...

First seen: 2025-12-19 18:18

Last seen: 2025-12-20 13:28

Read Full Article More from this Source

We asked four AI coding agents to rebuild Minesweeper—the results were explosive

Summary

Related News

ByteDance confirms TikTok will be controlled by US owners

The Chevrolet Equinox EV is high on comfort and convenience

Two space startups prove you don’t need to break the bank to rendezvous in space

Trump’s energy secretary orders a Washington state coal plant to remain open

Instacart agrees to refund subscribers $60 million in FTC settlement