Gemini 2.5 Pro vs. Claude 3.7 Sonnet: Coding Comparison

https://news.ycombinator.com/rss Hits: 9
Summary

Google just launched Gemini 2.5 Pro on March 26th, claiming it is the best model at coding, reasoning, and just about everything else. What I mostly care about is how it compares against the best available coding model, Claude 3.7 Sonnet (thinking), released at the end of February, which I have been using with great results. Let’s compare the two and see whether I need to change my favourite coding model or whether Claude 3.7 still holds.

TL;DR

If you want to jump straight to the conclusion, I’d say go for Gemini 2.5 Pro: it’s better at coding, has a one-million-token context window compared to Claude’s 200k, and you can get it for free (a big plus). Claude 3.7 Sonnet is not that far behind, but at this point there’s no point in using it over Gemini 2.5 Pro.

Just an article ago, Claude 3.7 Sonnet was the default answer in every model comparison, and it stayed that way for quite some time. But here you go: Gemini 2.5 Pro takes the lead.

Brief on Gemini 2.5 Pro

Gemini 2.5 Pro, an experimental thinking model, became the talk of the town within a week of its release. Everyone’s talking about it on Twitter (X) and YouTube; it’s trending everywhere. It is the first model from Google to receive such fanfare, and it went straight to #1 on LMArena. What does that mean? It means this model is beating all the other models in coding, math, science, image understanding, and other areas.

Gemini 2.5 Pro comes with a 1 million token context window, with a 2 million token context window coming soon. 🤯

You can check out other folks like Theo-t3 talking about this model to get a bit more insight into it:

[Video]

It is the best coding model to date, with an accuracy of about 63.8% on SWE-bench. That is higher than our previous top coding model, Claude 3.7 Sonnet, which had an accuracy of about 62.3%.

This is a quick demo Google shared of building a dinosaur game with this model:

[Video]

Here’s a quick benchmark of this model on...

First seen: 2025-03-31 13:42

Last seen: 2025-03-31 21:43