Claude Code vs. Codex: I built a sentiment dashboard from Reddit comments

https://news.ycombinator.com/rss Hits: 15

Summary

Most benchmarks tell us how AI coding models perform in carefully constructed scenarios. But they don’t tell us what developers actually think when they use these tools every day. That gap is why I built a Reddit sentiment analysis dashboard to see how real engineers compare Claude Code vs Codex in the wild.

You can find the dashboard at https://claude-vs-codex-dashboard.vercel.app/ and the source code at https://github.com/waprin/claude-vs-codex-dashboard

There are options to view sentiment weighted or unweighted by upvotes, and to compare on specific categories like speed, problem solving, and workflows.

In this newsletter edition, I’ll discuss:

While notable AI benchmarks like SWE-bench, PR Arena, TerminalBench, and LMArena help us gauge the quality of AI models, I don’t think any benchmark can truly capture how most software engineers use agentic coding models day-to-day. We don’t typically “set-it-and-forget” the agent on a constructed task; rather, there’s an interactive, back-and-forth conversational session. Furthermore, engineers in the wild face a far greater diversity of tasks than any given benchmark could hope to capture.

For those reasons, I believe a survey of the “wisdom of the crowd” is valuable for gaining a broader understanding of which agentic coding models perform better. To do so, I scraped a wide variety of comments on Reddit from AI-coding focused subreddits such as /r/ChatGPTCoding, /r/ClaudeCode, and /r/Codex. I then used the Claude Haiku model to classify whether each comment directly compared Claude Code and Codex, and classified the sentiment accordingly. (Note: this analysis was done before the new Haiku model that Anthropic announced yesterday.)

Since this post is fairly long, I’ll summarize here:

Overall, Codex has much more positive sentiment than Claude Code in comments that compare the two directly.

However, Claude Code has much more discussion overall, at about 4x the volume of Codex, raising the quest...
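To make the "weighted or unweighted by upvotes" option concrete, here is a minimal Python sketch of one way such a toggle could be computed. It is not taken from the dashboard's source: the `Comment` type, the `preference_share` function, the rule of counting every comment with a weight of at least 1, and the sample data are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    preferred: str  # hypothetical label from the classifier: "codex" or "claude_code"
    upvotes: int    # Reddit score of the comment


def preference_share(comments, tool, weighted=False):
    """Return the share of comments (optionally upvote-weighted) that prefer `tool`."""
    # Assumption: a comment always counts at least once, even with a zero
    # or negative score, so weight = max(upvotes, 1) in weighted mode.
    weights = [max(c.upvotes, 1) if weighted else 1 for c in comments]
    total = sum(weights)
    if total == 0:
        return 0.0
    favourable = sum(w for c, w in zip(comments, weights) if c.preferred == tool)
    return favourable / total


# Made-up sample data, not drawn from the real dataset.
comments = [
    Comment("codex", 40),
    Comment("codex", 2),
    Comment("claude_code", 10),
    Comment("claude_code", 0),
]

unweighted = preference_share(comments, "codex")               # 2 of 4 comments
weighted = preference_share(comments, "codex", weighted=True)  # 42 of 53 weight units
```

In this toy sample the two views disagree: the comment count is split evenly, while the upvote-weighted view leans toward the tool whose supporting comment was heavily upvoted. That kind of divergence is presumably why a dashboard would expose both views.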

First seen: 2025-10-17 22:55

Last seen: 2025-10-18 12:57


