Is GPT-5 really worse than GPT-4o? Ars puts them to the test.

https://arstechnica.com/feed/
Summary

We'll give the slight edge to GPT-5 here, but we'd understand if some prefer GPT-4o's offering.

Public figures

Prompt: Give me a short biography of Kyle Orland

GPT-5 gives a short bio of your humble author. OpenAI / ArsTechnica
GPT-5's bio, continued. OpenAI / ArsTechnica
GPT-4o's attempt at a quick Orland bio. OpenAI / ArsTechnica

Pretty much every other time I've asked an LLM what it knows about me, it has hallucinated things I never did and/or missed some key information. GPT-5 is the first instance I've seen where this has not been the case. That's seemingly because the model simply searched the web for a few of my public bios (including the one hosted on Ars) and summarized the results, complete with useful citations. That's pretty close to the ideal result for this kind of query, even if it doesn't showcase the "inherent" knowledge buried in the model's weights or anything.

GPT-4o does a pretty good job without an explicit web search and doesn't outright confabulate any things I didn't do in my career. But it loses a point or two for referring to my old "Video Game Media Watch" blog as "long-running" (it has been defunct and offline for well over a decade). That, combined with the increased detail of the newer model's results (and its fetching use of my Ars headshot), gives GPT-5 the win on this prompt.

Difficult emails

Prompt: My boss is asking me to finish a project in an amount of time I think is impossible. What should I write in an email to gently point out the problem?

GPT-5 helps me craft a delicate email to my boss. OpenAI / ArsTechnica
GPT-4o lays it out for the boss. OpenAI / ArsTechnica

First seen: 2025-08-15 18:22

Last seen: 2025-08-19 18:58