Launch HN: Design Arena (YC S25) – Head-to-head AI benchmark for aesthetics

https://news.ycombinator.com/rss Hits: 15
Summary

Hi HN, I’m Grace from Design Arena (https://www.designarena.ai/) - we’re building a crowdsourced benchmark for AI-generated visuals (websites, images, video, and more). We put AI models and builder tools in head-to-head comparisons that get voted on by real users from around the world. Think “Hot or Not” for the AI era :)

(Btw, when we say real users we mean real users, so you may get a captcha on the site. Sorry, but we have to use every bot protection available! We only want human ratings, for obvious reasons.)

Here’s a demo video: https://www.youtube.com/watch?v=vPyEQnuVgeI

We didn’t set out to build this - we were actually working on an AI game engine. But we found that models sucked at look-and-feel. Even when the output code was usually functional, most visual aspects lacked the soul that makes great graphics feel alive.

So we built a this-or-that game, just for ourselves, to figure out which generated outputs had the best graphics. To our surprise, that turned out to be more exciting than the original idea - it turns out this is a widespread problem! We did a Show HN a month ago (https://news.ycombinator.com/item?id=44542578) and that was partly what convinced us to make this benchmark thing our actual product.

State-of-the-art models might be winning IMO gold, but they are still putting white text on a white background. There needs to be some measurement of what’s good and what isn’t (yes, there is such a thing as good design!), and it sure isn’t going to come from LLMs.

We come from engineering backgrounds (Apple and Nvidia) with a love for design; we know when we like or dislike something, even when we can’t say why. This-or-that / hot-or-not games are made for domains like this: Design Arena’s goal is to make everything stupidly simple so humans can just do the easy part: like-vs.-dislike. Which also turns out to be the valuable part, because what’s easiest for humans is actually the part that the AIs can’t currently do.

Since our Show HN, we’ve extended our initi...

First seen: 2025-08-12 16:54

Last seen: 2025-08-13 06:56