Future AI bills of $100k/yr per dev

https://news.ycombinator.com/rss Hits: 3
Summary

Kilo just broke through the barrier of 1 trillion tokens a month on OpenRouter for the first time. Each of the open source family of AI coding tools (Cline, Roo, Kilo) is growing rapidly this month.

Part of this growth is caused by Cursor and Claude starting to throttle their users. We wrote about Cursor at the beginning of July and about Claude in the second half of July. Their throttling sent users to the open source family of AI coding tools, causing the increases you see in the graphs above. Cursor and Claude needed to throttle because the industry made a flawed assumption.

The industry expected that because raw inference costs were coming down fast, application inference costs would come down fast as well. This assumption was wrong. Raw inference costs did decrease by 10x year-over-year, and that expectation led startups to bet on a business model in which companies could afford to sell subscriptions at significant losses, knowing they'd achieve healthy margins as costs plummeted.

Cursor's Ultra plan exemplified this approach: charge users $200 while providing at least $400 worth of tokens, essentially operating at a -100% gross margin. The bet was that by the following year, application inference would cost 90% less, so the same $400 of tokens would cost $40, creating a $160 gross profit (+80% gross margin). But this didn't happen. Instead of declining, application inference costs actually grew!

Application inference costs increased for two reasons: frontier model costs per token stayed constant, and token consumption per application grew a lot. We'll first dive into the reasons for the constant token price for frontier models and end by explaining the token consumption per application.

The price per token for frontier models stayed constant because of the increasing size of models and more test-time scaling. Test-time scaling, also called long thinking, is the third way to scale AI as shown in the graphic below. While the pre- and post-training scaling influenced only the training c...
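
The Ultra plan arithmetic in the summary can be checked with a few lines of Python. This is only a sketch: the $200 price, the $400 token cost, and the 90% expected decline come from the text, while the function and variable names are made up for illustration, and gross margin is assumed to mean (price - cost) / price.

```python
def gross_margin(price: float, cost: float) -> float:
    """Gross margin as a fraction of revenue: (price - cost) / price."""
    return (price - cost) / price


PRICE = 200.0          # Ultra plan price, per the summary
TOKEN_COST = 400.0     # tokens provided per subscriber ("at least $400")
EXPECTED_DROP = 0.90   # cost decline the bet assumed for the following year

# Margin at launch: selling $400 of tokens for $200.
today = gross_margin(PRICE, TOKEN_COST)

# Margin the bet expected a year later, had costs fallen by 90%.
expected_cost = TOKEN_COST * (1 - EXPECTED_DROP)
expected = gross_margin(PRICE, expected_cost)

print(f"margin today:    {today:.0%}")      # -100%
print(f"expected cost:   ${expected_cost:.0f}")
print(f"expected margin: {expected:.0%}")   # 80%
```

If the cost per subscriber stays at $400 or grows instead of falling, the same function shows the margin staying at or below -100%, which is the situation the summary describes.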

First seen: 2025-08-11 18:50

Last seen: 2025-08-11 20:50