For the past few years, the AI industry has operated under its own version of Moore's Law: an unwavering belief that the cost of intelligence would perpetually decrease by orders of magnitude each year. Like clockwork, each new model generation promised to be not only more capable but also cheaper to run. Last week, Google quietly broke that trend.

In a move that initially went unnoticed, Google significantly increased the price of its popular Gemini 2.5 Flash model. The input token price doubled from $0.15 to $0.30 per million tokens, while the output price more than quadrupled from $0.60 to $2.50 per million tokens. At the same time, Google introduced a new, less capable model, "Gemini 2.5 Flash Lite", at a lower price point.

This is the first time a major provider has backtracked on the price of an established model. While it may seem like a simple adjustment, we believe this signals a turning point. The industry is no longer on an endless downward slide in cost. Instead, we’ve hit a soft floor on the cost of intelligence, given the current state of hardware and software.

In this article, we’ll break down how LLM providers actually price their services, explore why Google likely made this unprecedented move, and discuss what this new economic reality means for anyone building with AI.

The Price is (Not Always) Right: How LLM API Pricing Really Works

From the outside, LLM pricing seems simple: a flat rate per million input and output tokens. In reality, this is a convenient fiction: a blended average designed to simplify a deeply complex cost structure.

To understand why prices go up, you have to understand the real cost drivers behind the scenes.

The simplest formula for a provider's cost is:

API Price ≈ (Hourly Hardware Cost / Throughput in Tokens per Hour) + Margin

The key variable here is Throughput, which is not a single number. It’s a function of four factors:

Hardware: The raw power of the GPU/TPU (e.g., NVIDIA H100 vs. A100).

Model: The size and architecture of the LL...
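
To make the pricing formula above concrete, here is a minimal Python sketch. The function names, the $2.00 hourly hardware cost, the 1,000 tokens-per-second throughput, and the 1,000-in / 1,000-out request mix are all hypothetical values chosen for illustration, not figures from any provider. Only the Gemini 2.5 Flash list prices ($0.15 to $0.30 input, $0.60 to $2.50 output, per million tokens) come from the article.

```python
def price_per_million_tokens(hourly_hardware_cost: float,
                             tokens_per_hour: float,
                             margin_per_million: float = 0.0) -> float:
    """API Price ~ (Hourly Hardware Cost / Throughput in Tokens per Hour) + Margin.

    The result is scaled to dollars per million tokens, the unit used
    in public price lists.
    """
    if tokens_per_hour <= 0:
        raise ValueError("tokens_per_hour must be positive")
    cost_per_token = hourly_hardware_cost / tokens_per_hour
    return cost_per_token * 1_000_000 + margin_per_million


def request_cost(input_price: float, output_price: float,
                 input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request, given per-million-token list prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical accelerator: $2.00/hour, sustaining 1,000 tokens/second.
tokens_per_hour = 1_000 * 3_600
baseline = price_per_million_tokens(2.00, tokens_per_hour)

# Same hardware at half the throughput: the cost per token doubles.
slower = price_per_million_tokens(2.00, tokens_per_hour / 2)

# Gemini 2.5 Flash list prices from the article, applied to a
# hypothetical request with 1,000 input and 1,000 output tokens.
old_cost = request_cost(0.15, 0.60, 1_000, 1_000)
new_cost = request_cost(0.30, 2.50, 1_000, 1_000)

print(f"baseline: ${baseline:.4f} per million tokens")
print(f"half throughput: ${slower:.4f} per million tokens")
print(f"request cost increase: {new_cost / old_cost:.2f}x")
```

Because hardware cost is divided by throughput, anything that lowers throughput raises the cost per token proportionally. The second helper shows why the headline multipliers understate or overstate the impact for a given workload: the effective increase depends on the ratio of input to output tokens, so an output-heavy workload sits closer to the roughly 4x output increase than to the 2x input increase.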