This is an intervention (but mostly a rant). Maybe it's for you, maybe it's for me; perhaps it's for both of us. Regardless, it's time we faced reality and stopped pretending we're already living in the year 3000. We're being conned by Big LLM, and I'm here to call them out.

What is this "con," you might be wondering? The con, as I call it, is that Big LLM is actively perpetuating the myth that running LLMs yourself is practical and useful. This is a lie. More specifically, my claim is that Big LLM is purposefully obfuscating the cost of running LLMs, because if people actually knew the cost, they'd riot. Okay, they wouldn't riot. But they'd definitely stop believing all the hype surrounding AI, LLMs, and especially open-source models.

In fact, I'm willing to go a step further and say you can't actually run them. Now sure, you can run a quantized or lower-parameter model that's supposedly jUsT aS gOod as the full model (not that the full model is very good to begin with) in your dumb-ass AI recipe generator side project you vibe-coded last weekend. But you can't run a full model; you simply can't afford to, because GPU compute is way too expensive. And if you've made the side project I'm talking about, like I have, is it really worth $5,000/month (back-of-envelope math at the end of this post) to generate the most basic-bitch lasagna recipe you're not even gonna actually cook anyway? We both know the answer is "no." As the adage goes: show me someone who claims they're running a low-parameter model effectively, and I'll show you a liar.

Speaking of the differences between the full models and their supposedly more economical counterparts, have you noticed that the performance benchmarks published by Big LLM are always for the full-parameter versions of models? For example, look at this image from Google about its new Gemma 3 model's performance:

[Image: Google's Gemma 3 performance benchmarks]

The tagline for this shiny new shit is "the most capable model you can run on a single GPU or TPU." Like other LLMs, Gemma 3 comes in a variety of quantization and parameter sizes, but t...
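About that $5,000/month figure: here's the napkin math. This is a sketch, not a quote. The ~$3 per 80 GB GPU-hour rate is my assumption (roughly on-demand cloud pricing for a datacenter-class card), and I'm counting fp16 weights only, with zero headroom for KV cache, activations, or batching, so if anything this understates the real bill.

```python
import math

# Napkin math for self-hosting a "full" model 24/7.
# Assumptions (mine, not Big LLM's): ~$3 per 80 GB GPU-hour on-demand,
# fp16 weights at 2 bytes/parameter, and no headroom for KV cache,
# activations, or batching -- so this UNDERSTATES the real bill.

def monthly_cost(params_billions: float,
                 usd_per_gpu_hour: float = 3.0,
                 gpu_vram_gb: int = 80,
                 hours_per_month: int = 730) -> float:
    weights_gb = params_billions * 2                      # fp16 = 2 bytes/param
    gpus_needed = max(1, math.ceil(weights_gb / gpu_vram_gb))
    return gpus_needed * usd_per_gpu_hour * hours_per_month

for size in (27, 70, 405):
    print(f"{size}B params: ~${monthly_cost(size):,.0f}/month")
# 27B params: ~$2,190/month
# 70B params: ~$4,380/month    <- there's your $5,000 lasagna
# 405B params: ~$24,090/month
```

Even a mid-sized 70B model, running around the clock, lands within spitting distance of $5,000/month before you've served a single recipe.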