What even is a small language model now?

Summary

If you asked someone in 2018 what a "small model" was, they'd probably say something with a few million parameters that ran on a Raspberry Pi or your phone. Fast-forward to today, and we're calling 30B-parameter models "small" because they only need one GPU to run. So yeah, the definition of "small" has changed.

Small Used to Mean... Actually Small

Back in the early days of machine learning, a "small model" might've been a decision tree or a basic neural net that could run on a laptop CPU. Think scikit-learn, not large language models (LLMs). Then came transformers and LLMs. As these got bigger and better, anything not requiring a cluster of A100s suddenly started to feel... small by comparison. Today, "small" is more about how deployable the model is, not just its parameter count on paper.

Types of Small Models (By 2025 Standards)

We now have two main flavors of small language models:

1. Edge-Optimized Models
These are models you can run on mobile devices or edge hardware. They're optimized for speed, low memory, and offline use.
Examples: Phi-3-mini (3.8B), Gemma 2B, TinyLlama (1.1B)
Use cases: voice assistants, translation on phones, offline summarization, chatbots embedded in apps

2. GPU-Friendly Models
These still require a GPU, but just one GPU, not a whole rack. In this category, even 30B or 70B models can qualify as "small".
Examples: Meta Llama 3 70B (quantized), MPT-30B
Use cases: internal RAG pipelines, chatbot endpoints, summarizers, code assistants

The fact that you can now run a 70B model on a single 4090 and get decent throughput? That would've been science fiction a few years ago. (A rough quantized-loading sketch follows at the end of this summary.)

Specialization: The Real Power Move

One big strength of small models is that they don't need to do everything. Unlike GPT-4 or Claude, which try to be general-purpose brains, small models are often narrow and optimized. That gives them a few key advantages:
They stay lean: no need to carry weights for tasks they'll never do.
They're more ac...
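To make the "quantized 70B on one GPU" claim concrete, here is a minimal sketch of loading a causal LM with 4-bit quantization using Hugging Face transformers plus bitsandbytes. The model id, prompt, and generation settings are placeholders (Llama 3 70B is gated), and this is an illustration of the technique, not a benchmark of single-4090 throughput.

    # Sketch: 4-bit quantized loading of a large causal LM on a single GPU.
    # Assumes transformers, accelerate, and bitsandbytes are installed; the
    # model id below is a placeholder you'd swap for whatever you have access to.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "meta-llama/Meta-Llama-3-70B-Instruct"

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                      # store weights in 4-bit NF4
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to bf16 for matmuls
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",  # place layers on the GPU, spill to CPU if they don't fit
    )

    prompt = "Explain in one sentence why 4-bit quantization shrinks a model's memory footprint."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(out[0], skip_special_tokens=True))

Back-of-the-envelope: 70B parameters at roughly half a byte each is about 35 GB of weights before activations and KV cache, which is why quantization (and sometimes CPU offload) is what makes the single-GPU story possible at all.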

First seen: 2025-05-24 14:41

Last seen: 2025-05-24 18:41