The "confident idiot" problem: Why AI needs hard rules, not vibe checks

https://news.ycombinator.com/rss

Summary

We have all been there. You build an agent. It works perfectly in the demo. You deploy it. And then, on a Tuesday at 3 PM, it decides that the URL for the API documentation is api.stripe.com/v1/users (a 404), but it looks so plausible that you waste 20 minutes debugging network errors.

Worse, it says this with 100% confidence.

When we try to fix this today, the industry tells us to use “LLM-as-a-Judge.” We are told to ask GPT-4o to grade GPT-3.5. We are told to fix the “vibes.”

But this creates a dangerous circular dependency. If the underlying models suffer from sycophancy (agreeing with the user) or hallucination, a Judge model often hallucinates a passing grade.

We are trying to fix probability with more probability. That is a losing game.

I believe we need to stop treating Agents like magic boxes and start treating them like software. Software has assertions. Software has unit tests. Software has return False.

We need to re-introduce Determinism into the stack.

Don’t ask an LLM if a URL is valid. It will hallucinate a 200 OK. Run requests.get().

Don’t ask an LLM if a SQL query is safe. It will miss subtle injections. Parse the AST.

Don’t ask an LLM if “Springfield” is ambiguous. It will guess Illinois. Check the database count.

If the code says “No,” it doesn’t matter how confident the LLM is. The action is blocked.

I got tired of debugging these errors by reading logs after the fact. I wanted a firewall that would catch these “Confident Idiot” moments in real-time.

So I built Steer.

It isn’t a heavy observability platform. It’s a simple Python library that wraps your agent functions and enforces hard guardrails.

    # The “Steer” way: Hard Rules.
    @capture(verifiers=[
        # 1. Enforce SSN Format
        RegexVerifier(pattern=r"^\d{3}-\d{2}-\d{4}$"),
        # 2. Block Markdown
        JsonVerifier(strict=True)
    ])
    def update_user_profile(data):
        # If the LLM messes up the format, this code never runs.
        # The error is caught, logged, and sent to a dashboard for correction.
        db.update(data)

The most interes...
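
To make the "if the code says No, the action is blocked" idea concrete, here is a minimal standard-library sketch of the pattern. It is not Steer's API: the names `guard`, `Blocked`, `matches`, and `unambiguous_city`, as well as the `cities` table and its rows, are hypothetical and exist only for illustration. It shows two of the deterministic checks described above, a format check via regex and the "Springfield" ambiguity check via a database count.

```python
import functools
import re
import sqlite3


class Blocked(Exception):
    """Raised when a deterministic check rejects the model's output."""


def guard(*checks):
    """Hypothetical decorator: call the function only if every check passes."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(value):
            for check in checks:
                reason = check(value)  # None means the check passed
                if reason is not None:
                    raise Blocked(f"{fn.__name__}: {reason}")
            return fn(value)
        return wrapper
    return decorator


def matches(pattern):
    """Check that the whole value matches the given regex."""
    compiled = re.compile(pattern)

    def check(value):
        if compiled.fullmatch(value):
            return None
        return f"{value!r} does not match {pattern}"
    return check


def unambiguous_city(conn):
    """Check that a city name resolves to exactly one row in the database."""
    def check(name):
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM cities WHERE name = ?", (name,)
        ).fetchone()
        if count == 1:
            return None
        return f"{name!r} matches {count} cities"
    return check


# Illustrative in-memory data; a real system would query its own tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (name TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?)",
    [("Springfield", "IL"), ("Springfield", "MO"), ("Boise", "ID")],
)


@guard(matches(r"\d{3}-\d{2}-\d{4}"))
def store_ssn(ssn):
    return f"stored {ssn}"


@guard(unambiguous_city(conn))
def book_trip(city):
    return f"booked {city}"


try:
    book_trip("Springfield")
except Blocked as exc:
    print(exc)  # book_trip: 'Springfield' matches 2 cities
```

The guarded function body never runs when a check fails, regardless of how confident the model's output sounded. Returning a reason string rather than a bare boolean is a design choice in this sketch so that the blocked call can be logged with an explanation.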

First seen: 2025-12-08 13:25

Last seen: 2025-12-08 20:26


