AI coding assistants chase phantoms, destroy real user data

https://arstechnica.com/feed/
Summary

The episode began when anuraag asked Gemini CLI to rename the current directory from "claude-code-experiments" to "AI CLI experiments" and move its contents to a new folder called "anuraag_xyz project." Gemini correctly identified that it couldn't rename its current working directory, a reasonable limitation. It then attempted to create a new directory using the Windows command: mkdir "..\anuraag_xyz project"

This command apparently failed, but Gemini's system processed it as successful. With the AI model's internal state now tracking a non-existent directory, it proceeded to issue move commands targeting this phantom location. When you move a file to a non-existent directory in Windows, the move command renames the file to the destination name instead of moving it. Each subsequent move command executed by the AI model overwrote the previous file, ultimately destroying the data.

"Gemini hallucinated a state," anuraag wrote in their analysis. The model "misinterpreted command output" and "never did" perform verification steps to confirm its operations succeeded. "The core failure is the absence of a 'read-after-write' verification step," anuraag noted. "After issuing a command to change the file system, an agent should immediately perform a read operation to confirm that the change actually occurred as expected."

Not an isolated incident

The Gemini CLI failure happened just days after a similar incident with Replit, an AI coding service that allows users to create software using natural language prompts. According to The Register, SaaStr founder Jason Lemkin reported that Replit's AI model deleted his production database despite explicit instructions not to change any code without permission. Lemkin had spent several days building a prototype with Replit, accumulating over $600 in charges beyond his monthly subscription.
"I spent the other [day] deep in vibe coding on Replit for the first time—and I built a prototype in just a few hours that was pretty, pretty cool...

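The "read-after-write" verification step that anuraag describes for the Gemini CLI incident can be illustrated with a minimal Python sketch. This is not Gemini CLI's code or anuraag's proposed fix; the names make_dir_verified, move_verified, and VerificationError are hypothetical, chosen only for illustration. The idea is simply that every change to the file system is followed by a read that confirms the change happened, and that a move refuses to proceed when the destination directory is missing or the target name is already taken.

```python
import shutil
from pathlib import Path


class VerificationError(RuntimeError):
    """Hypothetical error: the file system does not match the expected state."""


def make_dir_verified(path: Path) -> Path:
    """Create a directory, then read back to confirm it really exists."""
    path.mkdir(parents=True, exist_ok=True)
    if not path.is_dir():
        raise VerificationError(f"{path} was not created")
    return path


def move_verified(src: Path, dest_dir: Path) -> Path:
    """Move src into dest_dir without overwriting, then confirm the result."""
    if not dest_dir.is_dir():
        # Without this check, a move to a missing directory can turn into
        # a rename of the file to the destination name.
        raise VerificationError(f"destination {dest_dir} is not a directory")
    target = dest_dir / src.name
    if target.exists():
        raise VerificationError(f"{target} already exists; refusing to overwrite")
    shutil.move(str(src), str(target))
    # Read after write: the file must be at the target and gone from the source.
    if not target.exists() or src.exists():
        raise VerificationError(f"move of {src} to {target} could not be confirmed")
    return target
```

With checks like these, the sequence in the incident would stop at the first move command, because the destination directory was never created, and no file would be renamed over another.
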
First seen: 2025-07-24 21:03

Last seen: 2025-07-27 09:22