Researchers question Anthropic claim that AI-assisted attack was 90% autonomous

https://arstechnica.com/feed/

Summary

Claude frequently overstated findings and occasionally fabricated data during autonomous operations, claiming to have obtained credentials that didn’t work or identifying critical discoveries that proved to be publicly available information. This AI hallucination in offensive security contexts presented challenges for the actor’s operational effectiveness, requiring careful validation of all claimed results. This remains an obstacle to fully autonomous cyberattacks.

How (Anthropic says) the attack unfolded

Anthropic said GTG-1002 developed an autonomous attack framework that used Claude as an orchestration mechanism that largely eliminated the need for human involvement. This orchestration system broke complex multi-stage attacks into smaller technical tasks such as vulnerability scanning, credential validation, data extraction, and lateral movement.

“The architecture incorporated Claude’s technical capabilities as an execution engine within a larger automated system, where the AI performed specific technical actions based on the human operators’ instructions while the orchestration logic maintained attack state, managed phase transitions, and aggregated results across multiple sessions,” Anthropic said. “This approach allowed the threat actor to achieve operational scale typically associated with nation-state campaigns while maintaining minimal direct involvement, as the framework autonomously progressed through reconnaissance, initial access, persistence, and data exfiltration phases by sequencing Claude’s responses and adapting subsequent requests based on discovered information.”

The attacks followed a five-phase structure that increased AI autonomy through each one.

The life cycle of the cyberattack, showing the move from human-led targeting to largely AI-driven attacks using various tools, often via the Model Context Protocol (MCP). At various points during the attack, the AI returns to its human operator for review and further direction. Credit: Anthropic

The...

First seen: 2025-11-14 12:51

Last seen: 2025-11-17 15:46


