
Anthropic Exposes First Large-Scale Cyberattack Powered by AI Automation

Anthropic disclosed that a hacking group, which it suspects to be Chinese state-sponsored, conducted the first documented large-scale cyberattack executed with minimal human intervention, using the company’s Claude Code tool to automate 80 to 90 percent of the campaign.

The attackers targeted approximately 30 organizations, spanning major technology firms, financial institutions, chemical manufacturers, and government agencies. Most attacks were blocked, but the campaign succeeded in a small number of cases. Claude Code carried out reconnaissance, vulnerability testing, credential harvesting, and data exfiltration largely autonomously, with human operators needed only at critical decision points.

You can read Anthropic’s full report here.

Attack Methods and AI Manipulation

The hackers bypassed Claude’s safety guardrails through sophisticated social engineering. They deceived the AI system by claiming to be employees of a legitimate cybersecurity firm conducting defensive testing, and they broke their operations into small, seemingly innocent tasks that gave Claude only incomplete context about the overall malicious purpose.

Claude Code inspected target organizations’ systems to identify high-value databases, performing this reconnaissance faster than human hackers could, and researched and wrote custom exploit code to test security vulnerabilities. The system harvested usernames and passwords for further network access, then extracted and categorized private data by intelligence value. The attackers could launch the campaign with essentially the click of a button, after which the AI operated largely on its own at speeds impossible for human teams, at its peak making thousands of requests, often multiple per second.
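That request volume is itself a defensive signal. Below is a minimal, hypothetical sketch of how a defender might flag machine-speed usage in an API request log; the log format, threshold, and function names are illustrative assumptions, not Anthropic’s actual tooling.

```python
# Hypothetical sketch: flag request bursts too fast for a human operator.
# The one-second window and cutoff are illustrative assumptions.
from collections import deque
from datetime import datetime, timedelta

WINDOW = timedelta(seconds=1)
HUMAN_PLAUSIBLE_MAX = 5  # requests per window; assumed cutoff, not a real policy

def flag_machine_speed(timestamps):
    """Return True if any one-second window holds more requests
    than a human operator could plausibly issue."""
    window = deque()
    for ts in sorted(timestamps):
        window.append(ts)
        # Evict requests that have fallen out of the sliding window.
        while ts - window[0] > WINDOW:
            window.popleft()
        if len(window) > HUMAN_PLAUSIBLE_MAX:
            return True
    return False

# Example: 20 requests spread over half a second are flagged immediately.
start = datetime(2025, 9, 15)
burst = [start + i * timedelta(milliseconds=25) for i in range(20)]
print(flag_machine_speed(burst))  # True
```

A production system would score rates probabilistically across many windows and accounts rather than applying a single hard cutoff, but the underlying signal is the same: sustained multi-request-per-second activity is inconsistent with a human at a keyboard.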


Detection and Company Response

Anthropic detected the attack in mid-September 2025 and launched an investigation immediately. Within 10 days, the company shut down the group’s access to Claude, contacted affected organizations, and notified law enforcement. The company has since expanded its detection capabilities and is developing additional methods to investigate and detect large-scale, distributed attacks.

This incident follows earlier misuse cases documented by Anthropic in 2025. In August, the company’s Threat Intelligence Report detailed a data extortion operation, tracked as GTG-2002, that used Claude Code to commit large-scale data theft targeting at least 17 organizations across healthcare, emergency services, government, and religious institutions. The actor demanded ransoms exceeding $500,000, threatening to expose stolen data rather than encrypting it with traditional ransomware.

Anthropic’s detection infrastructure relies on multiple layered techniques, including behavioral analysis to monitor usage patterns across millions of API requests, anomaly detection to identify sequences of operations inconsistent with legitimate use, and pattern matching to recognize known and novel manipulation techniques. The company employs specialized classifiers that analyze user inputs for potentially harmful requests and evaluate Claude’s responses before or after delivery.
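To make that layering concrete, here is a minimal sketch of how pattern matching and anomaly scoring might feed a single escalation decision. The rules, scores, and threshold are illustrative assumptions, not Anthropic’s actual classifiers.

```python
# Hypothetical sketch of layered screening: a pattern-matching layer plus a
# crude anomaly layer, combined into one escalation decision. All rules and
# thresholds here are assumptions for illustration only.
import re

SUSPICIOUS_PATTERNS = [
    re.compile(r"dump\s+credential", re.I),
    re.compile(r"exfiltrat", re.I),
]

def pattern_score(text: str) -> float:
    """Layer 1: count matches against known abuse indicators."""
    return float(sum(bool(p.search(text)) for p in SUSPICIOUS_PATTERNS))

def anomaly_score(request_sequence: list[str]) -> float:
    """Layer 2: crude proxy for sequences inconsistent with legitimate use,
    e.g. many small, disjoint tasks issued in rapid succession."""
    short_tasks = sum(1 for r in request_sequence if len(r.split()) < 8)
    return short_tasks / max(len(request_sequence), 1)

def should_escalate(request_sequence: list[str], threshold: float = 1.0) -> bool:
    """Combine both layers; escalate to human review above the threshold."""
    total = anomaly_score(request_sequence) + sum(
        pattern_score(r) for r in request_sequence
    )
    return total >= threshold

reqs = ["scan subnet", "list open ports", "dump credentials from host"]
print(should_escalate(reqs))  # True: all short tasks plus a credential pattern
```

In practice the anomaly layer would be a trained model over session-level features rather than a word count; the point of the sketch is that no single layer decides alone, which matters against attackers who deliberately fragment their requests.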

Industry Implications

The campaign involved an unprecedented level of AI autonomy and marks what security experts view as a turning point in cyber espionage. The ability of AI systems to conduct sophisticated attacks at machine speed with minimal human supervision poses new challenges for cybersecurity defenders.

Anthropic’s disclosure comes as AI companies face mounting pressure to prevent malicious use of their models. The company maintains a comprehensive threat intelligence and safeguards program to detect and counter misuse of Claude, and it documented several security incidents throughout 2025. In March, it identified an influence-as-a-service operation that used Claude to automate engagement with tens of thousands of social media accounts across multiple countries and languages.

The incident underscores the growing sophistication of AI-powered tools and the challenges of preventing their abuse while maintaining utility for legitimate users. Anthropic has banned the associated accounts and continues to enhance its detection and mitigation capabilities to address the evolving threat landscape.
