Anthropic's AI Discovers 22 Firefox Security Flaws in Groundbreaking Red Teaming Effort
AI-Powered Vulnerability Research Reaches a New Milestone
In a landmark demonstration of AI's evolving capabilities in cybersecurity, Anthropic's Claude Opus 4.6 identified 22 security vulnerabilities in Mozilla Firefox during a two-week collaborative red teaming effort. Of these, Mozilla classified 14 as high-severity flaws, representing nearly a fifth of all high-severity vulnerabilities remediated in the browser throughout 2025. This partnership signals a dramatic acceleration in the vulnerability discovery process, even for one of the world's most scrutinized open-source projects.
The findings, detailed by Anthropic and corroborated by Mozilla, show AI models transitioning from theoretical security tools to practical, high-output bug hunters. "AI is making it possible to detect severe security vulnerabilities at highly accelerated speeds," Anthropic stated. Mozilla has already shipped fixes for most of these issues in Firefox 148.0, released in late February 2026, protecting hundreds of millions of users.
From Benchmark to Browser: Testing AI on Complex Code
Anthropic's initiative began in late 2025 as an effort to construct a more realistic evaluation for its models beyond standard benchmarks. After noting Claude Opus 4.5 was nearing mastery of the CyberGym benchmark, the team built a dataset of historical Firefox Common Vulnerabilities and Exposures (CVEs). They chose Firefox specifically because its complexity and rigorous security testing presented a formidable challenge.
"Firefox is both a complex codebase and one of the most well-tested and secure open-source projects in the world," explained Logan Graham, head of Anthropic's frontier red team, in a statement to Axios. This made it a harder test than previous open-source software targets. The team first validated Claude's ability to reproduce known historical CVEs in older code.
To rule out the possibility that these bugs were memorized from training data, the critical next step was tasking Claude with finding novel, previously unreported vulnerabilities in the *current* version of Firefox. The team started with the JavaScript engine, a critical component with a wide attack surface, as it processes untrusted code from the web.
Rapid Discovery and a Scaling Partnership
The results were startlingly fast. Within just twenty minutes, Claude Opus 4.6 reported a use-after-free memory vulnerability in the JavaScript engine. After internal validation by multiple researchers, Anthropic submitted its first bug report to Mozilla's Bugzilla tracker, complete with a description and a Claude-authored proposed patch.
"By the time we validated and submitted this first vulnerability... Claude had already discovered fifty more unique crashing inputs," Anthropic noted. A Mozilla researcher soon reached out, leading to a pivotal shift in process. Mozilla encouraged Anthropic to submit all findings in bulk, even unvalidated crash reports, to streamline triage.
This collaborative approach proved highly effective. Ultimately, Anthropic scanned nearly 6,000 C++ files and submitted 112 unique reports over the two-week period. Mozilla mobilized engineering teams in what Chris Grinstead of Mozilla described to Axios as an "incident response" to triage and patch the influx. Of the 112 reports, 22 were security-sensitive bugs warranting CVEs, while the remaining 90 involved non-security crashes or logic errors.
Pushing Limits: From Finding Bugs to Writing Exploits
Anthropic didn't stop at discovery. In a separate, more concerning evaluation, the team tested whether Claude could develop functional exploits for the vulnerabilities it found. The goal was to measure the upper limits of AI's offensive cybersecurity abilities. Researchers gave Claude access to the submitted bugs and tasked it with creating proof-of-concept exploits capable of reading or writing a local file on a target system.
Despite running this test hundreds of times at a cost of approximately $4,000 in API credits, Opus 4.6 succeeded in crafting a crude, working exploit in only two instances. This reveals a crucial insight: current frontier models are significantly better at finding vulnerabilities than at exploiting them. The cost of discovery is also an order of magnitude lower than the cost of developing a functional attack.
However, the fact that Claude succeeded at all in automating exploit development is a warning sign. Anthropic emphasized these were "crude" exploits that worked only in a testing environment with certain browser security features, like the sandbox, intentionally disabled. "Firefox’s 'defense in depth' would have been effective at mitigating these particular exploits," the company stated, but the capability represents a necessary component of a full-chain attack.
Best Practices for an AI-Augmented Security Workflow
The collaboration yielded procedural lessons for integrating AI into the vulnerability disclosure process. Anthropic highlighted the importance of "task verifiers"—trusted tools that allow an AI agent to check its own work in real-time. For bug hunting, this meant automated tools to test if a code input triggers a crash. For "patching agents," verifiers must confirm a fix removes the vulnerability without breaking existing functionality.
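The crash-checking verifier described above can be sketched in a few lines. The harness below is an illustrative assumption, not Anthropic's actual tooling: it runs a target program on a candidate input and treats death-by-signal (such as a segmentation fault) as a confirmed crash.

```python
import subprocess

def triggers_crash(cmd: list[str], timeout: float = 10.0) -> bool:
    """Return True if running `cmd` causes the target to die from a signal.

    On POSIX systems, subprocess.run reports a signal death as a negative
    returncode (e.g. -11 for SIGSEGV), the usual symptom of the
    memory-safety bugs an automated bug-hunting agent is looking for.
    This is a hypothetical sketch, not Anthropic's or Mozilla's tooling.
    """
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False  # a hang is a different bug class, not a crash
    return result.returncode < 0
```

An agent can call a verifier like this after every candidate input it generates, discarding non-crashing inputs automatically so that only confirmed crashes ever reach a human triager.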
Mozilla identified three key components that built trust in Anthropic's AI-generated reports: clear evidence of the bug's trigger, demonstration of its security impact, and reproducibility. Anthropic encourages other researchers using LLM-powered tools to include similar evidence. The company has also published its Coordinated Vulnerability Disclosure principles, adhering to current industry norms while acknowledging these may need to evolve with AI capabilities.
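The three evidence components map naturally onto a structured report format. The dataclass below is a hypothetical sketch of such a structure; the field names are illustrative, not Mozilla's or Anthropic's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class VulnReport:
    """Hypothetical report structure carrying the three kinds of evidence
    Mozilla cited as trust-building: trigger, impact, reproducibility."""
    title: str
    trigger: str           # the input or call sequence that hits the bug
    security_impact: str   # e.g. "heap use-after-free in the JS engine"
    repro_steps: list = field(default_factory=list)  # exact commands to rerun

    def is_complete(self) -> bool:
        # A triager should be able to see, assess, and rerun the bug.
        return bool(self.trigger and self.security_impact and self.repro_steps)
```

A completeness check like this could gate submission, so an LLM-powered pipeline only files reports a human triager can immediately act on.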
The Defender's Window and What Comes Next
This case study arrives alongside Anthropic's rollout of Claude Code Security, an automated code security testing tool now in limited research preview. The tool's announcement briefly rattled cybersecurity stocks, according to Axios, underscoring the market's recognition of AI's disruptive potential in security. The Mozilla partnership provides a model for how open-source maintainers can handle the increased volume and plausibility of AI-generated bug reports.
Anthropic stresses that, for now, AI gives defenders a distinct advantage. "Opus 4.6 is currently far better at identifying and fixing vulnerabilities than at exploiting them," the company concluded. However, they warn the gap between discovery and exploitation abilities is unlikely to last long given the rate of AI progress.
The urgency is clear. Anthropic is using Claude to audit other critical software, including the Linux kernel, and plans to expand its cybersecurity efforts. The message to developers is to "redouble their efforts to make their software more secure" during this current window. As AI models continue to advance, the collaboration between AI researchers and software maintainers, exemplified by Anthropic and Mozilla, will be critical in hardening the digital ecosystem against increasingly sophisticated automated threats.