DARPA bug-hunting contest produces open-source AI tools that find flaws in Linux, Android and critical infrastructure software

AI tools built for a DARPA competition have found 83 vulnerabilities across 30-plus projects, including Android, Linux, SQLite, and Redis. Critical infrastructure operators have been slow to adopt them despite lower costs than commercial alternatives.

Published on: May 19, 2026
Open-Source AI Tools From DARPA Contest Are Already Finding Critical Bugs

Security researchers who competed in the Defense Advanced Research Projects Agency's AI Cyber Challenge have spent the past several months deploying their systems to find and fix vulnerabilities in critical software. The tools they built are now discovering serious flaws in widely used packages - work that could reshape how government and infrastructure operators approach cybersecurity.

DARPA announced three competition winners in August 2025. Since then, the agency has created a $1.4 million bonus prize pool for finalists willing to hunt vulnerabilities in important open-source software. Seven teams competed for up to $200,000 each, with a maximum of $10,000 per project they evaluated.

By March 2026, those teams had found 83 vulnerabilities across more than 30 commercial and open-source projects, including Android, Linux, SQLite and Redis. The government awarded $830,000 of the available prize money.

What the tools actually found

Team Atlanta, the competition winner, identified flaws in the U-Boot boot loader and core Apache libraries. A finalist called 42-b3yond-6ug discovered vulnerabilities in the Linux kernel that could have allowed attackers to disable devices embedded throughout critical infrastructure.

Theori, which placed third, deployed its system called Xint to scan widely used open-source projects. The tool found vulnerabilities in Redis, Postgres, MariaDB, Python, Linux and Apple's XNU kernel.

The AI systems proved especially effective at finding "logic bugs" - flawed code that traditional security software wouldn't flag. As these tools improve at understanding context, they can identify problems that conventional scanners miss.

The real advantage may be what happens after a vulnerability is found: the systems can validate their findings and generate patches automatically. That matters because critical infrastructure organizations typically run customized hardware and software, which makes testing patches difficult or impossible. DARPA required competition teams to develop patch-validation capabilities, so their latest systems include this feature.

The adoption problem

DARPA has tried to connect the winning teams with critical infrastructure operators and their vendors. The agency briefed sector coordinating councils and facilitated introductions between vulnerability hunters and infrastructure firms.

Progress has been slow. Theori has signed vulnerability hunting agreements with fewer than five critical infrastructure entities. Many organizations don't understand how the AI systems would work in their environments. Others believe their existing security teams are sufficient. Some are interested but lack the necessary approvals.

"It's been a bit difficult to convince those slower companies and industries to adopt this tech," said Tyler Nighswander, a researcher at Theori.

Trail of Bits achieved better results through a partnership with the Department of Health and Human Services to hunt for flaws in medical devices. That project has fixed many vulnerabilities through strong partnerships with healthcare providers and suppliers.

Even without direct engagement from infrastructure operators, the vulnerability hunters' work will have downstream effects. Infrastructure vendors routinely use lightweight open-source packages in embedded devices. Finding and fixing flaws in those packages helps organizations that never directly contracted with the research teams.

Cost advantage over commercial alternatives

When Anthropic announced Claude Mythos and OpenAI released similar tools for vulnerability detection, the DARPA competition finalists saw them as confirmation that the field was moving in the right direction - not as competition.

The open-source systems have a decisive advantage: cost. Commercial AI tools from major companies can cost tens of thousands of dollars in tokens for a single vulnerability assessment. Using Claude for bug hunting is "kind of like showing up to a fancy restaurant with no prices on the menu," said Trent Brunson, Trail of Bits' director of research and development.

Cash-strapped critical infrastructure firms may choose the DARPA finalists' cheaper but similarly effective services over expensive commercial alternatives. "More companies are going to look at the bottom line," Brunson said, "rather than just throw AI tokens at it."

What comes next

The winning teams have taken different paths since the competition ended. Theori commercialized Xint and now contracts with businesses to evaluate their products. Trail of Bits is focusing primarily on open-source packages, saying that commercializing its tool would fundamentally change the company.

Both teams have had to modify their systems for real-world use. Competition tools were designed to find synthetic flaws that DARPA created. Production systems need to find actual vulnerabilities, generate human-readable reports and handle a wider range of inputs than the structured competition required.

Trail of Bits built an entirely new system to analyze firmware in embedded devices, which is typically available only as compiled binaries rather than source code. Binary analysis is harder for AI because binaries don't resemble natural language the way source code does. Solving that problem, Brunson said, opens a significant opportunity.

The broader shift

Security assessments that once required multiple people working for six months can now be completed by AI in hours, often with better results. "That scale and efficiency is incredible," Nighswander said.

The technology cuts both ways. The same tools that help defenders find vulnerabilities could help attackers if they gain access to the systems. But DARPA views the opportunity as outweighing the risk.

The agency noted that when it ran its first major competition - the self-driving car challenge in 2004 - it took years for the technology to reach the market. With the AI bug-fixing competition, DARPA achieved something it didn't expect: technology that was both technically sound and economically feasible.

"I'm extraordinarily excited at the performance and impact that the technology continues to have," said Andrew Carney, the DARPA program manager overseeing the effort.

For government officials responsible for infrastructure security, the implication is clear: AI-based vulnerability detection is no longer theoretical. It's working now, finding real flaws in real software, and available at a cost that makes adoption practical.


