
'Up to 100%' of AI-Crafted Toxins Escaped DNA Screens, Microsoft-Led Team Finds
AI-driven protein design can paraphrase toxin sequences in ways that slip past DNA order screening. A Microsoft-led red team found that some AI-generated ricin variants evaded detection entirely. Published in the journal Science, the study shows how current safeguards can fail against AI-designed biological variants.
Microsoft tested whether AI could fool DNA screens
Beginning in late 2023, Eric Horvitz and Bruce Wittmann led a red-team exercise to probe DNA screening weaknesses. Using open-source protein design models, the team digitally reformulated 72 legally controlled proteins, including ricin, botulinum, and Shiga toxins. They produced over 70,000 synthetic DNA sequences and ran them through the same biosecurity software used by DNA synthesis companies. None of the sequences were manufactured.
The first AI-biosecurity "zero day"
Native toxin sequences were often flagged, but AI-altered variants commonly slipped through. For some ricin-like variants, detection dropped to zero. One platform flagged around 23% of toxic variants; another missed more than three-quarters. After the team raised the alarm, vendors shipped updates that lifted average detection to about 72%, with nearly all of the highest-risk designs now caught. Researchers framed this as the first "zero day" spanning AI and biosecurity.
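The percentages above are detection rates: the share of tested variants that a given screening tool flagged. As a minimal sketch of that arithmetic, assuming each test outcome is recorded as a simple flagged/not-flagged result, the rate can be computed as below. The tool label, variant identifiers, and counts are invented for illustration and are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenResult:
    tool: str        # hypothetical label for a screening tool
    variant_id: str  # opaque identifier only; no sequence data is stored
    flagged: bool    # True if the tool flagged this variant


def detection_rate(results: list[ScreenResult], tool: str) -> float:
    """Return the fraction of one tool's tested variants that it flagged."""
    rows = [r for r in results if r.tool == tool]
    if not rows:
        raise ValueError(f"no results recorded for tool {tool!r}")
    return sum(r.flagged for r in rows) / len(rows)


# Invented example: 23 of 100 variants flagged by a hypothetical "tool_a".
results = [ScreenResult("tool_a", f"v{i}", i < 23) for i in range(100)]
print(f"{detection_rate(results, 'tool_a'):.0%}")  # prints "23%"
```

Tracking this figure per tool, before and after each screening update, is one simple way to see whether a patch actually closed a gap.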
Study context: see the journal Science. For industry screening standards, see the International Gene Synthesis Consortium (IGSC).
Why this matters for labs and vendors
The core issue is functional mimicry: small changes guided by AI can preserve toxic behavior while bypassing sequence-based checks. Screening at order fulfillment is necessary but insufficient on its own. Safeguards also need to live inside AI tools that can generate biological designs. Gaps persist because some DNA vendors still do not screen orders, creating avoidable exposure.
Action checklist for research leaders and AI tool builders
- Adopt layered screening that combines sequence-based checks with model-informed functional risk assessments at multiple points in the workflow.
- Require suppliers to use IGSC-aligned screening and obtain documented attestations for every order.
- Gate advanced model capabilities behind documented use policies, user verification, and human-in-the-loop review for sensitive queries.
- Run regular red-team exercises with coordinated disclosure to vendors and maintain a rapid patch process for screening updates.
- Log and audit model interactions relevant to bioactivity design, with privacy-conscious anomaly detection for misuse patterns.
- Limit training and fine-tuning data that directly encode hazardous functions, and avoid releasing weights that enable unrestricted design.
- Establish clear escalation paths with institutional biosafety committees and security teams for ambiguous cases.
- Participate in shared benchmarks and interlaboratory evaluations to measure screening performance against AI-generated variants.
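The layered screening item in the checklist can be sketched as a small triage rule. This is an illustrative sketch only, under stated assumptions: `sequence_hit` stands for the outcome of an existing sequence-based check, `function_risk` is a hypothetical score between 0 and 1 from some model-informed functional assessment, and both thresholds are placeholder values rather than recommended settings.

```python
from enum import Enum


class Decision(Enum):
    CLEAR = "clear"
    REVIEW = "human_review"
    HOLD = "hold"


def triage(
    sequence_hit: bool,
    function_risk: float,
    review_threshold: float = 0.3,  # placeholder value, not a recommendation
    hold_threshold: float = 0.8,    # placeholder value, not a recommendation
) -> Decision:
    """Combine a sequence-based flag with a hypothetical functional risk score."""
    if not 0.0 <= function_risk <= 1.0:
        raise ValueError("function_risk must be between 0 and 1")
    # Either layer alone is enough to hold an order for follow-up.
    if sequence_hit or function_risk >= hold_threshold:
        return Decision.HOLD
    # Mid-range scores go to a human reviewer instead of being auto-cleared.
    if function_risk >= review_threshold:
        return Decision.REVIEW
    return Decision.CLEAR


print(triage(sequence_hit=False, function_risk=0.5).value)  # prints "human_review"
```

The design choice shown is that the two layers are combined with "or": an order with no sequence match can still be held or routed to human review when the functional score is elevated, which is the gap the checklist item is meant to cover.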
Open questions to track
- How quickly can screening benchmarks incorporate new AI design techniques without leaking misuse detail?
- What regulatory floor will require universal screening across all synthesis providers?
- Which combinations of sequence, structure, and function prediction best balance sensitivity with false positives?
- How should the community handle responsible release and access control for increasingly capable protein design models?