Letting AI design experiments is playing with fire

Tests of 19 models show confident answers that miss lab hazards. Use AI for ideas, not protocols: demand expert review, hazard lists, citations, and SOP sign-off.

Published on: Jan 16, 2026

AI can miss lab hazards - here's how to protect your team

Recent tests of 19 current AI models found a consistent pattern: every model made safety mistakes that could lead to fire, explosion, or poisoning. They look competent, speak confidently, and still skip basic precautions. Some models did little better than chance when asked to spot hazards in routine experimental prompts.

We've all seen how small oversights in a lab can spiral. In 1997, chemist Karen Wetterhahn died of mercury poisoning after drops of dimethylmercury permeated her latex gloves. In 2016, a lab explosion cost a researcher an arm. In 2014, another scientist was partially blinded. These incidents weren't caused by AI, but they remind us of the stakes when procedures are incomplete or assumptions go unchecked.

What the tests actually show

Researchers posed hundreds of lab- and safety-focused questions to these models. None recognized all hazards. They missed incompatible reagents, ventilation needs, PPE requirements, and waste steps. Worse, many produced convincing protocols that read well but would set someone up for harm.

That's the core problem: fluent answers, shallow safety. If you've ever had a junior student write a confident but wrong method section, you already understand the failure mode.

Why smart people still get fooled

Large language models predict plausible text. That's different from situational awareness in a lab with live variables and edge cases. They regularly omit context that a trained chemist or biologist would flag in seconds. Overconfidence and omissions are a bad mix around exotherms, toxics, pressure, and ignition sources.

Practical guardrails for scientists and lab managers

  • Treat AI as ideation, not execution. Brainstorm mechanisms, references, or high-level risk factors. Do not run AI-generated protocols without expert review.
  • Ban step-by-step instructions without sign-off. Any AI-suggested procedure, volume, temperature, pressure, or atmosphere needs PI or safety officer approval.
  • Force hazard-first prompts. Ask explicitly: "List hazards, PPE, ventilation, segregation, and waste requirements." Then verify with SDS and institutional SOPs.
  • Require citations and verify them. No citations, no action. Cross-check with primary sources such as Prudent Practices and ACS safety pages.
  • Use an allowlist. Permit literature search, summarization, and draft risk assessments; block synthesis steps, conditions, or scale-up advice (a minimal sketch of such a gate follows this list).
  • Two-person rule for changes. Any AI-informed modification to an existing SOP needs a second qualified reviewer.
  • Red-team your models monthly. Test them on your lab's specific hazards. Log failures and share examples during safety meetings.
  • Integrate a safety checklist in your electronic lab notebook (ELN). Every AI-assisted entry must include PPE, engineering controls, incompatibilities, and waste plan - with human initials.
  • Prefer retrieval over generation. If you use AI, bias it toward pulling your approved SOPs and policies rather than inventing new steps.
  • Track near misses involving AI. Treat them like incident reports. Trends will tell you where to tighten controls.
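
To make the hazard-first and allowlist items concrete, here is a minimal Python sketch of a gate you might place in front of an internal AI assistant. The task categories, keyword list, and prompt wording are illustrative assumptions, not a vetted safety control; adapt them to your own SOPs and have your safety officer review them.

```python
# Minimal sketch of an allowlist gate plus a hazard-first prompt prefix.
# All categories, keywords, and wording are illustrative assumptions;
# adapt them to your institution's SOPs before using anything like this.

ALLOWED_TASKS = {"literature_search", "summarization", "draft_risk_assessment"}
BLOCKED_TASKS = {"synthesis_steps", "reaction_conditions", "scale_up"}

# Words that suggest a request is drifting into executable-protocol territory.
PROTOCOL_KEYWORDS = ("add ", "heat to", "reflux", "pressurize", "quench", " ml", "°c")

HAZARD_FIRST_PREFIX = (
    "Before anything else, list hazards, required PPE, ventilation, "
    "segregation, and waste requirements, each with a citation to a "
    "primary source. Do not provide step-by-step procedures."
)

def gate_request(task: str, prompt: str) -> str:
    """Reject blocked task types and force a hazard-first framing."""
    if task in BLOCKED_TASKS or task not in ALLOWED_TASKS:
        raise PermissionError(f"Task '{task}' is not on the lab's AI allowlist.")
    if any(kw in prompt.lower() for kw in PROTOCOL_KEYWORDS):
        raise PermissionError("Prompt looks like an executable protocol; route to PI review.")
    return f"{HAZARD_FIRST_PREFIX}\n\n{prompt}"

# Example: an allowed task gets the hazard-first framing prepended.
print(gate_request("draft_risk_assessment",
                   "Summarize the main risks of working with pyrophoric reagents."))
```

Even with a gate like this, every output still goes through SDS checks, institutional SOPs, and the two-person review described above.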

Copy-and-paste policy for your lab handbook

  • AI tools may be used for literature search, summarization, non-prescriptive brainstorming, and draft risk identification.
  • AI must not generate executable protocols, reagent amounts, temperatures, pressures, or atmospheres without written PI approval.
  • Hazard controls (PPE, ventilation, segregation, waste) are determined by SDS and institutional SOPs - not AI outputs.
  • Restricted procedures (e.g., energetic materials, highly toxic substances, pressurized systems) are excluded from AI assistance.
  • All AI prompts and outputs used in experimental planning must be archived in the ELN, with reviewer sign-off (one possible record layout is sketched below).
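
To make the archiving requirement concrete, the sketch below shows one possible shape for an ELN record of an AI-assisted planning entry. The field names are assumptions rather than any standard ELN schema; the point is that the hazard checklist and the reviewer sign-off travel with the exact prompt and the unedited output.

```python
# One possible ELN record for an AI-assisted planning entry.
# Field names are illustrative assumptions, not a standard ELN schema.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class AIAssistedEntry:
    prompt: str                      # exact prompt sent to the model
    model_output: str                # full, unedited response
    ppe: list[str]                   # from SDS and institutional SOPs, not AI output
    engineering_controls: list[str]  # e.g., fume hood, blast shield
    incompatibilities: list[str]     # from SDS
    waste_plan: str                  # per institutional waste SOP
    reviewer_initials: str           # the human who signed off
    review_date: date
    sources_checked: list[str] = field(default_factory=list)  # SDS and SOP identifiers

entry = AIAssistedEntry(
    prompt="List hazards, PPE, ventilation, segregation, and waste requirements for <reagent>.",
    model_output="<full model response pasted verbatim>",
    ppe=["<from SDS>"],
    engineering_controls=["<per SOP>"],
    incompatibilities=["<from SDS>"],
    waste_plan="<per institutional waste SOP>",
    reviewer_initials="<initials>",
    review_date=date.today(),
    sources_checked=["<SDS reference>", "<SOP ID>"],
)
```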

Authoritative safety references

  • Prudent Practices in the Laboratory: Handling and Management of Chemical Hazards (National Academies Press)
  • American Chemical Society (ACS) laboratory safety guidelines and resources

Building safer AI use across your team

If your group is experimenting with AI for planning or documentation, invest in shared standards and skills. Clear prompts, citation rules, and review checkpoints reduce risk and save time.

For structured learning on safer prompting and workflow design, see this resource: Prompt Engineering basics and use cases.

Bottom line

AI can speed thinking, but it can't be trusted with decisions that carry physical risk. Keep humans - and your SOPs - in the driver's seat. One missed hazard is one too many.

