Pentagon AI contracts sharpen debate on autonomous weapons and surveillance
Tension between leading AI firms and the US Department of Defense is escalating. At issue: how far military AI can go, what "human oversight" actually means, and whether contract language is strong enough to prevent mass surveillance and fully autonomous weapons.
Anthropic CEO Dario Amodei said his company declined proposed Pentagon terms, warning that reduced safeguards on its Claude model could open the door to domestic mass surveillance or fully autonomous weaponization. The Pentagon rejected that view; spokesperson Sean Parnell said the military does not intend to surveil US citizens or field autonomous weapons without human involvement.
In contrast, OpenAI CEO Sam Altman said his company reached a deal with the Pentagon. He noted the models would run on a classified network with explicit bans on mass surveillance and requirements for human oversight. Legal experts, however, caution that principle-based clauses may not hold up as airtight guardrails in practice.
What's actually in dispute
International law expert Mustafa Tuncer highlights the core uncertainty: key terms sit on shifting ground. "Mass surveillance" is loosely defined in US law, and the Defense Department's definition of autonomous weapons, rooted in policy documents that can change, sets a moving target for compliance.
Under current Defense Department policy, autonomous weapons are systems that, once activated, can select and engage targets without further human input, though they may include self-monitoring and termination features. International humanitarian law does not directly regulate AI; existing weapons rules are applied instead, and practice varies widely by country.
Tuncer notes that statements made outside formal contracts do not change legal exposure if things go wrong. He also points out a hard question for counsel: how binding are peacetime policies and ethics commitments once active combat begins?
The legal backdrop worth a close read
For US practice, the reference point is DoD policy on autonomy in weapon systems. It frames "human involvement" and review processes but leaves room for future updates and interpretation, an obvious risk factor for long-lived contracts.
- DoD Directive 3000.09: Autonomy in Weapon Systems (2023)
- ICRC: Autonomous weapons and international humanitarian law
Why this matters for legal teams
Policy language sounds reassuring until definitions shift or exceptions appear under operational pressure. Classified deployments further complicate verification, discovery, and third-party audit. If your client supplies models, tools, data, or integration services, vague terms create long-tail liability you can't price.
The gap between "intent" and "implementation" is where risk lives. Your job is to close that gap in writing, with objective tests and operational controls, not just promises.
Contract playbook: clauses that reduce risk
- Definitions with teeth
  - Define "mass surveillance" with specific data sources, selectors, retention periods, query constraints, and review thresholds (a machine-checkable sketch appears after this playbook).
  - Define "autonomous weapon system" by reference to functions (target selection and engagement without further human input) and explicitly exclude such systems from scope unless separately authorized.
  - Define "human oversight" (e.g., a trained, accountable operator with real-time intervention authority and documented kill-switch access).
- Use-of-technology boundaries
  - Prohibit domestic surveillance of US persons absent explicit statutory authority, warrant requirements, and independent legal review; bind subcontractors and affiliates.
  - Ban lethal target selection or engagement without a human in the loop unless a new written amendment is executed after risk review.
  - Restrict fine-tuning or tool integration that could enable prohibited functions without prior supplier consent.
- Verification and oversight
  - Pre-deployment test and evaluation protocols; scenario libraries; red-teaming; documented, reviewable pass/fail gates.
  - Operational logging, immutable audit trails, and secure retention; supplier audit rights even on classified systems via cleared third parties (see the hash-chained log sketch after this playbook).
  - Incident reporting with strict timelines, root-cause analysis, and corrective action plans.
- Safety controls
  - Mandate human override, rate limiters, and geofencing where feasible; require verified kill-switch procedures and drills.
  - Require dynamic risk scoring for deployment contexts and automatic degradation to safer modes when confidence drops (sketched after this playbook).
- Data governance
  - Scope data access; prohibit re-identification; set minimization, retention, and deletion obligations; document data lineage (a retention-sweep sketch appears after this playbook).
  - Bar training on operational or citizen data unless expressly allowed; require differential privacy or equivalent controls for any analytics on US-person datasets.
- Legal alignment and change management
  - Warrant compliance with the DoD AI Ethical Principles and applicable directives; map obligations to FAR/DFARS, ITAR/EAR, and privacy laws (e.g., ECPA, FISA, state laws).
  - Include a change-in-law clause with the right to suspend, re-price, or terminate if legal or policy shifts expand risk beyond the agreed scope.
- Security and supply chain
  - Align security controls to NIST SP 800-171/CMMC where applicable; set incident response obligations and clearance handling for classified work.
  - Require a software bill of materials (SBOM), model card transparency, and provenance documentation for datasets and third-party components.
- Liability and remedies
  - Carve prohibited uses, willful misconduct, and breaches of surveillance or autonomy restrictions out of liability caps.
  - Set insurance requirements and indemnities for regulatory actions and third-party claims tied to prohibited uses.
- Governance in practice
  - Establish a joint risk committee with escalation paths and veto rights over deployments that change use cases or risk class.
  - Require periodic independent audits and model drift monitoring, with suspension triggers tied to threshold breaches (a drift-trigger sketch appears after this playbook).
- Transparency and user controls
  - Require operator training, certification, and documented SOPs; assign responsibility clearly (e.g., a RACI matrix).
  - Provide user-facing disclosures and control surfaces for override, logging, and feedback.
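To make "definitions with teeth" concrete, here is a minimal sketch of how the mass-surveillance definition could be expressed as machine-checkable parameters rather than prose. Every field name, threshold, and source label below is an illustrative assumption, not language from any DoD contract or directive.

```python
from dataclasses import dataclass

# Hypothetical sketch: contract definitions expressed as testable parameters.
# All field names and thresholds are illustrative assumptions, not DoD terms.

@dataclass(frozen=True)
class SurveillanceScope:
    permitted_sources: frozenset[str]        # data sources named in the SOW
    requires_individualized_selector: bool   # bars bulk, population-wide queries
    max_retention_days: int                  # deletion obligation
    max_subjects_per_query: int              # contractual review threshold

@dataclass
class QueryRequest:
    source: str
    selector: str | None     # individualized identifier, if any
    subject_count: int
    retention_days: int

def check_query(scope: SurveillanceScope, q: QueryRequest) -> list[str]:
    """Return a list of violations; an empty list means the query is in scope."""
    violations = []
    if q.source not in scope.permitted_sources:
        violations.append(f"source '{q.source}' is not a permitted data source")
    if scope.requires_individualized_selector and not q.selector:
        violations.append("bulk query without an individualized selector")
    if q.subject_count > scope.max_subjects_per_query:
        violations.append("subject count exceeds the contractual review threshold")
    if q.retention_days > scope.max_retention_days:
        violations.append("retention exceeds the agreed maximum")
    return violations

if __name__ == "__main__":
    scope = SurveillanceScope(
        permitted_sources=frozenset({"foreign_signals"}),
        requires_individualized_selector=True,
        max_retention_days=90,
        max_subjects_per_query=10,
    )
    bad = QueryRequest(source="domestic_telecom", selector=None,
                       subject_count=50_000, retention_days=365)
    for v in check_query(scope, bad):
        print("VIOLATION:", v)
```

The point is not the specific numbers but the shape: once the definition is parameterized, compliance becomes something test harnesses and auditors can evaluate, not something lawyers argue about after the fact.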
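For the "immutable audit trail" item, one common construction is an append-only, hash-chained log: each record commits to its predecessor, so any later edit or deletion is detectable by a cleared third party replaying the chain. This is a minimal sketch under that assumption; the record fields are illustrative, not a DoD schema.

```python
import hashlib
import json
import time

GENESIS = "0" * 64  # fixed anchor for the first record

def _digest(prev_hash: str, record: dict) -> str:
    # Canonical serialization so verification is deterministic.
    payload = prev_hash + json.dumps(record, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

class AuditLog:
    """Hypothetical append-only log; record fields are illustrative."""

    def __init__(self):
        self.entries: list[dict] = []

    def append(self, actor: str, action: str, detail: str) -> None:
        prev = self.entries[-1]["hash"] if self.entries else GENESIS
        record = {"ts": time.time(), "actor": actor,
                  "action": action, "detail": detail}
        self.entries.append({"record": record, "hash": _digest(prev, record)})

    def verify(self) -> bool:
        """Recompute the chain; False means the log was altered."""
        prev = GENESIS
        for e in self.entries:
            if _digest(prev, e["record"]) != e["hash"]:
                return False
            prev = e["hash"]
        return True

if __name__ == "__main__":
    log = AuditLog()
    log.append("operator_17", "override", "manual abort of engagement")
    log.append("auditor_03", "export", "quarterly compliance review")
    print("chain intact?", log.verify())          # -> True
    log.entries[0]["record"]["actor"] = "edited"  # simulated tampering
    print("chain intact?", log.verify())          # -> False
```

In practice the chain head would be escrowed or countersigned periodically, which is what gives the supplier's audit right real evidentiary value on a classified system.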
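The "automatic degradation to safer modes" control can also be specified precisely. This sketch assumes a simple scheme in which a deployment is scored and the system steps down to a more restrictive operating mode when model confidence drops or context risk rises; the mode names and thresholds are illustrative assumptions to be negotiated, not standards.

```python
from enum import Enum

class Mode(Enum):
    FULL_ASSIST = 3     # output used, with a human in the loop
    ADVISORY_ONLY = 2   # output shown, no downstream actuation
    SAFE_HOLD = 1       # output suppressed, operator notified

def select_mode(confidence: float, context_risk: float) -> Mode:
    """Pick the least permissive mode the current conditions allow.

    confidence:   model self-reported confidence in [0, 1]
    context_risk: deployment risk score in [0, 1] (geography, mission, etc.)
    Thresholds are hypothetical contract parameters.
    """
    if confidence < 0.5 or context_risk > 0.8:
        return Mode.SAFE_HOLD
    if confidence < 0.8 or context_risk > 0.5:
        return Mode.ADVISORY_ONLY
    return Mode.FULL_ASSIST

if __name__ == "__main__":
    for conf, risk in [(0.95, 0.2), (0.7, 0.3), (0.4, 0.9)]:
        print(f"confidence={conf}, risk={risk} -> {select_mode(conf, risk).name}")
```

Writing the thresholds into the contract turns "degrade when confidence drops" from an intention into a pass/fail condition that test gates and after-action reviews can check.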
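The retention and deletion obligation under data governance is similarly mechanizable. This is a minimal sketch assuming records carry an ingest timestamp and a contractual retention class, with a scheduled sweep that deletes anything past its limit; the class names and periods are invented for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention classes; real periods would come from the contract.
RETENTION = {"operational": timedelta(days=30),
             "evaluation": timedelta(days=90)}

def sweep(records: list[dict], now: datetime | None = None) -> tuple[list[dict], list[str]]:
    """Return (retained records, ids of deleted records)."""
    now = now or datetime.now(timezone.utc)
    kept, deleted = [], []
    for r in records:
        if now - r["ingested"] > RETENTION[r["class"]]:
            deleted.append(r["id"])  # a real system would also write the audit log
        else:
            kept.append(r)
    return kept, deleted

if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    records = [
        {"id": "a1", "class": "operational", "ingested": now - timedelta(days=45)},
        {"id": "b2", "class": "evaluation", "ingested": now - timedelta(days=10)},
    ]
    kept, deleted = sweep(records, now)
    print("kept:", [r["id"] for r in kept], "| deleted:", deleted)
```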
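Finally, the "drift monitoring with suspension triggers" governance item: one simple construction compares a live metric window against an accepted baseline and fires automatically when the agreed threshold is breached. The metric, window size, and threshold below are illustrative contract parameters, not a standard method.

```python
import statistics

def drift_score(baseline: list[float], live: list[float]) -> float:
    """Absolute shift in mean, in units of baseline standard deviations."""
    mu, sigma = statistics.mean(baseline), statistics.stdev(baseline)
    return abs(statistics.mean(live) - mu) / sigma if sigma else float("inf")

def suspension_triggered(baseline: list[float], live: list[float],
                         threshold: float = 3.0) -> bool:
    """True means the contractual suspension trigger fires."""
    return drift_score(baseline, live) > threshold

if __name__ == "__main__":
    baseline = [0.91, 0.93, 0.92, 0.90, 0.94]  # accepted evaluation accuracy
    live = [0.81, 0.79, 0.83, 0.80, 0.78]      # post-deployment window
    print("drift score:", round(drift_score(baseline, live), 2))
    print("suspend?", suspension_triggered(baseline, live))
```

Tying suspension to a numeric trigger rather than a committee debate is what gives the clause bite: the deployment pauses first, and the joint risk committee decides whether it resumes.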
Open questions to flag for clients
- How will "human oversight" be evidenced in real time and after-action review, especially on a classified network?
- What happens if the DoD revises the autonomous weapons directive, or adopts broader definitions that touch current systems?
- Which laws and forums govern disputes tied to overseas deployments, coalition operations, or covert activities?
- How are exceptions in active combat documented, and who signs off?
Bottom line
Intent is not a control. For counsel, the priority is specific definitions, measurable safeguards, and enforceable oversight with audit rights. Build for change: assume definitions and policies will shift, and give your client the contractual levers to pause, adapt, or exit without absorbing unbounded risk.
For deeper practical resources on clause design, risk allocation, and compliance checklists, see AI for Legal. For public-sector governance insights related to procurement and oversight, explore AI for Government.