Military AI Policy by Contract: The Limits of Procurement as Governance
The United States is leaning on a flexible but fragile model for military AI oversight: regulation by contract. The Anthropic-Pentagon dispute and OpenAI's rushed deal didn't create this problem-they exposed it.
When policy gets outsourced to deal terms, the government trades public law for private leverage. That may move fast, but it rarely holds up when the stakes are domestic surveillance, autonomous targeting, or intelligence oversight.
What "regulation by contract" really means
Instead of clear statutes and durable policy, rules are being set inside bilateral agreements between agencies and vendors. Those deals bind only the parties, shift interpretive power to whoever controls the system in production, and are enforced after the fact-if at all.
In practice, the real gatekeepers are technical controls the vendor maintains once models are deployed or embedded in another contractor's platform. That is not a governance system for life-and-death decisions or rights-sensitive surveillance.
Why this crisis erupted
In January, the Defense Department ordered "any lawful use" language across AI procurements and pushed to remove vendor usage constraints-both contractual and technical. Speed first; guardrails later. The General Services Administration then floated a similar standard for civilian buys.
Anthropic's existing red lines (no mass domestic surveillance, no fully autonomous weapons) collided with that new posture. After Anthropic refused to drop them, the Pentagon designated the company a supply chain risk to national security, while Claude reportedly continued to support operations through a prime integrator on classified networks. Litigation followed, and the pressure campaign spilled into public view.
The bargaining environment: "any lawful use"
Once "any lawful use" becomes the baseline, oversight turns into an exception you must negotiate back in-and defend during operations. That framing shaped OpenAI's agreement: it ties sensitive use cases (autonomy, surveillance, intelligence activity) to external legal regimes while leaving interpretation with the government.
Two words matter. First, "consistent with applicable laws" sounds strict, but it defers to the government's live interpretation. If that shifts, enforcement arrives late, if at all. Second, "intentionally" (as in "not intentionally used for domestic surveillance") narrows the bar. Without a definition that covers incidental collection and person-specific outputs, the label on the task-not the effect-can decide whether the use is "allowed."
Contracts don't govern in real time
Federal contracts are not ordinary commercial agreements. The government can direct changes or terminate for convenience. Under FAR-based contracts, the Contract Disputes Act gives contractors a path to money damages-but performance usually continues during disputes.
With Other Transaction (OT) agreements, the Contract Disputes Act doesn't apply unless the parties write in similar procedures. Rights and remedies are whatever you negotiated, and they often sit across multiple layers when an integrator deploys the model. By the time a vendor objects or threatens termination, the operation has already run.
In national security contexts, even "pull the plug" may be unrealistic. Commanders dependent on a tool will reach for government authorities to keep it online until a replacement arrives.
Safety stacks help-but they aren't law
Vendors point to cloud-only deployments, classifiers, and refusal behavior as the last line of defense. Useful-until they collide with a DoD directive to "utilize models free from usage policy constraints that may limit lawful military applications."
That signals a clear preference: the mission decides; vendor safety policies should not. Even when tolerated, safety stacks are contingent-especially if a competitor offers fewer limits.
Procurement vehicles matter more than press releases
- FAR-based contracts bring default clauses, dispute processes, and some predictability. See the Federal Acquisition Regulation at acquisition.gov.
- OT agreements trade speed and flexibility for thinner guardrails. See DoD's overview of Other Transactions on darpa.mil.
- If the model rides through a prime integrator, critical limits may live in a separate commercial agreement you don't see-and can't enforce directly.
What government leaders can do now
- Pick the right vehicle. If you need enforceable limits, think hard before defaulting to an OT without CDA-like dispute language, audit rights, and clear remedies.
- Define terms in operational, testable ways. Spell out "domestic surveillance," "target selection," "autonomy," and "incidental collection" with concrete examples and edge cases.
- Move enforcement into the workflow. Require event-level controls: human-in-the-loop checkpoints, auditable logs, alerting for sensitive outputs, and pause authority at the program level.
- Bind what matters across layers. Flow down key restrictions to primes and subs; require integrators to expose enforcement hooks, logs, and model metadata to the government.
- Freeze and govern changes. Lock model versions and safety configurations for production use; require written approval and test evidence before any update.
- Plan for disagreement. Specify who interprets contested clauses in operations, who can pause, and how quickly escalation occurs (hours, not weeks).
- Separate missions. Restrict use by certain components (e.g., intelligence agencies) unless a new agreement is executed and independently reviewed.
- Test like you'll deploy. Red-team against prohibited use cases; validate that technical and process controls actually block them under realistic load.
- Train the team. Contracting officers, program managers, and counsel need a shared playbook for AI-specific risks. See AI for Government and the AI Learning Path for Procurement Specialists.
What belongs to Congress and the courts
Some questions should not live inside individual deals. Congress and the courts must clarify the limits of domestic AI-enabled surveillance, acceptable degrees of autonomy in force application, and the oversight rules for intelligence use of general-purpose models.
Until that happens, "any lawful use" will mean exactly what the executive interprets in the moment-and contracts will be a poor substitute for public law.
The bottom line
Procurement can buy capability; it cannot stand in for democratic governance. If you rely on contracts to draw red lines, expect those lines to move. Put enforceable controls in the workflow, choose your vehicles with intent, and push the hardest questions back to where they belong: statute, regulation, and independent oversight.
Your membership also unlocks: