Military AI Policy Needs Democratic Oversight
A standoff between the Department of Defense (DOD) and Anthropic has turned into a test case for who sets the rules for military AI. The fight is bigger than one contract. It asks a core question: do guardrails come from the executive branch, private vendors, or Congress and the public process?
According to reports, Defense Secretary Pete Hegseth gave Anthropic CEO Dario Amodei a deadline to open the company's models to unrestricted DOD use. After the company refused - citing two red lines: no domestic surveillance of Americans and no fully autonomous targeting - the administration moved to label Anthropic a supply chain risk and ordered agencies to phase out its technology. That turns a procurement dispute into coercive leverage.
What's actually at issue
- Domestic surveillance: Anthropic's refusal aligns with established civil liberties constraints. Government compliance should be enforced by law and oversight, but technical guardrails can reinforce those limits in practice. See the Privacy Act of 1974 for the baseline.
- Autonomous targeting: The DOD already requires human judgment in the use of force and has updated policy on autonomy in weapons. The debate is active and nuanced, which is exactly why it should be governed by doctrine and statute, not ad hoc deals. Reference: DOD update to Directive 3000.09 (Autonomy in Weapons Systems).
Procurement vs. pressure
In a normal market, the government buys what it needs, and companies sell what they're comfortable providing. If there's no match, both move on. That symmetry breaks when "supply chain risk" tools intended for true security threats are used to punish a domestic firm for its contractual terms.
Blanket bans that ripple across the federal ecosystem will invite legal backlash and chill participation. They also push vendors to weaken safeguards to stay eligible - the opposite of what you want with high-consequence systems.
The core governance problem
Two institutions are clashing: law and code. The DOD argues lawful use should be controlled by government, not baked into a vendor's model. Vendors argue some constraints must live in the technology to prevent misuse, error, or escalation in real time.
Both are partly right. Layered oversight is standard in other high-risk domains: legal review, auditing, internal controls, and technical fail-safes work together. AI should be no different.
What Congress should do now
- Codify boundaries for military AI use, including prohibitions on domestic surveillance of U.S. persons without statutory authority and clear thresholds for human control in targeting.
- Require independent test, evaluation, verification, and validation (TEVV) for mission use, including red-teaming against misuse and escalation scenarios.
- Mandate event logging, auditability, and post-action review for AI-enabled operations; set retention and access rules.
- Clarify when "supply chain risk" authorities apply and prohibit their use to enforce negotiated deployment terms absent a bona fide security threat.
- Require regular reporting to defense committees on AI deployments, incidents, overrides of safeguards, and corrective actions.
- Fund joint test ranges and data-sharing frameworks for AI safety evaluation across services and labs.
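The logging and auditability mandate above has a well-understood technical core: an append-only, tamper-evident record. Here is a minimal illustrative sketch (not a fielded design) of a hash-chained audit log, where each entry's hash covers the previous entry's hash, so any retroactive edit is detectable at post-action review:

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class AuditLog:
    """Append-only event log. Each entry's hash chains over the previous
    entry's hash, so altering or deleting any past event breaks the chain."""
    entries: List[dict] = field(default_factory=list)

    def append(self, event: dict) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        payload = json.dumps(event, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        self.entries.append(
            {"event": event, "prev_hash": prev_hash, "hash": entry_hash}
        )
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain from the start; False means tampering."""
        prev = "0" * 64
        for entry in self.entries:
            payload = json.dumps(entry["event"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if entry["prev_hash"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True
```

A real deployment would add signing, replication, and retention controls, but even this skeleton shows why "immutable logging" is a concrete, testable requirement rather than aspirational language.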
To build internal capacity quickly, see our AI Learning Path for Policy Makers.
What DOD leadership should publish
- A detailed doctrine on human judgment in the use of force: define "meaningful human control," decision timelines, fail-safe triggers, and authorized overrides.
- A model approval framework: baseline evaluations, mission-specific certification, continuous monitoring, and incident reporting.
- Acquisition language that welcomes vendor safeguards as complementary - not determinative - guardrails, with a process for government overrides under strict audit.
- An escalation path to resolve vendor-government disagreements before any blacklisting action.
Guidance for CIOs, acquisition, and program managers
- Require vendors to disclose model safeguards, override mechanisms, and logging by default. No "black box" deployments in operational contexts.
- Map model capabilities to mission risks. If the mission cannot tolerate model behavior limits, select or fine-tune systems designed for that risk profile - and document the rationale.
- Write contracts that separate lawful use authority (government) from safety engineering (vendor) with clear interfaces, audit trails, and redress.
- Adopt the NIST AI Risk Management Framework internally and require alignment from vendors; verify through independent assessment.
- Plan for redundancy: approved fallback models, human fallback procedures, and circuit-breakers for escalation.
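The redundancy bullet above can be made concrete with a standard circuit-breaker pattern. The sketch below is illustrative only (the class and parameter names are invented for this example): after repeated primary-model failures, traffic routes to an approved fallback - which could be a second certified model or a human review queue - until a cooldown elapses:

```python
import time


class ModelCircuitBreaker:
    """Routes requests to a primary model callable; after `max_failures`
    consecutive failures the breaker opens and all traffic goes to the
    approved fallback until `cooldown` seconds pass."""

    def __init__(self, primary, fallback, max_failures=3, cooldown=60.0):
        self.primary = primary
        self.fallback = fallback
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None  # monotonic timestamp when breaker opened

    def call(self, request):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                return self.fallback(request)  # breaker open: use fallback
            self.opened_at = None  # cooldown elapsed: retry primary
            self.failures = 0
        try:
            result = self.primary(request)
            self.failures = 0  # success resets the failure count
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            return self.fallback(request)
```

The design choice that matters for policy is that the fallback path is pre-approved and auditable, not improvised under pressure.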
Sample procurement language (adapt as needed)
- The Government retains sole authority to determine lawful use and compliance with applicable statutes, regulations, and rules of engagement.
- The Contractor will implement technical safeguards aligned to the system risk profile, including rejection of prohibited tasks, rate limiting, provenance tracking, and immutable logging.
- Safeguards must support authorized Government overrides under defined conditions with recorded justification and after-action review.
- Deployment is contingent on independent TEVV results meeting minimum performance and safety thresholds specified in the task order.
- Material changes to model behavior or safeguards require Government notification, delta testing, and approval prior to operational use.
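The override clause above - authorized overrides under defined conditions with recorded justification - translates directly into an access-control gate. This is a hypothetical sketch (the role names and function are invented for illustration) showing the key property: every attempt, granted or denied, leaves a record for after-action review:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical roles permitted to authorize a safeguard override.
AUTHORIZED_ROLES = {"mission_commander", "designated_override_authority"}


@dataclass(frozen=True)
class OverrideRecord:
    operator: str
    role: str
    safeguard: str
    justification: str
    timestamp: str


def request_override(operator, role, safeguard, justification, log):
    """Grant an override only to an authorized role with a non-empty written
    justification. Every attempt is appended to `log`, granted or not."""
    granted = role in AUTHORIZED_ROLES and bool(justification.strip())
    record = OverrideRecord(
        operator=operator,
        role=role,
        safeguard=safeguard,
        justification=justification,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
    log.append({"record": record, "granted": granted})
    return granted
```

Note that the denial path is logged too: oversight bodies learn as much from refused overrides as from granted ones.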
Oversight questions for briefings and hearings
- What specific mission needs are impeded by vendor safeguards, and what alternatives were evaluated?
- How does the program define and verify "meaningful human control" for each kill chain phase?
- What is the override process, who can authorize it, and how is it logged and reviewed?
- What incidents, near misses, or model failures have occurred during testing or operations, and what changed afterward?
- How do you detect and respond to model degradation, drift, or adversarial manipulation?
- What is the plan for contested environments with degraded comms where human review may be delayed?
- How do you prevent domestic surveillance of U.S. persons and enforce minimization and auditing?
- Under what conditions would you seek to debar or blacklist a vendor, and what due process is guaranteed?
Why this matters for every agency
If guardrails can be removed through contract pressure, they become negotiable. If they are grounded in law and doctrine, they become stable expectations for everyone - government, industry, and the public.
Bottom line: The DOD should keep authority over lawful use. Vendors should keep responsibility for safety engineering. Congress should set the boundaries and require layered oversight. That balance delivers capability without sacrificing democratic control.
For more public-sector guidance, explore AI for Government.