AI Model Oversight Is Tightening: What Public Agencies Need to Know
OpenAI and Anthropic have agreed to grant the U.S. AI Safety Institute pre-launch access to new models for testing and risk evaluation. The goal: catch harmful behaviors before deployment and keep testing after release. Coordination with the UK AI Safety Institute will help cross-check safety findings.
The U.S. AI Safety Institute, housed within NIST, was established under the October 2023 executive order on AI to lead testing, evaluation, and guidance for responsible AI. This move signals a shift toward pre-deployment checks and shared standards across labs and regulators. Google and others are also in discussions, indicating broader industry alignment with government oversight.
What This Means for Government Teams
- Pre-deployment testing becomes a baseline expectation. Ask vendors for evidence of third-party or Institute reviews.
- Ongoing access for evaluators suggests safety is continuous, not a one-time certification. Plan for updates and re-assessments.
- Cross-border collaboration (US-UK) will influence benchmarks and incident reporting. Align internal policies accordingly.
Regulatory Context
At the federal level, the AI Safety Institute is building methods to evaluate model risks and publish guidance for safe deployment. See the Institute's work at NIST AI Safety Institute.
In California, the proposed SB 1047 would require safety testing for models with training costs over $100 million or that meet defined compute thresholds. It also requires a "full shutdown" capability, a kill switch, to stop a model if behavior goes off-track. The California Attorney General could sue for non-compliance. The bill needs one more vote and the Governor's signature by the end of September to become law.
Why This Matters for Procurement and Oversight
- Contracts: Bake in pre-deployment testing requirements, red-teaming evidence, and commitments to ongoing access for audits.
- Risk tiers: Classify AI systems by impact (public-facing, critical services, sensitive data) and set stricter testing for higher tiers.
- Incident response: Require kill-switch capabilities, abuse monitoring, and timelines for patching risky behaviors.
- Data controls: Specify access logging, data retention limits, and training data provenance disclosures.
- Reporting: Align with federal and state guidance; plan for disclosures that meet US and UK evaluation norms.
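The risk-tier bullet above can be operationalized in something as simple as an inventory script. The sketch below is illustrative only: the impact factors and the two-of-three threshold are assumptions an agency would replace with its own policy, not criteria drawn from any federal or state standard.

```python
from dataclasses import dataclass
from enum import Enum


class RiskTier(Enum):
    LOW = 1       # routine testing
    MODERATE = 2  # third-party evaluation before launch
    HIGH = 3      # pre-deployment red-teaming plus ongoing audit access


@dataclass
class AISystem:
    name: str
    public_facing: bool            # exposed directly to the public
    critical_service: bool         # supports a critical government service
    handles_sensitive_data: bool   # processes PII or other sensitive data


def classify(system: AISystem) -> RiskTier:
    """Assign a tier from the impact factors; stricter testing applies to higher tiers.

    Illustrative rule: two or more impact factors -> HIGH, one -> MODERATE.
    """
    score = sum(
        [system.public_facing, system.critical_service, system.handles_sensitive_data]
    )
    if score >= 2:
        return RiskTier.HIGH
    if score == 1:
        return RiskTier.MODERATE
    return RiskTier.LOW
```

A tiering rule like this can then drive which contract clauses (red-teaming evidence, audit access) apply to a given procurement.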
Coordination Across Jurisdictions
Expect growing alignment between U.S. and U.K. safety work. For reference, see the UK AI Safety Institute. Agencies should track shared evaluation suites and risk thresholds, then reflect them in RFPs and program reviews.
Action Checklist for Public Agencies
- Update RFP templates to require third-party model evaluations and post-deployment monitoring plans.
- Mandate a safety switch for high-impact systems and define activation criteria in contracts.
- Set vendor SLAs for incident notification, model rollbacks, and mitigation timelines.
- Establish an internal review board to vet AI use cases, risks, and compliance evidence before launch.
- Map current systems against potential state rules (e.g., SB 1047) to identify gaps now.
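The gap-mapping step in the checklist above can be sketched as a simple inventory check. The control names below are illustrative assumptions, not statutory or contractual language; an agency would substitute the controls its own RFP templates and applicable rules actually require.

```python
# Illustrative controls drawn from the checklist above; rename to match your
# actual contract clauses and applicable rules.
REQUIRED_CONTROLS = [
    "third_party_eval",   # evidence of pre-deployment evaluation
    "safety_switch",      # kill-switch capability with defined activation criteria
    "incident_sla",       # vendor SLA for incident notification and rollback
    "monitoring_plan",    # post-deployment monitoring plan
]


def find_gaps(inventory: list[dict]) -> dict[str, list[str]]:
    """Return, per system, the required controls it is missing.

    Each inventory entry is a dict with a "name" key and boolean flags for
    each control; a missing flag counts as a gap.
    """
    gaps: dict[str, list[str]] = {}
    for system in inventory:
        missing = [c for c in REQUIRED_CONTROLS if not system.get(c, False)]
        if missing:
            gaps[system["name"]] = missing
    return gaps
```

Running this against a current-systems inventory gives the review board a concrete list of what to remediate before new state rules take effect.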
Upskilling Your Team
If your team is building or procuring AI, targeted training helps operationalize these requirements. Explore role-based options at Complete AI Training - Courses by Job.