How to buy AI for government: a practical playbook
Agencies want results from AI, but the first real test is buying the right tools. A new guide from the Open Contracting Partnership (OCP) breaks down how public buyers can build "AI readiness" and avoid common pitfalls seen across more than 50 interviews with procurement leaders worldwide.
Three patterns stood out: off-the-shelf AI is leading adoption, centralized buying and bulk licenses are reshaping the market, and many tools are entering government use without a formal procurement. As one researcher put it, "AI adoption and AI procurement are diverging."
What "AI readiness" means in government terms
OCP defines AI readiness as the ability to assess risk, define purpose-specific use cases, and keep human oversight, auditability, and accountability in place throughout the lifecycle. In short: clear intent, controls that work, and a way to prove it.
- Governance first: agree on decision rights, risk thresholds, and review gates before pilots start.
- Dedicated ownership: appoint an AI lead to align programs and vendors. Several states have already created AI officer roles.
- Data controls: document what data can be used, where it's processed, and how it's anonymized or redacted.
- Human-in-the-loop: define when people review, override, or halt outputs.
- Auditability: keep logs, versioning, and clear model/config documentation (see the sketch after this list).
- Measurement: set success criteria up front and track impact against baselines.
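To make the auditability and human-in-the-loop items concrete, here is a minimal sketch of what a per-decision audit record could look like. The schema, field names, and the append_to_audit_log helper are illustrative assumptions, not a prescribed standard; a real deployment would map these fields onto its existing case-management and logging systems.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import json

@dataclass
class AIDecisionRecord:
    """One auditable record per AI-assisted decision (hypothetical schema)."""
    use_case: str          # the approved, purpose-specific use case
    model_id: str          # model name and version actually used
    config_version: str    # prompt/guardrail configuration identifier
    input_summary: str     # redacted description of the input, never raw data
    output_summary: str    # redacted description of the output
    human_reviewer: str    # who reviewed, overrode, or approved the output
    overridden: bool       # whether a person changed the AI's output
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def append_to_audit_log(record: AIDecisionRecord, path: str = "audit_log.jsonl") -> None:
    """Append the record as one JSON line so reviewers can replay the history."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record.__dict__) + "\n")
```

Appending one JSON line per decision keeps the evidence trail easy to hand to auditors without standing up new infrastructure first.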
The three trends shaping public-sector AI buying
- Off-the-shelf dominates: Chatbots and copilots bundled with productivity suites are getting the most use. It's low friction, but impact is often modest without training, guardrails, and purpose-built workflows.
- Centralized procurement + bulk licenses: Faster deals and better pricing, but watch for lock-in, uneven adoption across agencies, and gaps in oversight.
- Shadow adoption without procurement: Free trials, academic partnerships, and "pilot by partnership" can bypass standard checks. Benefits are speed and learning; risks are security, IP, and compliance.
Practical moves that work
- Find your allies: Procurement, security, privacy, records, legal, and program leads. Build a fast review lane for low-risk pilots.
- Aim for shorter contracts: Structure work in 6- to 12-month phases with clear exit ramps tied to performance.
- Standardize deal terms: Use model clauses for audit, access to logs, incident response, data residency, rights to training data, and service levels.
- Clarify data handling: What enters the system, how it's anonymized/redacted, where it's stored, and who can access it (see the redaction sketch after this list).
- Set evaluation gates: Define metrics, test sets, and user acceptance criteria before purchase.
- Budget for training: Tools underperform if staff aren't trained. Treat enablement as part of the product.
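As a concrete illustration of the data-handling point above, here is a minimal sketch of a pre-ingestion redaction step. The patterns are simple, US-centric regular expressions chosen for illustration only; a production pipeline would rely on a vetted PII-detection tool and locale-appropriate rules.

```python
import re

# Illustrative patterns only; production systems should use a vetted
# PII-detection library or service and cover locale-specific formats.
REDACTION_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace likely personal identifiers before text reaches an AI vendor."""
    for label, pattern in REDACTION_PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

print(redact("Reach the claimant at jane.doe@example.gov or 555-867-5309."))
# -> Reach the claimant at [EMAIL REDACTED] or [PHONE REDACTED].
```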
Questions every RFP or pilot should answer
- What is the specific use case and measurable outcome we expect?
- How will IP and copyright for AI-generated content be handled?
- What processes are in place for anonymization or redaction before data enters the system?
- Which model(s) will be used? Why are they suitable for this context?
- What guardrails, toxicity filters, and bias checks are built in?
- How is human oversight triggered and recorded?
- What logs and evidence will be available for audits and discovery?
- How will we measure accuracy, cost, and time saved against a baseline?
- What happens to our data and configurations at contract end?
Myths to drop
- "Open source is always best." It offers transparency and flexibility, but suitability depends on security needs, support, TCO, and internal skills.
- "One model can do everything." Different models excel at different tasks. Fine-tuning, prompt strategy, and guardrails are what make them safe and useful in context.
Off-the-shelf vs. custom: where the gains are
Many agencies are testing the AI that comes with tools they already pay for. That's prudent and politically safer, but the wins are usually small without targeted workflows and training.
Bigger productivity gains tend to come from custom tools built for specific tasks, such as claims triage, case summarization, policy tracking, and inspections scheduling, once governance, data access, and user adoption are in place.
Avoid "shadow AI" procurement
Partnership pilots and free trials can sneak in without oversight. Treat them like any other tech exposure.
- Require a simple intake form for every AI tool, trial, or partnership.
- Run a quick legal/security review for data sharing and IP terms.
- Log all pilots in a central register with owners, datasets, and end dates.
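A central register does not need to be elaborate to work. The sketch below shows one possible minimal schema; the field names and CSV storage are assumptions for illustration, and most agencies would implement the same record in whatever tracking system they already use.

```python
import csv
import os
from dataclasses import dataclass, asdict
from datetime import date

@dataclass
class PilotRegistration:
    """One row in a central AI register (hypothetical minimal schema)."""
    tool_name: str
    vendor: str
    owner: str                  # accountable program or person
    datasets_used: str          # which data the tool is allowed to touch
    legal_review_done: bool
    security_review_done: bool
    start_date: date
    end_date: date              # every trial gets an explicit end date

def register_pilot(entry: PilotRegistration, path: str = "ai_register.csv") -> None:
    """Append the entry to a shared CSV so no pilot runs off the books."""
    row = asdict(entry)
    write_header = not os.path.exists(path)
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=row.keys())
        if write_header:
            writer.writeheader()
        writer.writerow(row)

# Example entry for a free-trial tool that would otherwise slip past procurement.
register_pilot(PilotRegistration(
    tool_name="Meeting summarizer (free trial)",
    vendor="Example Vendor",
    owner="Records Division",
    datasets_used="Public meeting transcripts only",
    legal_review_done=True,
    security_review_done=True,
    start_date=date(2025, 3, 1),
    end_date=date(2025, 5, 30),
))
```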
A 90-day starter plan
- Weeks 1-2: Pick two high-value, low-risk use cases. Write success metrics and red lines.
- Weeks 3-4: Inventory data sources. Decide what can be used, how it's redacted, and where it's stored.
- Weeks 5-6: Approve standard clauses and a pilot playbook (goals, metrics, logs, exit criteria).
- Weeks 7-8: Launch two small pilots with different vendors. Track accuracy, time saved, and issues.
- Weeks 9-10: Train 50 end users. Capture adoption blockers and revise workflows.
- Weeks 11-12: Keep the top performer, end the rest, and negotiate a short phase-two contract.
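For weeks 7 through 12, the scorecard math can stay simple. The sketch below compares two hypothetical pilots against a pre-AI baseline using placeholder numbers; the metrics mirror the success criteria written in weeks 1 and 2, and all figures should be replaced with your own measurements.

```python
# Hypothetical scorecard comparing pilots against the pre-AI baseline.
# All figures are placeholders; substitute measured values from your pilots.
baseline = {"minutes_per_case": 45.0, "cost_per_case": 31.50, "accuracy": 0.88}

pilots = {
    "vendor_a": {"minutes_per_case": 28.0, "cost_per_case": 24.00, "accuracy": 0.91},
    "vendor_b": {"minutes_per_case": 39.0, "cost_per_case": 26.50, "accuracy": 0.86},
}

def score(pilot: dict, base: dict) -> dict:
    """Express each pilot as improvement (or regression) relative to the baseline."""
    return {
        "time_saved_pct": 100 * (base["minutes_per_case"] - pilot["minutes_per_case"])
        / base["minutes_per_case"],
        "cost_saved_pct": 100 * (base["cost_per_case"] - pilot["cost_per_case"])
        / base["cost_per_case"],
        "accuracy_delta": pilot["accuracy"] - base["accuracy"],
    }

for name, pilot in pilots.items():
    print(name, {k: round(v, 2) for k, v in score(pilot, baseline).items()})
# In this illustration, vendor_a improves on every metric; vendor_b saves time
# but loses accuracy, the kind of trade-off the exit criteria should settle
# before the phase-two negotiation.
```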
Build capacity so you can buy smart
The throughline from OCP's research is clear: strong vision and alignment, a named AI lead, and disciplined buying practices make the biggest difference. Short, measurable contracts beat big bets.
If you're upskilling your team, curated training can help accelerate adoption and reduce risk. See role-based learning paths here: Complete AI Training: Courses by Job.