OpenAI, Google, and Perplexity near FedRAMP green light: what federal buyers should expect
Three AI companies, OpenAI, Google, and Perplexity, are reportedly on the verge of receiving approval to sell their models to the U.S. government on their own cloud systems. The authorization is expected to come in at the "low impact" level and as a pilot: limited in scope, but meaningful for procurement and control.
For agencies, this signals a shift: direct access to model providers, without routing hosting through established intermediaries. It won't replace agency ATOs or purchasing decisions, but it does expand options and reduce handoffs.
What's actually changing
- Direct hosting by the model providers: Instead of relying on partners like Microsoft, AWS, or Palantir, these vendors would operate on their own clouds for government users, within a low-impact boundary.
- Pilot-level scope: Expect constraints around data classification, user counts, and integrations. Think controlled trials and limited production, not mission-critical workloads on day one.
- Separate from your ATO: A FedRAMP authorization does not equal agency ATO. You'll still need your own authorization for your use case and environment.
Why independence from intermediaries matters
AI vendors have leaned on contractors that already cleared security reviews to reach federal users fast. That convenience came with trade-offs: product timelines, feature availability, and acceptable-use choices were influenced by partners hosting the stack.
Recent tensions between Anthropic and the Pentagon highlight the risk. Media reports describe disagreements over use constraints (such as autonomous weapons and mass surveillance) and questions about how the model was deployed operationally through a partner's systems. Direct hosting reduces those cross-vendor frictions and gives agencies clearer lines of accountability.
FedRAMP 20x, in plain terms
OpenAI, Google, and Perplexity pursued an expedited review via the federal "FedRAMP 20x" initiative last year. The expected outcome: low-impact authorizations that let them engage agencies directly while they mature controls and documentation.
If you're new to FedRAMP, this is about standardized security assessment and continuous monitoring for cloud services. You still scope, assess risk, and issue your own ATO. See the program overview on FedRAMP.gov.
What this means for your acquisition and IT teams
- More vendor choice: You can pilot directly with model providers, compare latency, performance, data handling, and audit capabilities side-by-side.
- Clearer responsibility model: Fewer overlapping SLAs and security boundaries. One owner for the model, runtime, logging, and incident response, at least within the low-impact perimeter.
- Faster pilots, careful scaling: Pilots may spin up faster, but elevating to higher impact levels will still take time and evidence (control inheritance, boundary diagrams, pen tests, etc.).
Guardrails to keep front and center
- Use-case constraints: Validate acceptable-use policies upfront (e.g., restrictions related to targeting, surveillance, or fully autonomous actions). Align with your agency directives and mission rules of engagement.
- Human-in-the-loop: Build in review and approval points for sensitive workflows. Align controls with the DoD Responsible AI Tenets where applicable.
- Data classification: Low-impact pilots should exclude controlled unclassified information (CUI), PII at scale, and anything that could trigger higher baselines. Keep the data diet clean until the boundary proves itself.
- Logging and traceability: Require detailed audit logs (prompts, outputs, system events) with retention and export to your SIEM. Make sure the vendor's monitoring story holds up during incident response drills.
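To make "detailed audit logs" concrete, here is a minimal sketch of the kind of per-interaction record you might require and export to your SIEM as JSON lines. The field names and hashing choices are illustrative assumptions, not a vendor schema; adapt them to your agency's logging standard.

```python
import json
import hashlib
import datetime
from dataclasses import dataclass, asdict

# Hypothetical audit record for one model interaction.
# Field names are illustrative, not a vendor or FedRAMP schema.
@dataclass
class ModelAuditEvent:
    timestamp: str          # UTC, ISO 8601
    user_id: str            # agency identity, not the vendor's internal ID
    model_version: str      # pinned version string, if the vendor supports pinning
    prompt_sha256: str      # hash instead of raw text when prompts are sensitive
    output_sha256: str
    policy_decision: str    # e.g. "allowed", "filtered", "blocked"
    session_id: str

def audit_event(user_id: str, model_version: str, prompt: str,
                output: str, policy_decision: str, session_id: str) -> str:
    """Serialize one interaction as a JSON line for SIEM ingestion."""
    event = ModelAuditEvent(
        timestamp=datetime.datetime.now(datetime.timezone.utc).isoformat(),
        user_id=user_id,
        model_version=model_version,
        prompt_sha256=hashlib.sha256(prompt.encode()).hexdigest(),
        output_sha256=hashlib.sha256(output.encode()).hexdigest(),
        policy_decision=policy_decision,
        session_id=session_id,
    )
    return json.dumps(asdict(event))
```

Hashing prompts and outputs keeps the log useful for correlation and incident response without copying sensitive content into yet another system; retain raw text separately only where your records policy requires it.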
Key due diligence questions for pilots
- What specific FedRAMP boundary is authorized, and what's excluded? How do they segment training data from inference data?
- Are prompts/outputs used for training or fine-tuning by default? Can that be disabled and contractually locked?
- How are model updates tested and rolled out? Can you pin versions for continuity of operations? (See the pinning sketch after this list.)
- What red-teaming and safety evaluations are performed, and can you review evidence regularly?
- What is the plan and timeline for moderate/high impact authorization, and what agency sponsorship is needed?
- How is content filtering handled? Can you configure policy, egress controls, and token-level logging?
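On version pinning specifically, one pattern worth piloting is a client-side guard that fails closed when the served model version drifts from the one you evaluated. This is a sketch under the assumption that the vendor's API echoes back the exact model string it served; `call_model` and the response fields are placeholders, not a real SDK.

```python
# Minimal client-side guard for model version pinning, assuming the
# vendor's API reports the exact model/version string it served.
# `call_model` and the response keys are placeholders, not a real SDK.

PINNED_MODEL = "example-model-2025-06-01"  # hypothetical pinned version string

def call_with_pin(call_model, prompt: str) -> str:
    """Call the model and fail closed if the served version drifts."""
    response = call_model(model=PINNED_MODEL, prompt=prompt)
    served = response.get("model", "")
    if served != PINNED_MODEL:
        # Silent version drift invalidates prior evaluation evidence;
        # stop, log, and route to change management before continuing.
        raise RuntimeError(f"Expected {PINNED_MODEL}, vendor served {served}")
    return response["output"]
```

Failing closed turns a silent model update into a visible event your change-management process can review, which is the evidence trail you'll want when scaling beyond the pilot.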
Procurement moves to make now
- Pre-build pilot packages: Intake form, risk questionnaire, data inventory, and a standard pilot evaluation rubric (usability, accuracy, safety, performance, cost); a sample scoring helper follows this list.
- Update contract language: Non-training clauses, IP/derivative work rights, data deletion timelines, model versioning commitments, and breach notification SLAs.
- Stand up a sandbox: A low-impact enclave with strict egress controls and role-based access. Keep it repeatable for multiple vendor comparisons.
- Decide on human review points: Define which outputs require second-party validation before operational use.
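For the evaluation rubric, a simple weighted-score helper keeps vendor comparisons consistent and auditable. The dimensions mirror the rubric in the first bullet above; the weights and the 1-5 scale are placeholder assumptions your evaluation team would set and document.

```python
# Illustrative scoring helper for a standard pilot rubric; the weights
# and the 1-5 scale are placeholders, not a recommended allocation.

RUBRIC_WEIGHTS = {
    "usability": 0.15,
    "accuracy": 0.30,
    "safety": 0.25,
    "performance": 0.15,
    "cost": 0.15,
}

def weighted_score(scores: dict[str, float]) -> float:
    """Combine 1-5 dimension scores into one weighted number per vendor."""
    missing = set(RUBRIC_WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"Missing rubric dimensions: {sorted(missing)}")
    return sum(RUBRIC_WEIGHTS[k] * scores[k] for k in RUBRIC_WEIGHTS)

# Example: score two vendors against the same rubric.
vendor_a = {"usability": 4, "accuracy": 4, "safety": 5, "performance": 3, "cost": 3}
vendor_b = {"usability": 3, "accuracy": 5, "safety": 4, "performance": 4, "cost": 2}
print(weighted_score(vendor_a), weighted_score(vendor_b))
```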
Where Anthropic fits
Anthropic has leaned more on partners like Palantir to reach government customers and, according to public reporting, did not participate in FedRAMP 20x. The company has said it wants the option to provide services directly to governments in the future.
Practically, this means your short list may split between vendors offering direct hosting now and those still working through partners. Evaluate both tracks with the same standards and a consistent pilot framework.
Bottom line for agencies
Assuming approvals land as expected, you'll soon have direct paths to pilots with OpenAI, Google, and Perplexity under low-impact constraints. Move quickly, but keep your controls tight: small, well-instrumented experiments that generate evidence for scaling, or for stopping.
The stack you choose in pilots often becomes the stack you scale. Make the comparison fair, transparent, and auditable from the start.