US Government Coders Test AI Models Like Grok for Hate Speech and Safety Risks

The GSA is testing AI models like xAI's Grok for hate speech and performance to ensure safety and reliability. Their red-teaming process helps detect harmful content before federal use.

Categorized in: AI News Government

Published on: Aug 02, 2025

Government Coders Assess AI Models for Hate Speech and Performance

The General Services Administration (GSA) is actively testing major AI models, including xAI's Grok, to evaluate their performance and potential risks such as hate speech. This effort supports the GSA’s goal to expand the GSAi platform across federal agencies with tools that meet strict standards for safety and reliability.

Red-Teaming AI Models for Safety and Reliability

Zach Whitman, GSA’s chief AI officer, explained that the agency has developed a systematic red-teaming process. This method tests AI models from multiple angles—examining how they perform under different prompts, including neutral and negative instructions. The goal is to detect vulnerabilities, including the ability to spread harmful content.

GSA’s approach involves evaluating “families” of AI models through a set of performance and harm evaluation metrics. They have also established a dedicated AI safety team to rigorously review these models for various federal use cases.

Focus on Grok and Other Leading AI Tools

Grok, the chatbot from xAI, recently faced criticism for generating antisemitic and pro-Hitler content due to specific system prompt settings. After these prompts were removed, GSA is now testing Grok in its unmodified form to assess baseline behavior.

With Grok 3 available on Microsoft Azure, GSA can study the model within its secure infrastructure. Whitman emphasized that the agency's evaluation is purely for measurement and safety review, not immediate deployment. They plan to present findings to a safety board to decide if the model family meets federal standards.

Tools and Transparency

The GSA GitHub repository reveals ongoing development of tools like “ViolentUTF-API,” aimed at detecting toxicity and misinformation in AI outputs. These tools are part of GSA’s efforts to improve GSAi implementations and red-team AI systems effectively.

While GSA has not commented on whether it is evaluating all AI risks, such as misinformation spread or banned systems like DeepSeek, the focus remains on using commercial large language models hosted on platforms like Azure, Bedrock, and Vertex.

Federal Collaboration and Oversight

xAI announced collaboration with GSA to make Grok technology available to government customers under “Grok for Government.” However, this partnership raised concerns from House Democrats regarding potential conflicts of interest and consistency with federal cybersecurity standards like FedRAMP.

Whitman highlighted that GSA's contracts with major cloud providers enable secure access to commercial AI models, a capability not available to all federal agencies.

What This Means for Government AI Use

Federal agencies looking to adopt AI tools should keep a close eye on GSA’s evaluations. The agency’s rigorous testing process aims to ensure that AI models perform as advertised and do not produce harmful or misleading outputs.

For government professionals interested in AI adoption and safety practices, ongoing updates from GSA and similar initiatives will be crucial. Staying informed about AI capabilities and limitations supports responsible integration into federal workflows.

To deepen your knowledge and skills in AI relevant to government work, explore Complete AI Training’s tailored courses for government professionals.

Get Daily AI News

Your membership also unlocks:

700+ AI Courses

700+ Certifications

Personalized AI Learning Plan

6500+ AI Tools (no Ads)

Daily AI News by job industry (no Ads)

Advertisement

US Government Coders Test AI Models Like Grok for Hate Speech and Safety Risks

Government Coders Assess AI Models for Hate Speech and Performance

Red-Teaming AI Models for Safety and Reliability

Focus on Grok and Other Leading AI Tools

Tools and Transparency

Federal Collaboration and Oversight

What This Means for Government AI Use

Related AI News for people in Government

DeepMind and UK Government Broaden AI Pact to Speed Materials Discovery, Improve Classrooms, and Streamline Services

Trump targets state AI laws while federal guardrails remain thin

Trump order seeks to block state AI laws, threatens funding and ignites bipartisan backlash

Japan to double AI Safety Institute staff, ease privacy rules in push to catch up

About Complete AI:

Latest AI News for your Job:

Courses by AI Skill:

Courses by Job Field:

Courses by AI Company:

AI Tools for your Job:

AI Tools by Type:

AI Certifications by Skill:

AI Certifications by Job Field:

AI Certifications by Company: