OpenAI launches ChatGPT for Clinicians as its GPT-5.4 model outscores doctors on its own benchmark

OpenAI launched ChatGPT for Clinicians Friday, targeting medical documentation and research tasks. Its benchmark shows the AI outscoring doctors 59.0 to 43.7, though OpenAI designed that benchmark itself.

Categorized in: AI News Healthcare
Published on: Apr 25, 2026

OpenAI Launches ChatGPT for Clinicians, Claims Performance Edge Over Doctors

OpenAI introduced ChatGPT for Clinicians on Friday, a specialized AI tool designed for doctors, nurse practitioners, medical assistants, and pharmacists. The company claims its GPT-5.4 model outperforms human physicians on certain clinical tasks, scoring 59.0 on OpenAI's internal HealthBench Professional benchmark versus 43.7 for doctors, even when those doctors had unlimited time and internet access.

The tool targets the administrative burden that consumes physician time. It automates medical documentation, synthesizes research from peer-reviewed sources, and handles routine tasks like authorization letters. OpenAI said it worked with hundreds of doctors to develop the system and analyzed more than 700,000 model responses during testing.

What the validation shows

During testing, physicians judged that 99.6% of responses were safe and reliable across nearly 7,000 conversations. The AI integrates clinical research from millions of peer-reviewed sources and offers modes for in-depth analysis of scientific literature.

OpenAI said conversations will not be used to train its models. The company also offers devices that comply with HIPAA requirements to protect patient data.

The adoption trend

AI tool usage among doctors has accelerated. OpenAI reports that 72% of doctors now use artificial intelligence tools, up from 48% a year earlier. ChatGPT usage in healthcare has more than doubled over the same period.

The company frames the tool as a solution to structural problems in healthcare: administrative overload and staffing shortages.

A critical detail

OpenAI designed the HealthBench Professional benchmark used to measure the AI's performance, which raises questions about the objectivity of the results. The company acknowledged this in its announcement.

OpenAI stated the tool is "designed to accompany clinical tasks" and not to replace medical judgment. This distinction matters as AI moves into fields where human responsibility remains essential.

The entry of ChatGPT into healthcare raises longer-term questions about how such systems will be independently validated, how trust will be established, and how regulation will evolve as AI in healthcare becomes embedded in clinical practice.

