Philippines pushes PhilHealth digital overhaul as study finds half of AI health responses are wrong

The Philippines is building a national digital health system under a new executive order, while a BMJ study found nearly half of AI chatbot responses to medical questions were inaccurate or misleading.

Published on: May 19, 2026

Philippines builds digital health system as AI chatbot risks mount

The Philippine government is overhauling its national healthcare infrastructure with digital tools, even as new research shows that free AI chatbots give incorrect or incomplete medical advice nearly half the time.

President Ferdinand "Bongbong" Marcos Jr. signed an executive order in March creating an inter-agency body to oversee the Philippine Health Insurance Corporation's (PhilHealth) digital transformation. The order, made public on May 12, establishes a project management group (PMG) to design and deploy a "comprehensive, integrated, interoperable, progressive, secure, and sustainable Philippine Digital Health System."

Who leads the effort

PhilHealth will chair the PMG, with the Department of Information and Communications Technology serving as co-chair. The Department of Health and Department of Budget and Management also sit on the group.

The PMG will create a multi-year roadmap with clear targets and timelines. It will oversee the centralized design, development, integration, and deployment of PhilHealth's digital systems while ensuring compliance with national standards on cybersecurity, data privacy, and interoperability.

Funding comes from existing appropriations of member agencies. The order takes effect immediately upon publication in the Official Gazette.

The AI accuracy problem

A peer-reviewed audit published in BMJ Open on April 14 tested five free AI chatbots (Gemini, DeepSeek, Meta AI, ChatGPT, and Grok) on 250 health questions about cancer, vaccines, stem cells, nutrition, and athletic performance.

Researchers from UCLA, the University of Alberta, and Wake Forest University found that 49.6% of responses were problematic. That total breaks down into 30% of all responses rated "somewhat problematic" and 19.6% rated "highly problematic."
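To see how the figures fit together, here is a minimal arithmetic sketch. It assumes the percentages are shares of 250 total responses, which the article implies but does not state outright; the variable names are illustrative only.

```python
# Assumption: the reported percentages are shares of 250 total responses.
TOTAL_RESPONSES = 250

# Shares as reported in the article.
shares = {"somewhat problematic": 0.30, "highly problematic": 0.196}

# Convert each share into a whole number of responses.
counts = {label: round(TOTAL_RESPONSES * share) for label, share in shares.items()}

# The two categories together should reproduce the headline figure.
problematic_total = sum(counts.values())
problematic_share = problematic_total / TOTAL_RESPONSES

print(counts)
print(problematic_total, f"{problematic_share:.1%}")
```

Under that assumption, the two categories come to 75 and 49 responses, or 124 of 250, which matches the 49.6% headline figure.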

ChatGPT performed best, delivering largely accurate medical information. Grok produced the worst results, with 30% of responses tagged as "highly problematic" and 58% rated as problematic overall. Researchers attributed Grok's poor performance to its training data from X (formerly Twitter), a platform known for rapid misinformation spread.

Why chatbots fail at health advice

AI chatbots don't reason or consult medical experts. They generate responses by matching statistical patterns from training data and predicting likely word sequences, not by weighing evidence or making clinical judgments.
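A toy sketch can make "predicting likely word sequences" concrete. The example below is a deliberately tiny word-pair counter, not how any of the tested chatbots is actually built (they use large neural networks), and the three-sentence corpus is invented purely for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus, invented for illustration only.
corpus = (
    "raw milk is healthy . "
    "raw milk is healthy . "
    "raw milk is risky ."
).split()

# Count how often each word follows each other word.
follow_counts = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follow_counts[current_word][next_word] += 1


def predict_next(word):
    """Return the most frequent follower of `word` in the corpus, or None."""
    followers = follow_counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


print(predict_next("is"))
```

The predictor picks whichever continuation appeared most often in its data. Nothing in it checks whether that continuation is true, which is the limitation the paragraph above describes.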

Researchers used an adversarial approach, intentionally framing questions to elicit bad advice. They asked whether 5G causes cancer, which alternative therapies beat chemotherapy, and how much raw milk improves health. Of 250 questions, only two were refused, both by Meta AI, on anabolic steroids and alternative cancer treatments. No other chatbot refused a question.

"This behavioural limitation means that chatbots can reproduce authoritative-sounding but potentially flawed responses," the researchers said.

What's next for AI in healthcare

The researchers called for public education, professional training, and regulatory oversight to ensure AI supports rather than undermines public health.

Healthcare professionals implementing digital systems should understand these limitations. As the Philippines builds its digital infrastructure, the gap between AI capability and clinical safety remains a core concern for any organization deploying these tools in patient-facing roles.

