ChatGPT vs Google Gemini in Peripheral Artery Disease Education: Which AI Delivers More Accurate and Readable Patient Information?
ChatGPT provided more accurate and detailed answers about peripheral artery disease than Google Gemini. Both AI tools produced content with higher reading levels than recommended for patient education.

Artificial Intelligence in Peripheral Artery Disease Education: A Battle Between ChatGPT and Google Gemini
Abstract
Background
Peripheral artery disease (PAD) is a common but often overlooked form of atherosclerosis that significantly increases the risk of cardiovascular illness and death. As AI tools become increasingly common sources of medical information, evaluating their accuracy and readability is critical, especially for cardiovascular conditions like PAD.
Objective
This study compares the accuracy, completeness, and readability of responses generated by OpenAI’s ChatGPT and Google’s Gemini when answering typical patient questions about PAD. Their answers were benchmarked against the Cleveland Clinic’s patient-facing FAQs to assess their value as patient education resources.
Methods
ChatGPT 4.0 and Gemini 1.0 were prompted in three ways: without context (Form 1), with a patient-level prompt (Form 2), and with a physician-level prompt (Form 3). Both answered 19 PAD-related questions sourced from Cleveland Clinic FAQs. Responses were marked correct, partially correct, or incorrect. Readability was analyzed using the Flesch-Kincaid (FK) grade level, and response lengths were compared by word count. Statistical tests determined significance at p < 0.05.
Results
ChatGPT achieved 70% correct and 30% partially correct responses, with no incorrect answers. Gemini scored 52% correct, 45% partially correct, and 3% incorrect. ChatGPT demonstrated significantly higher accuracy (p < 0.05). Both AI tools produced responses with similar readability levels—mean FK grade around 10.8—higher than recommended for patient materials. ChatGPT’s answers were notably longer (p < 0.0001).
Conclusion
Both ChatGPT and Gemini can provide mostly accurate and detailed answers about PAD, suggesting potential as supplemental education tools when supervised by healthcare providers. However, their reading levels exceed recommended standards, highlighting the need for improvements in AI communication. Future research should focus on making AI-generated medical content more accessible and on assessing its effects on patient understanding and outcomes.
Introduction
Peripheral artery disease, in which narrowed arteries reduce blood flow to the limbs, affects over 230 million people globally. In the US alone, about 10 million adults over 40 live with PAD. Traditional risk factors include smoking, older age, and diabetes.
Despite its prevalence, PAD often receives less attention than coronary artery disease or stroke, partly because its symptoms can be vague and public awareness is low. Improving education and access to clear information is essential given PAD’s potential to cause serious complications.
With over 80% of US adults using the internet for health information, AI-powered chatbots have become popular tools for quick answers. This study evaluates whether AI responses from ChatGPT and Gemini align with trusted medical sources like Cleveland Clinic, which provides evidence-based patient education.
Materials & Methods
The two AI chatbots were tested using 19 commonly asked questions about PAD from Cleveland Clinic’s FAQs. Each question was asked three times with different prompts (illustrative phrasings of each form follow the list):
- Form 1: No prompt (neutral query)
- Form 2: Patient-level prompt (simplified for general understanding)
- Form 3: Physician-level prompt (technical, professional context)
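A minimal Python sketch of how the three prompt forms might be phrased is shown below. The study does not publish its exact wording, so these templates are illustrative assumptions only.

```python
# Illustrative prompt templates for the three query forms. The exact
# phrasing used in the study is not published; these are assumed examples.
QUESTION = "What is peripheral artery disease?"

prompts = {
    "Form 1 (no prompt)": QUESTION,
    "Form 2 (patient-level)": (
        "Explain in plain language for a patient with no medical "
        "background: " + QUESTION
    ),
    "Form 3 (physician-level)": (
        "Answer as you would for a vascular specialist, using "
        "professional terminology: " + QUESTION
    ),
}

for form, prompt in prompts.items():
    print(f"{form}: {prompt}")
```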
Responses were reviewed by an independent evaluator and classified as correct, partially correct, or incorrect based on keyword and concept alignment with the source material.
Readability was measured with the Flesch-Kincaid grade level, estimating the US school grade needed to comprehend the text. Word, sentence, and syllable counts were collected to calculate this score.
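The Flesch-Kincaid grade level is a simple function of average sentence length and average syllables per word: FK grade = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59. As a rough illustration, and not the exact tool used in the study, the score can be approximated in Python with a vowel-group syllable heuristic:

```python
import re

def fk_grade(text: str) -> float:
    """Approximate Flesch-Kincaid grade level of a block of text.

    Applies the standard formula:
        0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    Syllables are estimated by counting vowel groups, a common heuristic.
    """
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w, re.I))) for w in words)
    return 0.39 * (n_words / sentences) + 11.8 * (syllables / n_words) - 15.59

sample = ("Peripheral artery disease narrows the arteries and reduces "
          "blood flow to the legs.")
print(round(fk_grade(sample), 1))
```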
Statistical analysis used chi-square tests for accuracy comparisons and ANOVA for readability and length, with significance set at p < 0.05.
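To illustrate the procedure (not to reproduce the study's analysis), the sketch below runs a chi-square test on a hypothetical accuracy contingency table and a one-way ANOVA on hypothetical FK grade scores; all numbers are placeholders, not the study's data.

```python
from scipy.stats import chi2_contingency, f_oneway

# Hypothetical counts of (correct, partially correct, incorrect) responses
# for each chatbot; the study's raw tallies are not reproduced here.
chatgpt_counts = [40, 17, 0]
gemini_counts = [30, 25, 2]
chi2, p_accuracy, dof, _ = chi2_contingency([chatgpt_counts, gemini_counts])
print(f"Chi-square accuracy comparison: p = {p_accuracy:.3f}")

# Hypothetical per-response FK grade levels for a one-way ANOVA.
chatgpt_fk = [10.9, 11.2, 10.4, 10.8, 10.7]
gemini_fk = [10.6, 10.9, 10.5, 10.8, 10.9]
f_stat, p_readability = f_oneway(chatgpt_fk, gemini_fk)
print(f"ANOVA readability comparison: p = {p_readability:.3f}")
```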
Results
ChatGPT answered 70% of questions correctly and 30% partially correctly, with no incorrect responses. Gemini answered 52% correctly, 45% partially correctly, and had 3% incorrect answers. Accuracy differences favored ChatGPT significantly (p < 0.05).
Both chatbots had similar readability levels: ChatGPT averaged an FK grade of 10.81, while Gemini averaged 10.73. These scores exceed the 6th- to 8th-grade reading level recommended for patient education materials.
ChatGPT’s responses were significantly longer than Gemini’s (p < 0.0001), potentially reflecting more detailed explanations.
Discussion
This evaluation reveals that both AI platforms can deliver mostly accurate information on PAD, which could support patient education if paired with healthcare professional guidance. However, the elevated reading level is a concern, as many patients may find the content difficult to understand.
Improving the clarity and simplicity of AI-generated health content should be a priority. Achieving accessible language without sacrificing accuracy will broaden the utility of AI as a patient education tool.
With AI’s growing role in healthcare communication, ongoing assessment of its output quality and impact on patient outcomes remains essential.