Generative AI Drafts for Patient Messages in EHR InBaskets: Low Utilization, Role-Based Preferences, and Modest Efficiency Gains
AI-drafted replies sped responses by roughly 7%, but a "draft everything" policy bloats review time. Value comes from concise, role-aware prompts and from targeting complex messages, with fewer handoffs.

Generative AI in Patient Messaging: What Leaders in Communications Can Use Today
A large New York City health system ran an 11-month pilot of AI-drafted replies inside the EHR inbox. The results read like a playbook for any organization managing high-volume, high-stakes messaging: healthcare, customer support, PR, or member services.
The bottom line: AI can speed response times, but "always-on" drafts create review burden. Adoption rises when drafts are concise, informative, and aligned to the responder's role. Strategy, prompts, and workflow design determine value, not the model alone.
Key numbers you can act on
- Overall utilization: 19.4% of eligible messages used the AI draft.
- Prompt upgrades mattered: usage rose from ~12% to ~20% after a super-prompt update.
- Waste risk: drafts were generated for every message, even though roughly 80% needed no reply.
- Speed: when used, drafts cut median turnaround time by 6.76%.
- Workload: a slight 1.95% increase in actions per message, but fewer responsibility handoffs.
- Preference signals: responders chose fewer words, higher information density, and clearer readability; physicians wanted concise, complete drafts, while support staff leaned empathetic.
What this means for Communications and PR leaders
AI assistance should target the right message types at the right moment. "Draft everything" bloats the inbox and creates hidden review time. Your goal is fewer, better drafts, delivered where they reduce cognitive friction and accelerate decisions.
Adoption levers that work
- Iterate prompts, not just policies: usage moved meaningfully with better instructions.
- Match draft style to role: brevity and completeness for expert reviewers; warmer, more supportive tone for support staff.
- Prioritize complex inquiries: utilization rose when incoming messages were harder to read. That's where AI actually helps.
- Reduce handoffs: drafts correlated with fewer responsibility shifts, which is good for cycle time and ownership.
Draft design guidelines that increase use
- Keep it short, but information-dense: fewer words, higher entropy (less fluff, more substance).
- Improve readability: higher Flesch Reading Ease scores increased utilization (a scoring sketch follows this list).
- Be relevant, not repetitive: high semantic similarity to the inquiry helps; verbatim copy-paste overlap hurts.
- Tune empathy by role: clinical support favored more affective language; physicians did not.
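These checks are straightforward to automate before a draft ever reaches the inbox. The minimal Python sketch below uses the standard Flesch Reading Ease formula plus word-level Shannon entropy as a density proxy; the syllable heuristic and every threshold are illustrative assumptions, not values from the study.

```python
import math
import re
from collections import Counter

def count_syllables(word: str) -> int:
    # Crude heuristic: count groups of vowels. Good enough for scoring, not linguistics.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    # Standard Flesch formula: 206.835 - 1.015*(words/sentence) - 84.6*(syllables/word).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))

def word_entropy(text: str) -> float:
    # Shannon entropy of the word distribution: a rough proxy for information density.
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    total = len(words)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def passes_guidelines(draft: str, max_words: int = 120,
                      min_flesch: float = 60.0, min_entropy: float = 4.0) -> bool:
    # Thresholds are illustrative placeholders; tune them against your utilization data.
    n_words = len(re.findall(r"[A-Za-z']+", draft))
    return (n_words <= max_words
            and flesch_reading_ease(draft) >= min_flesch
            and word_entropy(draft) >= min_entropy)
```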
Workflow rules to prevent AI clutter
- Don't auto-draft for every message. Trigger drafts only for messages likely to need a response (e.g., classified intents, message complexity thresholds, or SLA risks); a trigger-policy sketch follows this list.
- Gate drafts behind a "Generate" button for low-value categories (e.g., thank-you notes).
- Embed drafts where work happens. Side-by-side with the original message beats a separate tool.
- Limit tokens and sentences. Offer expandable "more detail" on demand.
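Here's what that trigger logic can look like in practice. This sketch assumes an upstream intent classifier and a 0-to-1 complexity score; the intent labels and thresholds are hypothetical placeholders to adapt to your own taxonomy.

```python
# Hypothetical intent labels from an upstream classifier; rename to match your taxonomy.
AUTO_DRAFT_INTENTS = {"test_results", "medication_question", "paperwork", "scheduling"}
ON_DEMAND_INTENTS = {"thank_you", "acknowledgment"}  # gate behind a "Generate" button

def should_auto_draft(intent: str,
                      complexity: float,           # 0..1 from your complexity scorer
                      hours_to_sla_breach: float,
                      complexity_threshold: float = 0.6,
                      sla_risk_hours: float = 4.0) -> bool:
    """Return True only when a draft is likely to earn its review time."""
    if intent in ON_DEMAND_INTENTS:
        return False  # staff can still request a draft manually
    if intent in AUTO_DRAFT_INTENTS:
        return True
    # Unclassified messages: draft only when the message is hard or the SLA is at risk.
    return complexity >= complexity_threshold or hours_to_sla_breach <= sla_risk_hours

# Example: a hard-to-read, unclassified message close to its SLA gets a draft.
assert should_auto_draft("other", complexity=0.7, hours_to_sla_breach=2.0)
```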
Metrics that prove value (and stop "AI theater")
- Utilization rate by segment: overall, and by role (physician vs. support).
- Turnaround time and first-response time: compare AI vs. non-AI replies.
- Action count and handoffs per message: seek fewer shifts of responsibility.
- Message complexity vs. AI usage: confirm you're assisting where it's hard.
- "No-reply" draft generation rate: drive this to near-zero.
Implementation playbook
1) Start small and specific
- Pick 3-5 message intents with clear policies (test results, paperwork, scheduling).
- Define guardrails: what the draft must include, what it must avoid, and when to escalate.
2) Ship prompts that mirror your process
- Codify role expectations: who can say what, in what tone, with which disclaimers.
- Add "If uncertain, ask" logic to reduce confident wrong answers.
3) Train teams on editing, not composing
- Teach a fast three-pass review: facts, tone, compliance.
- Give example libraries: approved phrases for empathy, boundaries, and de-escalation.
4) Govern with audit-friendly data
- Log utilization, edits, escalations, and outcomes. Use this data to prune prompts and message categories (a minimal logging sketch follows this step).
- Adopt a risk framework so leaders can tune scope with confidence; the NIST AI Risk Management Framework is a useful starting point.
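An append-only event log is enough to start governing with data. The sketch below writes JSON Lines to a local file; the event names and fields are illustrative assumptions, and a production system would ship these records to your data warehouse instead.

```python
import json
import time
import uuid

def log_draft_event(event: str, message_id: str, role: str, **fields):
    # Append-only JSONL audit trail. Event names and fields are illustrative assumptions.
    record = {
        "event_id": str(uuid.uuid4()),
        "ts": time.time(),
        "event": event,  # e.g., "draft_shown", "draft_used", "draft_edited", "escalated"
        "message_id": message_id,
        "role": role,
        **fields,
    }
    with open("draft_audit.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# Example: a physician used a draft after editing roughly 40% of its tokens.
log_draft_event("draft_used", message_id="msg-123", role="physician", edit_ratio=0.4)
```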
Risks to manage
- Review burden: unnecessary drafts steal attention. Treat attention as a budget.
- Tone misfires: support roles may need more warmth; experts may want fewer adjectives.
- Liability: require human sign-off; embed escalation cues in prompts.
- Over-reliance: encourage evidence checks for clinical or sensitive claims. See inbox burden context in JAMA.
Healthcare insight you can apply beyond healthcare
Any team handling sensitive, high-volume inbound messages faces the same trade-offs: speed, accuracy, empathy, and workload. This study shows that value isn't automatic; you earn it by choosing where drafts appear, how they're written, and who they're written for.
Quick checklist
- Trigger drafts only where they help.
- Optimize for brevity + substance.
- Tune tone by role.
- Measure handoffs and cycle time, not just clicks.
- Iterate prompts monthly; retire low-yield categories.
Further learning
If you're building AI-assisted workflows for communications teams and want structured upskilling on prompts, workflows, and governance, explore the practical courses at Complete AI Training, including catalogs organized by job role.