Most Malaysian AI customer service chatbots fail basic language comprehension, study finds

Two-thirds of Malaysian chatbots fail basic comprehension tests, forcing customers to re-explain problems every time they escalate to a human agent. Financial services scored worst at 38.3% average.

Categorized in: AI News, Customer Support
Published on: Jun 11, 2026

Two-thirds of chatbots serving Malaysian customers cannot understand how people actually talk. That's the finding from Entermind's Enterprise Chatbot Quality Index, which tested 24 chatbots across e-commerce, travel, telecom, and financial services.

The research exposed a pattern: users escalate to human agents repeatedly, only to re-explain their problems from scratch. One e-commerce user reported that 100% of their chatbot interactions required escalation.

What the Test Measured

Entermind's evaluators assessed each chatbot against 26 standardized tests across five categories: Comprehension, Access, Experience, Functional Capability, and Safeguards.

Comprehension emerged as the sharpest differentiator. Of the chatbots that passed at least four of the seven Comprehension tests, 78% scored above 60% overall. The inverse held too: all six triage bots passed the Access tests but failed Comprehension.

Who's Performing

Touch 'n Go topped the e-commerce and e-wallet category at 82.3%, passing all seven Comprehension tests. The platform handles slang, topic changes, multilingual input, and negation, the kinds of conversational shifts that trip up most chatbots.

Boost scored 74% by taking a different approach: it passed all seven Safeguards tests, including manipulation resistance and error recovery.

Shopee reached 64.5%, while Lazada scored 60.7%. Both had stronger Comprehension scores than their overall results suggest, but Lazada's Safeguards performance was weak at 2/7, with no fallback loop or error handling.

Financial Services performed worst, with an average of 38.3%. Ryt Bank was the only genuine exception, scoring 7/7 on Safeguards and processing 80,000+ transactions monthly through natural language with a hallucination rate below 1.5%.

Maybank and UOB, despite sophisticated digital apps, operated as triage layers with minimal comprehension capabilities. Maybank offered only preset FAQ menus with no free-text input.

The Localization Gap

Only eight of the 24 chatbots, one in three, passed both the language and slang tests: Maxis, CelcomDigi, Ryt Bank, AirAsia, Batik Air, Boost, Touch 'n Go, and Shopee.

This matters because chatbots that cannot recognize colloquial language force users into rigid menu pathways or repetitive phrasing. The friction compounds when users must rephrase questions in formal English.

The Missing Feature

None of the 24 chatbots demonstrated cross-session memory. Each conversation starts fresh, treating returning users as strangers.
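To make the idea concrete, here is a minimal sketch of what cross-session memory could look like. It is illustrative only: the study does not describe any implementation, and the names `SessionMemory`, `remember`, `recall`, and `open_session` are hypothetical. It also assumes a stable user ID and uses an in-memory dictionary where a real deployment would need a persistent store.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SessionMemory:
    """Hypothetical per-user store that outlives a single chat session.

    Assumption: notes are kept in a plain dict for illustration; a real
    system would persist them in a database keyed by a stable user ID.
    """
    history: Dict[str, List[str]] = field(default_factory=dict)

    def remember(self, user_id: str, note: str) -> None:
        # Append a short summary of what the user asked about.
        self.history.setdefault(user_id, []).append(note)

    def recall(self, user_id: str) -> List[str]:
        # Return a copy so callers cannot mutate the stored notes.
        return list(self.history.get(user_id, []))


def open_session(memory: SessionMemory, user_id: str) -> str:
    """Greet a user, referring to their most recent issue if one is stored."""
    notes = memory.recall(user_id)
    if not notes:
        return "Hi! How can I help you today?"
    return f"Welcome back. Last time we discussed: {notes[-1]}. Is this about the same issue?"
```

A returning user would then be greeted with their last recorded issue instead of a blank slate, which is the behavior the study found missing.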

Top performers focused on three areas: language understanding, conversation handling, and safety. The whitepaper identifies three priorities for improvement: investing in language understanding, fixing fallback and error handling, and building cross-session memory.
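The fallback and error-handling priority can be sketched in a few lines. This is an assumption-laden illustration, not the whitepaper's method: `handle_turn`, `Conversation`, the `understand` callback, and the `MAX_MISSES` threshold of two are all hypothetical. The point it shows is that a bot can count consecutive misunderstandings and hand off to a human with the transcript attached, so the customer does not have to re-explain the problem.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

MAX_MISSES = 2  # assumed threshold: escalate after two consecutive misses


@dataclass
class Conversation:
    transcript: List[str] = field(default_factory=list)
    misses: int = 0  # consecutive turns the bot failed to understand


def handle_turn(
    convo: Conversation,
    message: str,
    understand: Callable[[str], Optional[str]],
) -> str:
    """Process one user turn with a simple fallback loop.

    `understand` is a stand-in for whatever language model or intent
    classifier the bot uses; it returns an intent name or None.
    """
    convo.transcript.append(f"user: {message}")
    intent = understand(message)
    if intent is not None:
        convo.misses = 0  # a successful turn resets the counter
        reply = f"Handling request: {intent}"
    elif convo.misses + 1 >= MAX_MISSES:
        convo.misses += 1
        # Escalate and carry the transcript so the agent has full context.
        reply = "Connecting you to an agent with this conversation attached."
    else:
        convo.misses += 1
        reply = "Sorry, I did not catch that. Could you say it another way?"
    convo.transcript.append(f"bot: {reply}")
    return reply
```

The design choice worth noting is that the transcript travels with the escalation; the threshold and wording are arbitrary and would need tuning against real conversations.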

For customer support teams, the implication is clear. A chatbot's value depends less on its ability to answer questions and more on its ability to understand what customers actually ask.
