Maharashtra, IIT Bombay test AI for language-based identity screening
Published: Jan 24, 2026, 11:06 AM IST
The Maharashtra government is working with IIT Bombay on a language-based verification tool to help law enforcement with preliminary screening of suspected illegal Bangladeshi nationals and Rohingyas. The project, led by the state IT department with a budget of ₹3 crore, analyzes speech patterns, tone, and word choices as an early signal before any document-led investigation.
Chief minister Devendra Fadnavis said at the Mahayuti manifesto launch on January 11: "We'll free Mumbai from Bangladeshis. We've deported the highest so far. With AI, we'll identify and deport 100% Bangladeshis." He also said a detention centre has been set up to hold such persons before deportation.
What the tool does (and doesn't)
The system aims to flag linguistic markers that may distinguish Bangladeshi nationals from Bengali-speaking residents of West Bengal. Officials describe it as a screening aid, not a conclusive test of nationality. According to the state, current reliability is about 60% in experiments conducted over the last three months, with expectations of major improvement over the next six months.
Any flag from the tool would be followed by standard police procedures and document checks. Enforcement decisions must rest on documentary verification and due process, not an accent profile.
Linguistic overlap: high risk of false positives
Bangla spoken in West Bengal and Bangladesh sits on a shared dialect continuum, with overlapping accents, vocabulary, and pronunciation. This is especially pronounced in border districts such as North 24 Parganas, Nadia, Murshidabad, and Malda, where speech patterns naturally blend.
That overlap raises a practical risk: citizens of West Bengal may "sound Bangladeshi" to non-specialists. Agencies should treat any AI output as a lead, not a label. For context on language variation, see Bengali language overview.
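The false-positive risk can be illustrated with basic base-rate arithmetic. The sketch below is not drawn from the project: it assumes, purely for illustration, that "60% reliability" means the tool is right 60% of the time for both groups (sensitivity and specificity of 0.60), and that 5% of people screened are actually Bangladeshi nationals. Neither figure comes from officials, and the function name is hypothetical.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Share of flagged people who truly belong to the target group (Bayes' rule)."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    if true_positives + false_positives == 0.0:
        raise ValueError("no one would be flagged with these inputs")
    return true_positives / (true_positives + false_positives)


# ASSUMPTIONS (illustrative only, not official figures):
# sensitivity = specificity = 0.60, prevalence among those screened = 0.05.
ppv = positive_predictive_value(sensitivity=0.60, specificity=0.60, prevalence=0.05)
print(f"Share of flags that would be correct: {ppv:.1%}")  # prints 7.3%
```

Under these assumed inputs, more than nine in ten flags would point at the wrong person, which is why a flag can only ever be a prompt for document checks. Different assumptions give different numbers; the point is that the share of correct flags depends heavily on how rare the target group is among those screened.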
How agencies could use it responsibly
- Use as triage only: Treat AI flags as prompts for further checks, never as evidence of nationality.
- Pair with documents: Proceed to verification via established procedures and records under the Ministry of Home Affairs (MHA).
- Set clear thresholds: Define accuracy and confidence levels required before any escalation.
- Bias testing: Independently audit for disparate impact on West Bengal residents and other Bengali-speaking communities.
- Operator training: Train personnel on limitations, false positive handling, and respectful engagement.
- Documentation: Log all screenings, model versions, and decisions for accountability and review.
- Data safeguards: Protect voice data with strict retention limits, access controls, and legal basis for collection.
- Redress mechanism: Ensure a clear, fast process to contest and correct erroneous flags.
- Human-in-the-loop: Require senior review before any action beyond routine questioning.
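Several of the safeguards above (triage only, clear thresholds, logging, senior review) could be enforced in software rather than left to practice. The following is a minimal sketch under stated assumptions: the record fields, outcome labels, and the `triage` function are all hypothetical and do not describe the Maharashtra or IIT Bombay system. Note that no outcome expresses a conclusion about nationality; the strongest result is a referral to document verification.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ScreeningRecord:
    """Immutable audit-log entry for one screening (hypothetical schema)."""
    case_id: str
    model_version: str
    score: float
    threshold: float
    outcome: str
    reviewed_by: Optional[str]
    logged_at: str


def triage(case_id: str, score: float, model_version: str,
           threshold: float, reviewer: Optional[str] = None) -> ScreeningRecord:
    """Map a model score to a procedural next step, never to a nationality label."""
    if not 0.0 <= score <= 1.0 or not 0.0 <= threshold <= 1.0:
        raise ValueError("score and threshold must be between 0 and 1")
    if score < threshold:
        outcome = "no_action"
    elif reviewer is None:
        # Human-in-the-loop: a flag alone cannot trigger escalation.
        outcome = "pending_senior_review"
    else:
        outcome = "refer_to_document_verification"
    return ScreeningRecord(
        case_id=case_id,
        model_version=model_version,
        score=score,
        threshold=threshold,
        outcome=outcome,
        reviewed_by=reviewer,
        logged_at=datetime.now(timezone.utc).isoformat(),
    )


# Example with made-up values: a flag above threshold waits for a senior reviewer.
record = triage("case-001", score=0.91, model_version="v0-experimental", threshold=0.85)
print(record.outcome)  # prints pending_senior_review
```

Storing the model version and threshold with every record lets an auditor later reconstruct why a flag was raised and which decisions would change under a different threshold.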
Current status and next steps
Officials say the tool is still experimental with about 60% reliability. The chief minister stated it could be "foolproof" in six months, though such targets should be validated in independent trials before any wide deployment.
The government has set up a detention centre while reiterating its focus on deportation after verification. Authorities have also claimed that some infiltrators first enter West Bengal, obtain forged documents, and then move to other states. Addressing that pattern, if it is borne out, would require deeper inter-state coordination and document forensics rather than speech analysis alone.
Questions to settle before a pilot
- What are the training data sources, and do they represent the full range of West Bengal and Bangladesh dialects?
- What legal framework governs voice collection, storage, and cross-agency sharing?
- What is the minimum validated accuracy by district and dialect, and how will drift be monitored?
- What is the escalation protocol when documents contradict AI output?
- Who audits the model (internal and external), how often, and what is publicly reported?
- What are the penalties for misuse or action taken solely on AI output?
For government teams
If your department plans to evaluate or procure similar tools, consider a limited-scope pilot with strict guardrails, third-party audits, and public-facing SOPs. Build capacity across legal, linguistic, and technical teams before scale-up.