AI Chatbots Are Quietly Building the Next Data Walled Garden
AI chatbots have become routine tools for answering questions, drafting emails, and summarising documents. But as they embed deeper into daily work, they're collecting vast amounts of personal information, and most users don't realise how much.
Meta AI, Google Gemini, and ChatGPT routinely gather contact details, search history, browsing data, and user-generated content. Location tracking has jumped sharply: 70% of leading AI chatbot apps now track location, up from 40% a year ago, according to Surfshark's analysis of the 10 most popular AI chatbot apps.
Meta AI collects 33 of 35 possible data types and remains the only app collecting financial information. The scale matters because it signals how aggressively these platforms are extracting user data.
The Cognitive Data Moat
Social media walled gardens locked in behavioural data: likes, shares, clicks. AI creates something different: cognitive data. When you ask an AI chatbot about a business decision or health concern, that reasoning becomes proprietary information locked inside one company's system.
"That's a qualitatively different kind of moat," said Jacky Chan, CTO of Votee AI and Beever AI. "When you confide in an AI assistant about a business decision or a health concern, that insight becomes platform advantage."
For marketers in Asia-Pacific, this creates a specific problem. Brands have already navigated fragmented ecosystems (WeChat, LINE, Grab, Gojek), each with walled data. AI adds another layer. If customer data lives inside one vendor's system and can't be exported, brands lose negotiating power permanently.
Geographic differences matter. Asia has higher comfort with data collection because dominant platforms historically operated with fewer restrictions. Europe enforces stricter standards through GDPR. The practical effect: choosing an AI tool is also choosing whose rules govern customer data.
What Marketers Actually Need to Know
Most marketers don't fully understand how AI agents operate in their workflows. These tools are efficient at tasks existing marketing APIs can't handle-but that efficiency comes from processing private data in ways that remain opaque.
"AI agents generally have no built-in data privacy or security guardrails, and very few have ethical constraints on how they utilise data," said Gary Liu, Terminal 3's co-founder and CEO. "If you are employing AI agents in your marketing workflow, it's incumbent on you to understand what kinds of private data they're collecting and how they're using it."
The legal status is murky. What AI agents do with data often exists in a grey zone regulators haven't addressed. That ambiguity doesn't excuse inaction.
Stop treating data ethics as a compliance checkbox. Treat it as a design constraint. Ask: what data do you actually need to deliver the outcome?
If a CRM campaign's AI tool pulls location, health, and search history data you didn't request, the integration wasn't scoped tightly enough. That's a signal the tool is collecting more than necessary.
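That scoping test can be made concrete. As a minimal sketch (the field names and declared scope are hypothetical, not from any specific CRM vendor), compare what an AI integration actually pulled against what the campaign brief requested:

```python
# Illustrative over-collection check: the declared scope is what the
# campaign brief requested; the pulled set is what the AI tool took.
# All field names here are made up for illustration.

DECLARED_SCOPE = {"email", "first_name"}
ACTUALLY_PULLED = {"email", "first_name", "precise_location", "search_history"}

def over_collection(declared, pulled):
    """Return fields the tool collected that the integration never requested."""
    return sorted(pulled - declared)

extra = over_collection(DECLARED_SCOPE, ACTUALLY_PULLED)
if extra:
    # Any output here is the signal the integration wasn't scoped tightly enough.
    print(f"unexpected fields collected: {extra}")
```

Running a check like this per integration turns "wasn't scoped tightly enough" from a vague worry into a reviewable list.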
Three Practical Steps
Audit data flows. Map what each AI tool in your stack collects. Most marketers can't answer this question.
Apply purpose limitation. Collect only what's strictly necessary to fulfil the user's immediate request. Health history or precise location shouldn't be gathered unless required for the task at hand.
Segregate conversational data. Keep conversation data separate from core marketing systems unless users give explicit, separate consent for future use.
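The third step can be sketched in a few lines. This is an illustrative outline only, with made-up store names and a hypothetical `reuse_consent` flag standing in for a real consent record: conversational data lands in its own store, and is copied into the marketing system only on explicit, separate opt-in.

```python
# Sketch of segregated conversational data with a consent gate.
# Store names and the consent flag are illustrative assumptions.

conversation_store = []   # AI chat transcripts only, kept apart
marketing_crm = []        # core marketing system

def record_conversation(user_id, text, reuse_consent=False):
    """Always log to the segregated store; copy to the CRM only on opt-in."""
    conversation_store.append({"user": user_id, "text": text})
    if reuse_consent:  # explicit, separate consent; defaults to no reuse
        marketing_crm.append({"user": user_id, "text": text})

record_conversation("u1", "I have a knee injury, what shoes help?")
record_conversation("u2", "Recommend trail shoes", reuse_consent=True)
print(len(conversation_store), len(marketing_crm))  # prints: 2 1
```

The point of the default is that reuse requires an affirmative choice; silence keeps sensitive conversations out of the CRM.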
Transparency Isn't a Privacy Policy
Perception matters more than technical reality. Kantar research shows 67% of Boomers and 63% of Gen X see location data as commonly collected by AI tools. When users believe their data is being harvested, trust erodes regardless of what's actually happening.
Publishing a privacy policy nobody reads isn't transparency. Operational transparency means giving users meaningful control, making opt-outs easy, and explaining what the AI does in plain language.
Build disclosure into campaign briefs. Make it a habit, not an afterthought.
The Consent Fatigue Problem
Brands are asking for more granular data than ever. If the value returned is just slightly better retargeting ads, users will, and should, reject it.
When someone shares something sensitive with your AI, they deserve an immediate, tangible benefit. Something that feels like a concierge service, not a data grab. If brands don't hold themselves to that standard, regulators will impose one.
For AI for Marketing professionals navigating these decisions, the stakes are clear: integrate AI thoughtfully or face tightening regulation. The AI Learning Path for Marketing Managers covers the governance and ethical frameworks needed to make these calls.