Trade Secrets Leave Organizations Through Employee AI Use, Not Just Training Data
Organizations face an immediate and widespread trade secret exposure problem that occurs long before AI systems receive formal training data. Employees paste source code, design documents, and technical specifications into AI tools during routine work (debugging errors, summarizing materials, drafting under time pressure), often without authorization or documentation.
About four in ten employees report entering sensitive workplace information into AI tools without employer approval, according to Security Management. Once that information leaves controlled channels, it rarely comes back.
The Deletion Problem
Many organizations assume confidential material can be removed from an AI system after disclosure. Current technology makes that assumption unreliable.
Large language models blend patterns from inputs into complex internal structures: model parameters, embeddings, and memory systems. Isolating and deleting a single company's information may require dismantling the system itself. Researchers are exploring machine unlearning and model editing, but these approaches remain experimental. Recent work shows supposedly "unlearned" content can sometimes be partially recovered.
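To make the difficulty concrete, here is a minimal sketch of one experimental approach from the unlearning literature: gradient ascent on a "forget set" of targeted examples. The function and parameter names are illustrative assumptions, not any vendor's method, and as the comments note, nothing here guarantees the information is actually erased.

```python
import torch
import torch.nn.functional as F

def gradient_ascent_unlearning(model, forget_loader, lr=1e-5, steps=50):
    """Naive 'unlearning' sketch: push the model's loss *up* on the
    forget set. This degrades recall of the targeted examples but does
    not provably erase the underlying information, which is why
    recovery attacks on 'unlearned' models sometimes succeed."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    step = 0
    for inputs, labels in forget_loader:
        if step >= steps:
            break
        optimizer.zero_grad()
        loss = F.cross_entropy(model(inputs), labels)
        (-loss).backward()  # ascend, not descend, on the forget data
        optimizer.step()
        step += 1
    return model
```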
Providers can suppress outputs with filters, but suppression is not deletion. The closest actual remedy is retraining or rebuilding the system on clean data. For frontier models, that cost runs into tens or hundreds of millions of dollars. The Stanford HAI AI Index estimates training costs of roughly $78 million for GPT-4 and $191 million for Gemini Ultra.
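The gap between suppression and deletion is easy to see in code. The sketch below shows a hypothetical output filter of the kind a provider might run; the patterns and names are invented for illustration. Redaction happens after generation, so the model weights still encode whatever was learned, and rephrasings can slip past the filter.

```python
import re

# Hypothetical suppression filter; patterns are illustrative only.
BLOCKED_PATTERNS = [
    re.compile(r"ACME[-_]?PROJ[-_]?\d+"),       # internal project codes
    re.compile(r"BEGIN (RSA|EC) PRIVATE KEY"),  # credential material
]

def filter_output(generated_text: str) -> str:
    """Redact matches before the response leaves the system. Only this
    output path is blocked; nothing is removed from the model itself."""
    for pattern in BLOCKED_PATTERNS:
        generated_text = pattern.sub("[REDACTED]", generated_text)
    return generated_text
```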
AI Agents Amplify the Risk
Workflow-integrated AI systems and agents pose a distinct problem. Unlike a single prompt, many agent systems retain memory, logs, embeddings, or intermediate summaries across sessions and connected tools. Information disclosed once can be reused in later outputs, propagated to other systems, or incorporated into downstream processes without the user's knowledge.
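A toy example makes the retention mechanics concrete. The sketch below (class name and table schema are invented for illustration) persists every prompt to disk and resurfaces it in later sessions, which is exactly the behavior that keeps a single disclosure alive across a toolchain, independent of any training run.

```python
import sqlite3
import time

class AgentMemory:
    """Toy persistent agent memory. Anything the user pastes is written
    to disk and can be surfaced again in later sessions."""

    def __init__(self, path="agent_memory.db"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS memory (ts REAL, role TEXT, content TEXT)"
        )

    def remember(self, role: str, content: str) -> None:
        self.conn.execute(
            "INSERT INTO memory VALUES (?, ?, ?)", (time.time(), role, content)
        )
        self.conn.commit()

    def recall(self, query: str, limit: int = 5):
        # Naive substring recall; production agents typically use
        # embeddings, which are even harder to purge selectively.
        rows = self.conn.execute(
            "SELECT content FROM memory WHERE content LIKE ? ORDER BY ts DESC LIMIT ?",
            (f"%{query}%", limit),
        ).fetchall()
        return [r[0] for r in rows]
```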
From a legal standpoint, secrecy is lost through retention, reuse, or redistribution of information, even if the data never enters a formal training dataset.
Courts Treat Contamination as Irreparable
Courts have recognized similar problems in traditional technology disputes. In Waymo LLC v. Uber Technologies, Inc., the court focused on whether autonomous-vehicle design information had been integrated into Uber's engineering processes. Once incorporated into a development pipeline, the information could influence technical decisions in ways that could not be reliably isolated or reversed. The court treated that contamination as irreparable harm.
The Federal Trade Commission took a similar position in FTC v. Everalbum, Inc., requiring deletion not only of improperly obtained biometric data but also of the AI models trained on that data.
Traditional Agreements Fall Short
Standard nondisclosure agreements assume limited disclosure in controlled settings. AI agents undermine that assumption. A single paste of proprietary code into an AI-enabled workflow can later appear in outputs, be routed into other tools, or be reused across prompts. What feels like one disclosure becomes many disclosures across a toolchain.
Takedown requests and formal notices to AI providers still matter and should be handled promptly. A well-prepared notice can limit further dissemination and preserve contractual and statutory remedies. But takedowns rarely restore confidentiality once information leaves controlled channels.
Three-Part Protection Strategy
Because trade secret dissemination may be nearly impossible to undo, the most practical approach rests on three pillars:
- Strong front-end controls to reduce disclosure risks
- A structured response plan to contain dissemination and document protective efforts
- Early engagement with experienced counsel to align governance, contracts, and remediation strategies with evolving AI systems
Governance at the Front End
Companies should treat public AI systems as untrusted for sensitive material and prohibit entering source code, specifications, or design documents into public chatbots. Disclosure also occurs inadvertently through embedded AI features in software such as grammar assistants, document editors, code completion tools, and collaboration platforms.
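One way to operationalize that prohibition is a pre-submission check placed in front of external AI tools. The sketch below is a minimal illustration, not a vetted data-loss-prevention ruleset; the marker patterns and function names are assumptions for the example.

```python
import re

# Hypothetical markers of protected material; tune to your environment.
SENSITIVE_MARKERS = [
    re.compile(r"(?i)\bconfidential\b|\binternal use only\b"),
    re.compile(r"(?m)^\s*(def |class |#include |import )"),  # likely source code
    re.compile(r"AKIA[0-9A-Z]{16}"),                         # AWS-style access key
]

def safe_to_send(text: str) -> bool:
    """Return False if the text looks like protected material."""
    return not any(p.search(text) for p in SENSITIVE_MARKERS)

prompt = "import router_fw  # internal build script"
if not safe_to_send(prompt):
    raise PermissionError("Blocked: prompt may contain confidential material")
```

A check like this will produce false positives and misses; its value is forcing a review step before material leaves controlled channels, not perfect detection.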
Agreements with employees, contractors, licensees, and AI vendors should address AI use explicitly. Policies should cover agent memory, logs, embeddings, and tool integrations, not just "training data."
Where workflows involve persistent agents, organizations should prohibit feeding confidential material into systems that retain prompts or reuse content across sessions unless the company controls the environment and can enforce deletion and auditability.
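In a company-controlled environment, "enforce deletion and auditability" can be as simple as a scheduled purge that also records its own action. The sketch below assumes the toy schema from the memory example above; the retention window and names are illustrative.

```python
import sqlite3
import time

RETENTION_SECONDS = 7 * 24 * 3600  # example: 7-day retention ceiling

def enforce_retention(db_path="agent_memory.db"):
    """Purge stored prompts past the retention window and log the purge
    itself, so deletion is both enforced and auditable."""
    conn = sqlite3.connect(db_path)
    cutoff = time.time() - RETENTION_SECONDS
    deleted = conn.execute("DELETE FROM memory WHERE ts < ?", (cutoff,)).rowcount
    conn.execute("CREATE TABLE IF NOT EXISTS audit (ts REAL, action TEXT)")
    conn.execute(
        "INSERT INTO audit VALUES (?, ?)",
        (time.time(), f"purged {deleted} expired memory rows"),
    )
    conn.commit()
```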
For legal teams overseeing AI governance, see resources on AI for Legal and AI Learning Path for Paralegals, which address document review, confidentiality controls, and system governance in professional contexts.