AI Agent Security: Who Owns What in Operations
Agentic AI is moving fast into daily workflows through platforms from vendors like Microsoft and Salesforce. That speed doesn't remove risk. It shifts responsibility.
The right model is shared responsibility. Vendors secure their platforms and deliver guardrails. You secure data, identity, and how agents are allowed to act in your environment.
Ignore that and you court data leakage. A recent example: a vulnerability chain in a popular CRM agent showed how indirect prompt injection plus weak controls could expose sensitive records. Patches help, but architecture and access design are what stop repeat incidents.
What Shared Responsibility Looks Like (In Plain Terms)
- Vendor owns: Platform security, infrastructure patches, isolation between tenants, safe defaults, granular controls, and complete audit logs.
- You own: Data classification, access scopes, secrets management, identity and permissions, monitoring, approval workflows, and incident response.
Data doesn't live in the agent. It lives in your stores. The risk is what you allow the agent to see and do.
Why This Is Hard for Ops
- Agents act on behalf of users and can chain actions. Over-permissioning turns small mistakes into big exposures.
- Prompt injection, ingested secrets, and weak access reviews are common entry points.
- Teams expect vendors to "handle it," then skip architecture and change controls. That's how blind spots form.
Minimum Guardrails to Ship With on Day One
- Create a unique identity for each agent (service principal or app registration). Never reuse a human account.
- Apply least privilege: read-only by default, narrow scopes, time-bound tokens, and explicit tool allowlists (a minimal enforcement sketch follows this list).
- Segment data. Use collections, folders, or indexes per team and agent. Deny by default, then allow intentionally.
- Move secrets to a vault. Never place tokens, API keys, or credentials in prompts, examples, or docs.
- Enforce strong MFA for admins and users. Prefer hardware-backed keys like FIDO2.
- Log everything: prompts, tools called, data touched, and outputs delivered. Stream to your SIEM.
- Add an approval step for high-impact actions (payments, bulk updates, mass deletes, data exports).
- Rate-limit sensitive actions and set spend/usage caps.
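To make the allowlist and approval items concrete, here is a minimal sketch of a policy gate that sits between an agent and its tools. It assumes a wrapper you control mediates every tool call; the tool names, the `HIGH_IMPACT` set, and the `approve` hook are illustrative, not any vendor's API.

```python
# Minimal sketch of a policy gate between an agent and its tools.
# Tool names, the allowlist, and the approval hook are illustrative,
# not tied to any specific vendor API.

from dataclasses import dataclass, field
from typing import Any, Callable

HIGH_IMPACT = {"bulk_update", "mass_delete", "export_data", "send_payment"}

@dataclass
class ToolPolicy:
    allowed_tools: set[str]               # deny anything not listed
    approve: Callable[[str, dict], bool]  # human-in-the-loop hook
    audit: list[dict] = field(default_factory=list)

    def call(self, tool: str, args: dict, executor: Callable[..., Any]) -> Any:
        if tool not in self.allowed_tools:
            self.audit.append({"tool": tool, "args": args, "decision": "blocked"})
            raise PermissionError(f"Tool not on allowlist: {tool}")
        if tool in HIGH_IMPACT and not self.approve(tool, args):
            self.audit.append({"tool": tool, "args": args, "decision": "denied"})
            raise PermissionError(f"Approval required and not granted: {tool}")
        self.audit.append({"tool": tool, "args": args, "decision": "allowed"})
        return executor(**args)

# Usage: read-only by default; approvals route to a person, never the agent.
policy = ToolPolicy(
    allowed_tools={"read_record", "search_index"},
    approve=lambda tool, args: False,  # no high-impact tools enabled yet
)
```

The point of the pattern: deny is the default, high-impact verbs always hit a human, and every decision lands in an audit record you can ship to your SIEM.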
Architecture Patterns That Reduce Blast Radius
- Sandbox by default: Run agents in isolated environments with separate credentials and data paths.
- Retrieval allowlists: Restrict what indices, tables, and fields an agent can query.
- Output moderation: Scan responses for secrets or sensitive fields before they reach users or systems (sketched after this list).
- Human-in-the-loop: Require review for destructive or external-facing actions.
- Kill switch: Central toggle to disable an agent, revoke tokens, and freeze queues instantly.
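Output moderation can start simple. The sketch below scans an agent's response for secret-shaped strings before delivery; the regex patterns are assumptions to tune against your own token formats, not a complete detector.

```python
# Sketch of an output-moderation pass: redact secret-shaped strings
# in agent responses before they reach users or downstream systems.
# Patterns are illustrative; tune them to your own token formats.

import re

SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                         # AWS access key id
    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),  # PEM key header
    re.compile(r"(?i)(api[_-]?key|token|password)\s*[:=]\s*\S+"),
]

def moderate_output(text: str) -> str:
    """Redact anything secret-shaped and note a finding for review."""
    findings = 0
    for pattern in SECRET_PATTERNS:
        text, n = pattern.subn("[REDACTED]", text)
        findings += n
    if findings:
        print(f"output moderation: redacted {findings} match(es)")  # send to SIEM in practice
    return text

print(moderate_output("Here is the key: api_key=sk-abc123"))
# -> "Here is the key: [REDACTED]"
```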
Controls Vendors Should Provide (Enable Them)
- Mandatory MFA, session policies, IP allowlists, and device checks.
- Granular scopes for tools, data sources, and functions. No "all data" permissions.
- Customer-managed keys and exportable audit logs.
- Prompt and tool usage policies, with configurable blocklists and content filters.
Use vendor features, but don't outsource judgment. Perimeter controls won't fix poor data access design.
Questions to Ask Before You Go Live
- What exact datasets can the agent access today? What needs to change to make that list smaller?
- Which tools can it call, with what parameters, and how are those restricted?
- How are prompts, conversation history, and outputs stored? For how long? Can we segregate or purge?
- What's the containment plan for prompt injection or tool abuse? How do we roll back and revoke quickly?
- Can we export full audit trails to our SIEM and correlate with user identity?
- What rate limits, quotas, and cost controls can we enforce per agent and per user?
Metrics and Alerts That Matter
- New permissions granted to an agent, especially write/delete scopes (an alert-rule sketch follows this list).
- Data access anomalies: unusual tables, spikes in exports, access outside business hours.
- Tool misuse: repetitive high-risk calls, failed calls, or new tool invocations.
- Prompt injection signals: unexpected outbound requests, instructions to bypass policy, or attempts to read secrets.
- Drift: changes to prompts, tools, or routes without approved change tickets.
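As one example, here is a sketch of an alert rule over exported audit events that flags new write/delete scope grants. The event schema (`event_type`, `agent_id`, `scope`, `actor`, `timestamp`) is hypothetical; map it to whatever your platform actually exports.

```python
# Sketch of an alert rule over exported audit events: flag any new
# write-like scope granted to an agent identity. The event schema
# here is hypothetical; adapt it to your platform's audit format.

WRITE_LIKE = ("write", "delete", "update", "export")

def scope_grant_alerts(events: list[dict]) -> list[str]:
    alerts = []
    for event in events:
        if event.get("event_type") != "scope_granted":
            continue
        scope = event.get("scope", "").lower()
        if any(verb in scope for verb in WRITE_LIKE):
            alerts.append(
                f"ALERT: agent {event.get('agent_id')} granted '{scope}' "
                f"by {event.get('actor')} at {event.get('timestamp')}"
            )
    return alerts

sample = [
    {"event_type": "scope_granted", "agent_id": "crm-agent-01",
     "scope": "records.delete", "actor": "admin@example.com",
     "timestamp": "2024-05-01T03:12:00Z"},
]
for line in scope_grant_alerts(sample):
    print(line)
```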
Common Failure Modes
- Agents granted blanket access "to move faster."
- Secrets embedded in prompts or examples that later leak.
- Relying on DLP or secrets scanning as the main control instead of fixing access.
- No red-team testing for prompt injection or tool chaining.
- No clear incident plan or rollback path.
Fast Start Checklist for Ops
- Map data flow: source → agent → tools → destinations. Remove anything unnecessary.
- Issue dedicated identities and credentials per agent. Apply least privilege.
- Isolate environments. Separate dev, test, and prod with different datasets and keys.
- Turn on MFA, logging, and export audit to SIEM. Set alerts.
- Define approval steps and a kill switch, and test both (a minimal kill-switch sketch follows this checklist).
- Run a prompt-injection tabletop and fix any gaps found.
- Document ownership: who approves changes, who monitors, who responds.
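For the kill switch, the core mechanism is small: one central flag that every agent action checks before it runs. A minimal sketch, assuming an in-process guard; token revocation and queue freezing are stubbed comments to wire to your identity provider and queue system.

```python
# Sketch of a central kill switch: one flag every agent action checks
# before running. Token revocation and queue freezing are stubbed;
# wire them to your identity provider and queue system.

import threading

class KillSwitch:
    def __init__(self) -> None:
        self._tripped = threading.Event()

    def trip(self, reason: str) -> None:
        self._tripped.set()
        print(f"kill switch tripped: {reason}")
        # In practice: revoke the agent's tokens and freeze its queues here.

    def guard(self) -> None:
        if self._tripped.is_set():
            raise RuntimeError("Agent disabled by kill switch")

switch = KillSwitch()

def agent_action() -> None:
    switch.guard()  # every action checks the switch first
    print("action ran")

agent_action()                             # runs normally
switch.trip("prompt injection suspected")
try:
    agent_action()
except RuntimeError as exc:
    print(exc)                             # -> Agent disabled by kill switch
```

Test the trip path on a schedule, the same way you test backups: a kill switch that has never been pulled is a guess, not a control.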
Next Step for Your Team
If your operations team needs to skill up on AI safety, deployment patterns, and controls, explore practical training built for job roles, such as AI courses by job, or get hands-on with an AI automation certification.
Bottom line: vendors secure the platform; you secure access and data. Treat agents like any other high-privilege service. Start with least privilege, isolate, log everything, and plan for failure before it happens.