Salesforce Study Finds AI Agents Struggle With Complex Business Tasks, Failing 65% of Multi-Turn Interactions

A Salesforce AI Research study finds that enterprise AI agents fail 65% of multi-turn CRM tasks: success falls from 58% on single-turn tasks to 35% on multi-turn ones.


Published on: Jun 30, 2025


A recent benchmark study from Salesforce AI Research reveals that leading AI agents struggle significantly with complex business tasks. While these agents achieve a 58% success rate in single-turn customer relationship management (CRM) tasks, their performance drops sharply to 35% in multi-turn interactions. This gap highlights the challenges AI faces in real-world enterprise environments involving customer service, sales, and pricing workflows.
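The headline figure of 65% and the 35% success rate quoted above describe the same result from opposite sides. A minimal sketch of that arithmetic follows, using only the two success rates reported in the article; the variable names are illustrative and not taken from the study.

```python
# Success rates reported in the article (percent of tasks completed).
single_turn_success = 58.0
multi_turn_success = 35.0

# The failure rate is the complement of the success rate.
multi_turn_failure = 100.0 - multi_turn_success

# Absolute drop in percentage points, and relative drop as a share
# of the single-turn success rate.
absolute_drop_points = single_turn_success - multi_turn_success
relative_drop_percent = absolute_drop_points / single_turn_success * 100.0

print(f"Multi-turn failure rate: {multi_turn_failure:.0f}%")
print(f"Absolute drop: {absolute_drop_points:.0f} percentage points")
print(f"Relative drop: {relative_drop_percent:.1f}%")
```

In other words, the 23-point gap amounts to agents losing roughly two fifths of their single-turn success rate once a task requires multiple turns.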

What the Study Shows

Salesforce AI Research introduced CRMArena-Pro, a benchmark designed to evaluate AI agent capabilities across 19 distinct CRM tasks. It covers both Business-to-Business (B2B) and Business-to-Consumer (B2C) scenarios, involving over 83,000 synthetic records validated by CRM professionals for realism.

The study assessed key skills such as database querying, numerical computation, information retrieval, workflow execution, and policy compliance. Among these, workflow execution was the easiest for AI agents, with success rates above 83% in single-turn tasks. Confidentiality awareness, however, was a major weakness: agents showed almost no inherent understanding of how to handle sensitive information. Explicit prompting improved this behavior, but at the cost of task accuracy.

Performance Across Models and Tasks

Leading AI models tested included OpenAI’s o1 and GPT-4o, Google’s Gemini-2.5-Pro and Gemini-2.5-Flash, and Meta’s LLaMA series.
Models designed with stronger reasoning capabilities outperformed others by 12-20% in task completion.
All models showed steep performance declines when shifting from single-turn tasks to multi-turn dialogues, often failing to obtain necessary information through clarification.
About 45% of the failures were due to incomplete information gathering during multi-turn interactions.

Cost-efficiency analysis indicated Google’s Gemini-2.5 models offered the best balance of performance and expense. OpenAI’s o1 performed well but at a considerably higher cost, which may limit its enterprise adoption.

Implications for Customer Support and Sales Teams

For professionals working with CRM systems, these findings highlight that current AI agents are not yet reliable enough to fully automate complex, multi-step business workflows—especially those involving sensitive customer data. The trade-off between confidentiality and task success suggests caution when deploying AI for tasks that require strict data privacy.

Multi-turn interactions remain a particular challenge, with AI agents often failing to ask clarifying questions or gather all needed information. This limits their usefulness in dynamic customer support or sales scenarios where conversations naturally evolve over multiple exchanges.

What’s Next for Enterprise AI Agents?

The research points to the need for improved AI tools with better reasoning and collaborative capabilities. Approaches like “agent chaining,” where specialized AI agents work together on complex tasks, could help overcome current limitations.


Key Dates

May 24, 2025: Research paper submitted to arXiv
June 10, 2025: Public release of CRMArena-Pro benchmark and study findings
June 11, 2025: Broader industry discussions highlight enterprise AI limitations

This study provides a clear snapshot of where AI stands in handling complex CRM tasks today. While AI agents show promise, there’s still a long road ahead before they can reliably support multi-turn business workflows in customer service and sales.
