AI Agents Exposed: Why Promises Outpace Reality and the Race to Fix Costly Mistakes Is Just Beginning

AI agents promise efficient customer service but often make critical errors that harm experiences. Tools like Traceloop help detect and fix these issues before they affect customers.

Categorized in: AI News Customer Support
Published on: Jun 08, 2025

Promises vs. Reality: AI Agents Still Aren’t Ready for the Real World

AI agents have been touted as a breakthrough in customer service, promising to handle inquiries efficiently and accurately. Yet many companies find these tools fall short in practice. The risks of incorrect answers, inappropriate discounts, or damaged customer experiences remain a barrier to widespread deployment.

Thousands of organizations invest heavily in AI agent development, but hesitation persists. The core issue: AI agents often make critical errors that companies struggle to detect and correct quickly.

Handling Malfunctioning AI Agents

Traceloop, an Israeli startup, is addressing this challenge by creating technology to catch AI mistakes early and help fix them before they impact customers. The company recently closed a $6.1 million seed round, backed by investors including Sorenson Capital, Ibex Investors, Samsung NEXT, and Y Combinator.

Nir Gazit, Traceloop’s CEO, highlights the gap between expectations and reality: “Companies expect AI to answer every question about their business flawlessly, but it doesn’t work that way.”

He adds, “Initially, it feels impressive. The AI answers some questions well, and excitement builds. But once deployed, the agent often fails because real customer queries are unpredictable.”

Tooling for tracking AI agent performance remains surprisingly limited. Most organizations only discover mistakes after collecting data post-deployment, and fixes often rely on trial and error, which can be slow and costly.

Gazit explains, “We help companies understand where their AI is performing well, where it’s failing, and how to improve results.”
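The monitoring approach described here can be sketched in miniature: record each agent response against a quality check and aggregate failure rates by query category, so a team can see where the agent performs well and where it fails. This is a minimal, hypothetical illustration, not Traceloop's actual implementation; the `AgentMonitor` class and its categories are assumptions for the example.

```python
from dataclasses import dataclass, field

@dataclass
class AgentMonitor:
    """Hypothetical sketch: tally pass/fail checks per query category
    so failure hotspots surface instead of going unnoticed."""
    results: dict = field(default_factory=dict)

    def record(self, category: str, passed: bool) -> None:
        # Count every response, and separately count the failed ones.
        stats = self.results.setdefault(category, {"total": 0, "failed": 0})
        stats["total"] += 1
        if not passed:
            stats["failed"] += 1

    def failure_rate(self, category: str) -> float:
        # Fraction of responses in this category that failed the check.
        stats = self.results.get(category, {"total": 0, "failed": 0})
        return stats["failed"] / stats["total"] if stats["total"] else 0.0

monitor = AgentMonitor()
monitor.record("refunds", passed=True)
monitor.record("refunds", passed=False)
monitor.record("billing", passed=True)
```

In this toy run, the refunds category shows a 50% failure rate while billing shows none, which is exactly the kind of signal that lets a team target fixes instead of guessing.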

Technology Is Still Immature

AI agents don’t behave like traditional software. “Code does what you tell it to do. AI agents, you have to ask nicely and hope for the best. Sometimes they follow instructions, sometimes not,” Gazit says.

This inconsistency underlines how far AI is from delivering on promises of seamless integration with human teams. Gazit is skeptical about full integration happening anytime soon: “I don’t see this happening in the next ten years. It’s closer to science fiction than reality.”

He points out that people often overestimate AI capabilities, building “castles in the air” that don’t match the technology’s current state.

Developers and AI: The New Reality

Contrary to fears that AI will replace programmers, Gazit believes developers who learn to work with AI tools will thrive. “Those who adapt will be better than before. Those who don’t may struggle to keep their jobs.”

He notes that AI-generated code is often easy to spot: it frequently goes unreviewed and rarely works as expected. Even with AI assistance, he spends about 30% of his time writing code himself.

The Goal: AI Agents That Truly Converse

Founded by Gazit and Gal Kleinman in late 2022, Traceloop aims to expand its tools to support AI agents capable of genuine conversation with customers, not just text generation.

Gazit shares that many companies already use large language models (LLMs) internally and externally. For example, a U.S. pension fund uses AI agents to analyze hundreds of thousands of medical documents, demonstrating practical, real-world applications.

Concerns About AI Hype and Safety

Gazit expresses concern about the noise surrounding AI, fueled by easy access and misunderstanding of the technology. “When I built AI models at Google, it was an art form for experts. Now, anyone can use AI, and people don’t realize what they’re handling.”

Regarding big AI companies and safety, he sees much of the discussion as publicity rather than substance. “Talk about AI safety often serves companies wanting to claim they are closest to artificial general intelligence (AGI). It’s mostly noise.”

Have We Reached the AI Ceiling?

Gazit believes that current improvements in AI have plateaued due to limits in available data and model scaling. “GPT-3 succeeded because of size and new technology, but now we’re hitting a ceiling.”

He is open to the possibility of a breakthrough in the future but doesn’t expect AGI anytime soon. “It could happen tomorrow or in 30 years, but not now.”

Verifying AI Agents Is Essential

Aharon Rinberg, partner at Ibex Investors, emphasizes the importance of verification tools for AI agents: “LLMs improve human interaction with data but can be dangerously overconfident. Verification ensures AI agents work as intended.”
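One concrete form such verification can take is a guardrail that checks an agent's reply against business policy before it reaches the customer, for example rejecting the "inappropriate discounts" mentioned earlier. The sketch below is a hypothetical illustration of the idea, not Traceloop's product; the `MAX_DISCOUNT_PCT` threshold and `verify_reply` function are assumptions for the example.

```python
import re

# Hypothetical policy ceiling: the agent may not offer more than 10% off.
MAX_DISCOUNT_PCT = 10

def verify_reply(reply: str) -> bool:
    """Return True if the reply stays within the discount policy,
    False if it offers a percentage above the ceiling."""
    for pct in re.findall(r"(\d+)\s*%", reply):
        if int(pct) > MAX_DISCOUNT_PCT:
            return False
    return True
```

A failed check could route the reply to a human agent instead of the customer, which is the "catch mistakes before they impact customers" pattern the article describes.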

Companies like IBM, Cisco, and Dynatrace already use Traceloop’s technology for monitoring AI agents, and adoption of such verification tools is expected to grow faster than the AI models themselves.

What This Means for Customer Support Professionals

AI agents show promise but aren’t yet reliable enough to replace human judgment in customer interactions. Awareness of their limitations is crucial for support teams considering AI tools.

Monitoring and verification technologies, like those from Traceloop, can help catch errors early and reduce risks. Meanwhile, customer support professionals who develop skills to work alongside AI will be better prepared for the evolving landscape.

If you want to deepen your knowledge on AI tools and how they can assist in customer service roles, consider exploring practical AI courses at Complete AI Training.