Scenarios, Not Foundation Models: Huolala's CTO on AI that actually works

AI wins come from scenarios, not homegrown base models. Huolala built internal platforms that cut risks and SMS costs, boosted ASR quality, and delivered 5-10% efficiency gains.

Categorized in: AI News, Product Development
Published on: Nov 29, 2025

Winning with AI Comes from Scenarios, Not Base Models

Ask a product team what makes AI useful, and you'll hear it: the scenario. That's the core of Zhang Hao's message as CTO of Huolala. Base models are public utilities; your advantage is how you apply them to your data, processes, and user moments.

Huolala operates in over 400 cities with nearly 20 million monthly active users and 2 million active drivers. The business is simple to explain and hard to execute: match shippers and drivers faster, safer, and with less friction. That puts two targets at the center: operational efficiency and user experience.

How They Prioritized AI (and What They Skipped)

Two years ago, the team assessed where AI would move the needle most. They borrowed a practical approach: job surveys, task breakdowns, and automation difficulty ratings, similar to the method discussed in the 2023 Goldman Sachs AI analysis. High data density and labor-heavy tasks got priority; high-certainty analytics waited their turn.

More importantly, they stopped chasing a proprietary foundation model. The pace of base model progress outstrips most in-house efforts. Instead, they doubled down on three assets: digital data, business APIs, and institutional know-how.

That decision led to a different bet: build an internal AI application platform so every improvement in public models instantly boosts outcomes across their stack.

The Three Internal Platforms

  • Wukong Platform: Lets non-technical users assemble intelligent agents in minutes.
    • Visual process orchestration to connect company APIs and data assets.
    • Zero-code agent creation via natural language.
    • Enterprise tool library and MCP (Model Context Protocol) compatibility to standardize capabilities.
  • Dolphin Platform: One-stop workflow for ML teams, from data prep and training to deployment and lifecycle management. The goal: reduce overhead so algorithm engineers spend time on models, not plumbing.
  • Evaluation & Annotation Platform: AB testing and head-to-head model comparisons with tight segmentation, plus the Lala Intelligence Evaluation system. Good launches need repeatable, audited outcomes; this platform makes that real.
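
Huolala hasn't published Wukong's internals, but the tool-registry idea behind "MCP compatibility" can be sketched: wrap each enterprise API as a named tool with a machine-readable description, so any agent (or model) can discover and invoke it regardless of which base model sits behind it. The `ToolRegistry` class and the `estimate_price` endpoint below are illustrative assumptions, not real Huolala APIs.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class Tool:
    """A registered enterprise capability: name, description, callable, param spec."""
    name: str
    description: str
    fn: Callable[..., Any]
    params: dict = field(default_factory=dict)  # JSON-schema-style parameter hints

class ToolRegistry:
    """Minimal MCP-style registry: tools are discoverable by description
    and invocable by name, decoupled from any particular model."""
    def __init__(self):
        self._tools: dict[str, Tool] = {}

    def register(self, name, description, params=None):
        def wrap(fn):
            self._tools[name] = Tool(name, description, fn, params or {})
            return fn
        return wrap

    def list_tools(self):
        # What an agent sees when it asks "what can I do here?"
        return [{"name": t.name, "description": t.description, "params": t.params}
                for t in self._tools.values()]

    def call(self, name, **kwargs):
        return self._tools[name].fn(**kwargs)

registry = ToolRegistry()

# Hypothetical company API wrapped as a tool (name, fields, and rates are invented).
@registry.register("estimate_price",
                   "Estimate shipping price for a cargo volume and distance",
                   {"volume_m3": "number", "distance_km": "number"})
def estimate_price(volume_m3: float, distance_km: float) -> float:
    base, per_km, per_m3 = 10.0, 1.5, 2.0  # made-up rate card
    return base + per_km * distance_km + per_m3 * volume_m3

print(json.dumps(registry.list_tools(), indent=2))
print(registry.call("estimate_price", volume_m3=3.0, distance_km=12.0))
```

The point of the indirection is the "interchangeable parts" bet above: when a better public model arrives, the registry and its tools stay put, and only the model behind the agent changes.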

Application Scenarios That Actually Shipped

  • AI security prevention and control: Real-time detection of illegal passenger-carrying, dangerous goods, and risky driving using voice, images, and unstructured signals. Short intervention windows demand fast decisions and high accuracy.
  • AI coding in R&D: Now used by ~90% of R&D staff and teams; ~60% penetration across the product-to-deploy pipeline. The net throughput gain sits around 10% once verification and testing overhead is counted. Strong for new projects and front-end work; complex business logic still needs human steering.
  • "Take a Photo to Select a Vehicle": Point-cloud segmentation estimates cargo volume from a single photo and matches the right vehicle within ~10 seconds. Solves uncertainty for first-time or infrequent shippers.
  • User feedback analyzer: A small model handles fast classification; an LLM summarizes patterns. Example: it quickly surfaced an invoice issuance inefficiency that was previously easy to miss.
  • AI product knowledge expert: Pulls from PRDs, repos, configs, and more to answer "who/why/how" behind features. Reduces knowledge blind spots across departments.
  • SMS content optimization: LLM rewrites shorter, clearer messages and pre-checks for compliance risks. Result: ~12% annual cost reduction and fewer brand risks at scale.
  • AI digital human business partner: ASR → LLM → TTS pipeline with hot-words and acoustic model tuning. Dialect-aware voice increased perceived trust and naturalness. Metrics: 94% semantic ASR accuracy, ~92% human-likeness.
  • Emotion-aware support: Question rewriting, scenario routing, and a multi-agent setup improved resolution rates and accuracy for anxious or angry users.
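
The feedback analyzer's small-model-plus-LLM split can be sketched as a two-stage pipeline: bucket every message cheaply first, then hand only the aggregates to the expensive model. Huolala's actual classifier isn't public, so a keyword rule stands in for the small model and a string template for the LLM summary; the categories and messages are invented for illustration.

```python
from collections import Counter, defaultdict

# Stage 1: a small, fast classifier. A keyword rule stands in here
# purely to show the pipeline shape; categories are illustrative.
CATEGORY_KEYWORDS = {
    "invoice": ["invoice", "receipt"],
    "driver": ["driver", "late", "rude"],
    "pricing": ["price", "charge", "fee"],
}

def classify(text: str) -> str:
    lowered = text.lower()
    for category, words in CATEGORY_KEYWORDS.items():
        if any(w in lowered for w in words):
            return category
    return "other"

def analyze(feedback: list[str]) -> dict:
    """Bucket every message cheaply, then summarize only the aggregates."""
    buckets = defaultdict(list)
    for msg in feedback:
        buckets[classify(msg)].append(msg)
    counts = Counter({k: len(v) for k, v in buckets.items()})
    top, n = counts.most_common(1)[0]
    # Stage 2 would send buckets[top] to an LLM; stubbed as a template here.
    summary = f"Top issue: '{top}' ({n}/{len(feedback)} messages)"
    return {"counts": dict(counts), "summary": summary}

report = analyze([
    "Could not get an invoice issued for my company",
    "Invoice took three days to arrive",
    "Driver was 40 minutes late",
])
print(report["summary"])
```

The economics are the point: the cheap stage touches every message, the expensive stage touches only the hotspots, which is how a pattern like the invoice issue surfaces quickly instead of drowning in volume.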

What Moved the Numbers

  • Risky orders (dangerous goods, illegal passenger-carrying): down ~30%.
  • Order reminder coverage in safety workflows: ~100%.
  • AI coding: ~10% net efficiency gain; ~60% pipeline penetration.
  • SMS costs: ~12% annual savings.
  • ASR semantic accuracy: ~94%; perceived human-likeness: ~92%.

The broader takeaway: in service-heavy O2O businesses, AI's average effect is a modest 5-10% efficiency gain. Some roles see larger shifts, but the consistent wins are cost reduction, risk prevention, and smoother operations.

Practical Playbook for Product Teams

  • Start with a work map: Job × task × error tolerance × data density. Prioritize high-volume, high-friction workflows. A similar framing is discussed in the generative AI research by Goldman Sachs.
  • Treat base models as interchangeable parts: Your moat is data, APIs, and process know-how. Build an application layer and tool registry (MCP-compatible) that survives model swaps.
  • Institutionalize evaluation: Gold datasets, adversarial tests, and online AB as first-class citizens. Every release should be explainable and repeatable.
  • Optimize for latency and accuracy: If you chain ASR → LLM → TTS, measure end-to-end. Consider end-to-end multimodal models as they mature to cut hops and drift.
  • Quantify wins that matter to P&L: Incident rate, time-to-resolution, cost per message, % AI-generated code deployed, safety reminders delivered, and CSAT shifts.
  • Design for trust: Dialect, prosody, and context memory matter in voice. Small changes in tone can lift perceived credibility.
  • Plan orchestration: One "digital human" is useful; coordinating many of them from upstream to downstream is where compounding gains show up.
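
The work-map bullet can be turned into a simple ranking. The scoring formula, weights, and tasks below are an assumption for illustration, not Huolala's actual method; the product of the four factors just encodes the rule of thumb that high-volume, high-friction, data-rich tasks with forgiving error budgets rank first.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    weekly_volume: int      # how often the task runs
    friction: float         # 0-1, how painful/manual it is today
    data_density: float     # 0-1, how much digitized data the task touches
    error_tolerance: float  # 0-1, how forgiving the task is of AI mistakes

def priority(t: Task) -> float:
    # Illustrative scoring: high volume, high friction, rich data, and a
    # forgiving error budget all push a task up the automation queue.
    return t.weekly_volume * t.friction * t.data_density * t.error_tolerance

# Invented example tasks with made-up numbers.
tasks = [
    Task("Message copywriting", 50_000, 0.6, 0.9, 0.8),
    Task("Manual risk review", 5_000, 0.9, 0.7, 0.1),  # tiny error budget
    Task("Feedback triage", 20_000, 0.7, 0.8, 0.7),
]
for t in sorted(tasks, key=priority, reverse=True):
    print(f"{t.name}: {priority(t):,.0f}")
```

A real scorecard would weight the factors differently per business (safety work, for instance, may be prioritized despite a tiny error budget because the downside of inaction is larger), but even a crude product like this forces the job × task inventory the bullet calls for.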

What's Next

Base models keep improving, and that alone will lift well-built applications. Huolala's path points to a near-term focus: multimodal, end-to-end pipelines to reduce latency and error surfaces, plus orchestration of multiple agents across the full process.


