FriendliAI secures $20M to boost AI inference speed and cut costs for developers
FriendliAI raised $20M to speed up AI inference and cut costs by up to 90%. Its continuous batching boosts large language model throughput over tenfold.

FriendliAI Raises $20M to Boost AI Inference Workloads
FriendliAI Corp., a startup focused on speeding up AI model inference, has secured $20 million in funding. Capstone Partners led this seed extension round, with participation from Sierra Ventures, Alumni Ventures, KDB, and KB Securities. This follows the company's initial $5 million raise back in 2021.
Cutting Inference Costs and Improving Speed
The company offers a software platform called the Friendli Engine, which can reduce inference costs by up to 90% while also improving AI response times. The engine achieves these gains through low-level optimizations applied directly to customers' AI workloads.
Large language models (LLMs) typically process user requests in batches for efficiency. With conventional static batching, when one request finishes earlier than the others, its slot sits idle and its result is held until every prompt in the batch completes. This can slow response times significantly.
Continuous Batching: Faster Processing Without Delay
FriendliAI addresses this problem with a technique called continuous batching. Rather than waiting for an entire batch to finish, the scheduler works at the iteration level: as soon as a request completes, its slot is immediately refilled with a waiting request, minimizing unnecessary idle time. The company claims continuous batching can increase LLM throughput by more than ten times in some cases.
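The difference between the two scheduling strategies can be sketched with a toy simulation. This illustrates iteration-level batching in general, not FriendliAI's engine; the function names and the "slot-iterations" cost metric are invented for the example:

```python
def static_batching(requests, batch_size=4):
    """Static batching: every slot in a batch stays occupied until the
    slowest request in that batch finishes. `requests` is a list of
    decode-step counts; returns total GPU slot-iterations consumed."""
    slot_iters = 0
    for i in range(0, len(requests), batch_size):
        batch = requests[i:i + batch_size]
        # all slots are held for as long as the longest request runs
        slot_iters += max(batch) * len(batch)
    return slot_iters

def continuous_batching(requests, batch_size=4):
    """Continuous (iteration-level) batching: a finished request frees
    its slot immediately, and a queued request takes its place."""
    queue = list(requests)
    slots = []          # remaining decode steps per active request
    slot_iters = 0
    while queue or slots:
        while queue and len(slots) < batch_size:
            slots.append(queue.pop(0))          # refill free slots
        slot_iters += len(slots)                # one decode iteration
        slots = [s - 1 for s in slots if s > 1] # drop finished requests
    return slot_iters
```

With one long request batched alongside three short ones (e.g. `[10, 1, 1, 1]`), static batching consumes 40 slot-iterations while continuous batching consumes 13, since the three short requests release their slots after the first iteration instead of idling behind the long one.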
In addition, FriendliAI recently added support for N-gram speculative decoding. This technique lets an LLM draft candidate tokens by matching n-grams already present in the prompt or earlier output, then verify those drafts in a single model pass, which is more efficient than generating every token from scratch.
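A minimal sketch of the idea, assuming greedy decoding (the helper names and the toy stand-in model are invented for illustration; this is the general n-gram/prompt-lookup scheme, not FriendliAI's implementation):

```python
def ngram_draft(tokens, n=2, max_draft=4):
    """Propose draft tokens by finding an earlier occurrence of the
    context's final n-gram and copying the tokens that followed it."""
    if len(tokens) < n:
        return []
    tail = tokens[-n:]
    # scan earlier context for the same n-gram, most recent match first
    for i in range(len(tokens) - n - 1, -1, -1):
        if tokens[i:i + n] == tail:
            return tokens[i + n:i + n + max_draft]
    return []

def speculative_step(tokens, target_next_token):
    """One accept/verify step: the target model checks the drafted
    tokens, accepts the longest agreeing prefix, and always emits one
    token of its own. `target_next_token(ctx)` stands in for a real
    model's greedy next-token prediction."""
    ctx = list(tokens)
    for drafted in ngram_draft(ctx):
        if target_next_token(ctx) == drafted:  # draft matches the model
            ctx.append(drafted)
        else:
            break
    ctx.append(target_next_token(ctx))  # model's own token past the prefix
    return ctx
```

When the text is repetitive, one verification step can accept several drafted tokens at once, so multiple output tokens cost roughly one model pass instead of one pass each.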
Three Offerings for Different Needs
FriendliAI commercializes its technology through three main products:
- Friendli Container: Enables organizations to run FriendliAI’s software on private GPU clusters.
- Cloud Service for Open-Source Models: Offers inference capabilities without the need for customers to maintain infrastructure.
- Friendli Dedicated Endpoints: Supports custom LLMs and automatically adjusts GPU allocation based on workload demands.
Growing Client Base and Future Plans
According to Crunchbase, FriendliAI currently serves between 25 and 30 large clients, and these customers are expected to help the company grow revenue by as much as 600% this year. Although not yet profitable, FriendliAI maintains strong gross margins.
The new funding will support expanding go-to-market initiatives in North America and Asia. The company also plans to enhance its inference software and acquire additional GPUs for its cloud services.