Featherless.ai, a Singapore-based serverless AI inference platform, has raised $20M in Series A funding co-led by AMD Ventures and Airbus Ventures. The platform provides instant API access to over 30,000 open-weight LLMs through GPU orchestration and model load-balancing. The capital will scale infrastructure, launch a model marketplace, and deepen hardware integrations.
Inference Workloads Dominate AI Spend
The round arrives as inference workloads eclipse training in AI compute demand, with the AI inference market reaching $118B and growing at a 19.2% CAGR. Competitors such as Replicate ($58M raised) focus on pay-per-second custom deployments, while Baseten ($585M raised) emphasizes enterprise orchestration. Together AI ($500M+) and Fireworks AI ($300M+) support training alongside inference. Featherless.ai differentiates itself with its vast catalog and flat-rate, unlimited-token pricing.
Per-Token Costs Spiral Unpredictably
Open-source LLMs proliferate on Hugging Face, yet per-token pricing creates cost volatility: output tokens cost 3.74x as much as input tokens, and long-context models cost 3.1x more, per Asia Business Outlook coverage. Developers and enterprises face infrastructure-management burdens alongside GPU shortages. Current solutions tie users to proprietary ecosystems, limiting hardware choice and data sovereignty.
Serverless Access to 30,000+ Models
Featherless.ai offers a single API for chat, completions, tool calling, vision, and embeddings across 30,000+ models, hosted in US and European data centers for data localization. Its serverless architecture handles load-balancing without user-managed infrastructure, delivering 10x inference cost reductions. Unlike Replicate's smaller catalog or Fireworks AI's speed-focused but costlier scaling, Featherless.ai provides flat-rate subscriptions for predictable budgeting.
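A back-of-the-envelope sketch shows why flat-rate billing is more predictable than per-token billing when output volume fluctuates. The prices and subscription fee below are hypothetical; only the 3.74x output/input price ratio is taken from the figures cited above.

```python
# Illustrative comparison of per-token vs flat-rate billing.
# All dollar amounts are made-up assumptions for the sketch.
INPUT_PRICE = 0.10                  # $ per 1M input tokens (hypothetical)
OUTPUT_PRICE = INPUT_PRICE * 3.74   # output tokens priced at 3.74x input, per the cited ratio
FLAT_RATE = 25.0                    # $ per month, hypothetical subscription

def per_token_bill(input_m: float, output_m: float) -> float:
    """Monthly bill in dollars for token counts given in millions."""
    return input_m * INPUT_PRICE + output_m * OUTPUT_PRICE

# A chattier month triples output tokens and the per-token bill jumps,
# while a flat-rate bill stays constant regardless of usage.
quiet_month = per_token_bill(100, 20)
busy_month = per_token_bill(100, 60)
print(quiet_month, busy_month, FLAT_RATE)
```

The point is not the specific numbers but the variance: under per-token pricing the bill tracks output verbosity, which the caller does not fully control.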
Flat Pricing Unlocks Enterprise Scale
Managed runtimes support open-source AI agents like OpenClaw and Hermes, with OpenAI-compatible APIs ensuring seamless migration. Customers including Meta, YouTube, VMware, and Cisco have helped drive 30% month-over-month ARR growth. The platform's RWKV roots enable efficient non-transformer inference, up to 1000x cheaper for certain models.
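A minimal sketch of what "OpenAI-compatible" migration means in practice: the request keeps the standard OpenAI chat-completions shape, and only the base URL and model ID change. The URLs and model IDs below are illustrative placeholders, not documented endpoints.

```python
# Sketch: an OpenAI-compatible API lets an existing integration migrate
# by swapping the base URL; the payload structure is unchanged.
# URLs, model IDs, and the helper itself are hypothetical.

def build_chat_request(base_url: str, model: str, prompt: str) -> dict:
    """Assemble an OpenAI-style chat-completions request."""
    return {
        "url": f"{base_url}/v1/chat/completions",
        "headers": {"Authorization": "Bearer YOUR_API_KEY"},
        "json": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    }

# Migration = change the base URL and model; everything else stays the same.
before = build_chat_request("https://api.openai.com", "gpt-4o-mini", "Hi")
after = build_chat_request("https://api.example-host.com", "some-org/open-model", "Hi")
assert before["json"]["messages"] == after["json"]["messages"]
```

In practice the official OpenAI SDKs expose a `base_url` parameter for exactly this kind of endpoint swap, which is why compatibility makes migration a one-line change.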
Strategic Backers Signal Hardware Independence
Co-leads AMD Ventures and Airbus Ventures, joined by BMW i Ventures, Kickstart Ventures, Panache Ventures, and Wavemaker Ventures, validate Featherless.ai's push for AI sovereignty. Investors highlight reducing reliance on hyperscalers and promoting hardware diversity via AMD ROCm collaboration per Tech.eu. This follows a $5M seed from Airbus Ventures, signaling conviction in open infrastructure.
As Sagi Paz of AMD Ventures noted:
“Featherless.ai is at the forefront of a critical new phase…”
Open Models Fuel $118B Inference Surge
The AI inference market, already at $118B and growing at a 19.2% CAGR, is driven by open LLMs and agentic AI demand. Featherless.ai positions itself as the largest Hugging Face inference provider, with 6,700+ models deeply integrated. Trends favor serverless platforms amid cost pressures and shifting compute demand.
RWKV Creators Build Neutral Layer
Founders Eugene Cheah (CEO), Harrison Vanderbyl (CTO), and Wesley George (COO) co-led RWKV, a Linux Foundation project for attention-free RNNs optimized for inference. This expertise powers the platform's efficiency; Cheah's background at Recursal AI targeted "transformer killers," per leadership research. The team's RWKV experience speaks directly to the need for scalable open AI.
Post-Funding Infra and Marketplace Push
Featherless.ai plans infrastructure expansion across regions, a marketplace for specialized models, and broader chip support. Active hiring across engineering, DevOps, and business development supports this scale-up, building on hackathon wins and enterprise traction, including millions of inferences served on top models.
