Voice AI startup Vapi this week announced a $50 million Series B funding round, disclosing that its platform has now processed more than 1 billion calls and achieved 10x growth in annual recurring revenue from enterprise customers. The round was led by Peak XV and included participation from M12 (Microsoft's Venture Fund), Kleiner Perkins, and Bessemer Venture Partners, alongside earlier investors. Total funding since the company's founding now stands at $72 million.
The announcement, made on May 12, 2026, from San Francisco, marks one of the larger dedicated voice AI infrastructure raises of the year and arrives as the broader enterprise market for agentic AI tools accelerates sharply. Agentic AI infrastructure has moved from conference agenda to live product across multiple industries, with advertising platforms, legal technology companies, and customer experience vendors all deploying autonomous AI systems in production environments.
The platform and its architecture
Vapi describes itself as an API-native enterprise voice AI platform for building, deploying, and managing voice agents at scale. The technical design is built around low latency as a primary constraint. According to Vapi, the platform delivers responses in under 500 milliseconds, which the company frames as the threshold for natural, real-time conversation.
The platform supports more than 200 built-in model integrations and also allows operators to bring their own API keys for transcription, large language model inference, and text-to-speech - or to connect self-hosted models. This modular approach means teams are not locked to a single model provider. Operators can swap models, voices, and conversation logic independently. Automatic failover and latency balancing across providers are built into the architecture, so a degraded model endpoint does not break a live call.
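The failover behaviour described above can be illustrated with a minimal sketch. This is not Vapi's implementation or API: the `Provider` class, the `call_with_failover` function, and the latency figures are all hypothetical, and the sketch assumes latency estimates are tracked elsewhere. It only shows the general idea of trying the fastest healthy backend first and falling through to the next when an endpoint is degraded.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Provider:
    """A hypothetical speech or LLM backend (illustrative, not a real SDK type)."""
    name: str
    call: Callable[[str], str]  # takes input text, returns a result
    avg_latency_ms: float       # rolling latency estimate, assumed tracked elsewhere


class AllProvidersFailed(Exception):
    """Raised only when every configured provider has errored."""


def call_with_failover(providers: List[Provider], payload: str) -> Tuple[str, str]:
    """Try providers from fastest to slowest; move on when one raises."""
    errors = []
    for provider in sorted(providers, key=lambda p: p.avg_latency_ms):
        try:
            return provider.name, provider.call(payload)
        except Exception as exc:  # a degraded endpoint should not end the call
            errors.append(f"{provider.name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def degraded_tts(text: str) -> str:
    raise TimeoutError("endpoint timed out")


def healthy_tts(text: str) -> str:
    return f"audio({text})"


providers = [
    Provider("primary", degraded_tts, avg_latency_ms=120.0),
    Provider("backup", healthy_tts, avg_latency_ms=180.0),
]
used, result = call_with_failover(providers, "hello")
print(used, result)
```

In this toy run the nominally faster provider times out, so the request is served by the second one and the caller never sees the error.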
Telephony control is handled natively. According to Vapi, the platform supports custom SIP configurations, warm call transfers, voicemail detection, phone number management, and DTMF input - capabilities that typically require deep integration work when built from scratch. The company's stated goal is to remove the need for engineering teams to understand telephony internals, treating the phone layer as infrastructure rather than a problem the developer must solve.
Multi-agent orchestration is also part of the stack. Teams can chain multiple agents together or configure "squads" - groups of specialized agents with intelligent handoffs and routing between them. This becomes relevant in workflows where a single agent cannot handle the full scope of a call, such as a financial services inquiry that begins with identity verification, moves to account lookup, and then escalates to a live specialist.
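The financial services example can be sketched as a small handoff loop. The agent names, the `CallState` fields, and the `run_squad` function below are assumptions made for illustration and do not reflect how Vapi configures squads. Each agent handles its part of the call and names the agent that should take over next.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class CallState:
    verified: bool = False
    account_found: bool = False
    needs_human: bool = False
    trail: List[str] = field(default_factory=list)  # agents that handled the call


# An agent updates the state and returns the next agent's name (None ends the chain).
Agent = Callable[[CallState], Optional[str]]


def verify_identity(state: CallState) -> Optional[str]:
    state.verified = True  # stand-in for a real verification dialogue
    return "account_lookup"


def account_lookup(state: CallState) -> Optional[str]:
    state.account_found = True
    # Hand off when the request is beyond what this agent may resolve itself.
    return "live_specialist" if state.needs_human else None


def live_specialist(state: CallState) -> Optional[str]:
    return None  # a warm transfer to a person would happen here


SQUAD: Dict[str, Agent] = {
    "verify_identity": verify_identity,
    "account_lookup": account_lookup,
    "live_specialist": live_specialist,
}


def run_squad(start: str, state: CallState, max_hops: int = 10) -> CallState:
    current: Optional[str] = start
    for _ in range(max_hops):  # guard against handoff loops
        if current is None:
            break
        state.trail.append(current)
        current = SQUAD[current](state)
    return state


print(run_squad("verify_identity", CallState(needs_human=True)).trail)
```

Keeping routing decisions inside each agent, with a hop limit in the runner, is one simple design; a central router that inspects the state would work equally well.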
Tool calling is supported natively, including integration with external MCP (Model Context Protocol) servers. As MCP has spread across the advertising technology and enterprise software ecosystem throughout 2025 and into 2026, the ability to connect voice agents to MCP-backed data sources adds a layer of interoperability that was not previously standard in voice AI infrastructure.
A/B testing, automated evaluation suites, call-level observability, and hallucination detection before production deployment round out the feature set. According to Vapi, teams can design test suites to identify hallucination risk before an agent goes live, which is particularly important in regulated industries where a factually incorrect response to a customer carries compliance risk.
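A pre-deployment test suite of the kind described above might look roughly like the following. Everything here is hypothetical: `EvalCase`, `run_suite`, and the toy agent are illustrative names, and the substring checks are a crude stand-in for whatever evaluation method a real platform uses. The point is only that each case states what an answer must contain and what it must never claim.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str
    must_include: List[str]      # facts the answer has to state
    must_not_include: List[str]  # claims that would be fabricated or non-compliant


def run_suite(agent: Callable[[str], str], cases: List[EvalCase]) -> List[str]:
    """Return the prompts of failing cases; an empty list means the suite passed."""
    failures = []
    for case in cases:
        answer = agent(case.prompt).lower()
        missing = [s for s in case.must_include if s.lower() not in answer]
        forbidden = [s for s in case.must_not_include if s.lower() in answer]
        if missing or forbidden:
            failures.append(case.prompt)
    return failures


def toy_agent(prompt: str) -> str:
    """Stand-in for a voice agent's text response (hypothetical)."""
    if "fee" in prompt:
        return "The monthly fee is $5 and it is guaranteed never to change."
    return "Your claim has been received and is under review."


cases = [
    EvalCase("What is the monthly fee?", ["$5"], ["guaranteed"]),
    EvalCase("What is my claim status?", ["under review"], ["approved"]),
]
print(run_suite(toy_agent, cases))
```

Here the first case fails because the toy agent adds a promise it was never given, which is the type of response a regulated operator would want to catch before launch.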
Enterprise customers and use cases
According to Vapi, the platform currently has more than 1 million developers using it, with over 2.7 million unique agents created across those accounts. Enterprise customers include Amazon Ring, Kavak, ServiceTitan, New York Life, and Intuit. The strongest reported traction is in financial services, healthcare, insurance, automotive, and workforce management - sectors where phone calls remain a primary customer interaction channel and where the cost of a poor call experience is measurable.
The Amazon Ring deployment is the most detailed case study in the announcement. Amazon Ring uses Vapi to handle inbound customer inquiries about smart home security devices. According to Jason Mitura, Vice President of Software Development at Amazon Ring, "When [Amazon] Ring customers call in, they expect fast, high-quality support. After evaluating dozens of vendors, Vapi stood out. We went from zero to production in two weeks, and 100% of our inbound volume now runs through the Vapi. Most importantly, we've maintained our high bar of support for our customers and CSAT scores have improved. Vapi gives our teams the ability to tune the agent experience without depending on engineering. A lot of AI tools promise great outcomes - Vapi has delivered on them."
The two-week production timeline is notable. Enterprise software deployments in telephony and contact center infrastructure have historically required months of integration work. The claim reflects the company's positioning as a platform that removes infrastructure complexity rather than adding to it.
Kavak, a used car marketplace operating primarily in Mexico and Latin America, represents a different use case. According to Alejandro Maza, Chief Product & AI Officer at Kavak, "We reached profitability in Mexico due to this transformation. We're serving twice the customers we were before - and we did this while focusing on the most important part that was the level of service and the experience that our customers were having."
The company reports that its strongest deployment categories include inbound customer service, outbound collections, candidate screening for workforce management, sales coaching through simulated dialogue, and autonomous IVR navigation - replacing traditional interactive voice response trees with agents that can follow the conversation rather than forcing callers into predetermined branches.
The funding round and investor rationale
The Series B was led by Peak XV, the venture firm formerly known as Sequoia India & Southeast Asia. According to Arnav Sahu, Partner at Peak XV, "Vapi has built a differentiated self-serve product for developers and enterprises in the massive voiceAI revolution. In 10 years, it's likely most calls will not have a human behind the phone. With its bottom-up, PLG approach, we believe Vapi is the next Zapier and n8n for voiceAI workflows. At Peak XV, we are investors in several developer and bottom-up companies like Supabase, PostHog, Better Auth and ClickHouse and believe Vapi has the potential to be the defining platform for voice AI. We are excited to partner with them."
The investor framing - comparing Vapi to Zapier and n8n - is revealing. Both Zapier and n8n are workflow automation platforms that succeeded by offering developer-accessible, modular infrastructure that non-technical users could also operate. The analogy suggests Peak XV views the voice AI market as following a similar adoption path: developers build first, then business teams follow.
M12, Microsoft's venture arm, is among the participants. Microsoft's presence in the round is consistent with a broader pattern of large enterprise software companies seeking exposure to voice AI infrastructure before the market consolidates around a small number of dominant platforms.
The company's origins
Vapi was co-founded by Jordan Dearsley and Nikhil Gupta, who met at the University of Waterloo. According to Vapi, the two spent years building products together, including a Y Combinator-backed calendar application that reached profitability. The voice AI platform began almost by accident. In mid-2023, Dearsley built a voice-based AI therapist for personal use during daily walks, chaining models together and optimizing for latency until he had a working phone-based system. The therapy product did not gain traction commercially, but the underlying infrastructure attracted developer attention. Vapi launched publicly on Product Hunt in March 2024.
According to Jordan Dearsley, CEO and co-founder of Vapi, "Most businesses have spent decades of time and effort, only to make their customer experience worse. The real unlock is building agents for your customers that feel human. Vapi gives teams the platform to deploy voice agents that actually solve problems for customers - millions of them, every day."
Market context: why satisfaction scores haven't improved
Vapi's announcement frames the opportunity around a problem that existing investment has failed to solve. According to Vapi, customer satisfaction scores have dropped by 2% since 2022 and have not meaningfully moved since 2017, despite years of investment in chatbots, automation, and self-service portals. The company cites a projection that nearly $3 trillion in global sales are at risk in 2026 due to bad customer experiences.
The claim echoes a structural critique of how most customer interaction systems were designed. Traditional IVR systems rely on rigid scripts and deterministic routing that cannot adapt to an unexpected customer statement. Chatbots handle text at scale but often fail when the complexity of a request exceeds the bot's training. The phone channel, where customer intent is typically highest and frustration with poor experiences most acute, has seen the least meaningful technical improvement relative to the investment directed at digital channels.
Voice AI as a category sits at the intersection of several converging capabilities - large language model reasoning, low-latency streaming audio, real-time transcription, and text-to-speech synthesis - that have each improved substantially since 2022. Vapi's platform abstracts those components into a single API surface, which helps explain why the developer-first approach has attracted enterprise adoption. Teams that previously needed to build and maintain integrations across five or six separate vendors can now address the problem through a single endpoint.
The interoperability question remains open. The W3C's February 2026 workshop on smart voice agents identified eight unresolved cross-cutting issues including fragmentation, privacy, hallucination control, and accessibility. Proprietary platforms like Vapi can move quickly, but the absence of shared standards for voice agent interoperability creates lock-in risk for enterprise operators - a tension the W3C workshop participants explicitly named.
What comes next
According to Vapi, the next phase of development is focused on governance and predictability rather than capability expansion alone. As voice agents move into higher-stakes workflows, enterprise operators need tighter uptime guarantees, predictable latency under load, and call-level monitoring that treats every conversation as a production workload. The company describes its roadmap in three areas.
The first is deeper reliability - contractual uptime and performance guarantees with reserved capacity sized to a customer's call volume. The second is stronger guardrails that keep agents within defined operational boundaries, with clear escalation paths when a situation requires a human. The third is continuous learning from real conversations to improve resolution rates, task completion, and business outcomes over time - measured at the call level rather than in aggregate.
The governance framing aligns with a broader pattern visible across agentic AI deployments. UK regulators mapped the risks of agentic AI systems in March 2026, noting that autonomous systems operating in customer-facing roles carry specific risks around algorithmic collusion, action bundling, and consumer rights. The Financial Conduct Authority and the Information Commissioner's Office were both among the four UK watchdogs that published that assessment. Vapi's emphasis on escalation paths and guardrails addresses the same concern from the platform side: operators need to be able to define and enforce the boundaries of what a voice agent is permitted to do on its own before it must escalate to a human.
The company is also positioned in a developer market that is growing quickly. PPC Land's coverage of the broader agentic AI infrastructure buildout across advertising platforms, commerce media tools, and enterprise software has consistently found that developer-accessible platforms with modular architectures attract adoption faster than closed systems. Vapi's API-first architecture, combined with its support for MCP-based tool calling and its model-agnostic design, places it within that category.
Multilingual support - covering English, Spanish, Mandarin, and more than 100 other languages according to the company's documentation - extends the addressable market beyond English-speaking enterprise customers. Reporting and analytics dashboards, with export capabilities for tracking performance, usage, and conversation quality across deployments, give operations teams the visibility they need to manage voice agents at scale rather than treating them as black-box systems.
Significance for marketing and advertising professionals
For professionals working in marketing technology, customer acquisition, and campaign attribution, Vapi's announcement matters for several reasons. Customer phone calls remain one of the highest-intent interactions a business receives. Caller behavior - the questions asked, the objections raised, the information provided before a sale - is a rich signal that most marketing stacks cannot capture or act on in real time.
Voice AI platforms that can handle inbound calls at scale, route conversations intelligently, and log structured data about each call create a new layer of first-party data that is directly attributable to specific marketing actions. An outbound voice agent running post-purchase follow-up, a candidate screening agent triggered by a job application form submission, or an inbound support agent handling queries routed from a paid search campaign - all of these generate call-level data that can feed back into campaign optimization and audience segmentation.
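To make the idea of call-level data concrete, here is a minimal sketch of structured call records rolled up by the campaign that drove each call. The `CallRecord` fields, the campaign names, and the `resolution_rate_by_campaign` function are hypothetical and are not taken from Vapi's reporting schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CallRecord:
    call_id: str
    campaign: str   # marketing action that drove the call (assumed field)
    intent: str     # what the caller wanted, as labelled by the agent
    resolved: bool  # whether the agent completed the task


def resolution_rate_by_campaign(records: List[CallRecord]) -> Dict[str, float]:
    """Share of resolved calls per campaign, for feeding back into optimization."""
    totals: Dict[str, int] = defaultdict(int)
    resolved: Dict[str, int] = defaultdict(int)
    for record in records:
        totals[record.campaign] += 1
        resolved[record.campaign] += int(record.resolved)
    return {campaign: resolved[campaign] / totals[campaign] for campaign in totals}


records = [
    CallRecord("c1", "paid_search_brand", "pricing question", True),
    CallRecord("c2", "paid_search_brand", "cancel order", False),
    CallRecord("c3", "post_purchase_followup", "setup help", True),
]
print(resolution_rate_by_campaign(records))
```

The same records could be grouped by intent instead of campaign to build audience segments.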
The OECD's analysis of agentic AI published in February 2026 noted that governance requirements for autonomous systems are accumulating faster than many deployments have anticipated. For marketing teams deploying voice agents in customer-facing roles, the compliance dimension - particularly in financial services and healthcare, which are among Vapi's strongest verticals - is not secondary to performance. It is a precondition.
Timeline
- Mid-2023 - Jordan Dearsley builds an experimental voice AI system, optimizing for low-latency phone conversations
- March 2024 - Vapi launches publicly on Product Hunt
- February 25-27, 2026 - W3C holds its Workshop on Smart Voice Agents, identifying eight unresolved standards gaps in voice agent interoperability, privacy, and accessibility
- March 31, 2026 - UK regulators - including the CMA, FCA, ICO, and Ofcom - publish a joint foresight paper on agentic AI risks and regulatory gaps
- April 13, 2026 - W3C publishes the official report from its voice agents workshop
- April 14, 2026 - Pacvue launches an AI agent promising commerce media workflows up to 200x faster, illustrating the accelerating pace of agentic AI deployments across enterprise platforms
- April 23, 2026 - AdRoll and PubMatic announce an MCP-powered agent-to-agent integration for deal diagnostics, demonstrating cross-platform agentic AI in live production
- May 12, 2026 - Vapi announces $50 million Series B led by Peak XV, with participation from M12, Kleiner Perkins, and Bessemer Venture Partners; total funding reaches $72 million; platform reports more than 1 billion calls processed and over 2.7 million unique agents created
Summary
Who: Vapi, a San Francisco-based voice AI startup co-founded by Jordan Dearsley and Nikhil Gupta, with investors Peak XV, M12 (Microsoft's Venture Fund), Kleiner Perkins, and Bessemer Venture Partners. Enterprise customers include Amazon Ring, Kavak, ServiceTitan, New York Life, and Intuit.
What: A $50 million Series B funding round, bringing total funding to $72 million. The company simultaneously disclosed that its platform has processed over 1 billion calls, now has more than 1 million developers, and has seen 10x growth in enterprise ARR. The platform provides an API-native infrastructure layer for building, deploying, and managing voice AI agents with support for 200+ model integrations, multi-agent orchestration, MCP tool calling, custom telephony, and multilingual support across 100+ languages.
When: Announced on May 12, 2026. The company launched publicly on Product Hunt in March 2024. The platform's origins date to mid-2023.
Where: Vapi is headquartered in San Francisco, California. Its platform is used globally, with deployments including Amazon Ring in the United States, Kavak in Mexico and Latin America, and enterprise customers across financial services, healthcare, insurance, automotive, and workforce management sectors.
Why: Customer satisfaction scores have declined by 2% since 2022 despite sustained investment in chatbots and automation. Vapi's argument is that rigid, deterministic phone systems cannot adapt to real conversations - and that voice AI infrastructure capable of sub-500ms latency, model-agnostic configuration, and production-grade reliability can finally close that gap at enterprise scale. The funding will support development of governance capabilities including stricter uptime guarantees, stronger agent guardrails, and call-level monitoring as the platform moves into higher-stakes workflows.