DeepInfra closes $107M Series B to expand global AI inference cloud

May 14, 2026 | Stephen Mayhew

Categories Digital Infrastructure News | Technology & Architecture

DeepInfra, a cloud platform for AI inference, has raised $107 million in Series B funding.

The investment will be used to expand DeepInfra’s global capacity as AI demand shifts from research prototyping toward production-scale inference.

“When we launched nearly four years ago, we believed inference would become the dominant driver of enterprise AI workloads and we are now at this inflection point,” says Nikola Borisov, co-founder and CEO, DeepInfra. “What’s happening now is incredibly exciting – open-source models are rapidly reaching parity with proprietary systems, unlocking a new wave of innovation at a fraction of the cost and enabling widespread adoption. At the same time, agent-based systems are driving continuous, high-volume demand. Inference is no longer a thin layer – it’s the system constraint that will define the majority of workloads. Most cloud platforms weren’t built for this always-on, distributed model, so we built DeepInfra from the ground up to deliver better economics, performance, and security.”

DeepInfra processes nearly 5 trillion tokens every week across agent-driven and open-source-model AI workloads.

According to the company, the platform is purpose-built for the agentic era to deliver better economics, performance and security for high-throughput AI workloads. DeepInfra is also an early partner of NVIDIA in the open AI ecosystem.

The investment round was co-led by 500 Global and Georges Harik. DeepInfra aims to be a leading infrastructure provider for the next phase of AI development.

As inference becomes the dominant compute workload, regional and distributed AI infrastructure is gaining importance, and a new generation of specialized “neocloud” providers is emerging to serve GPU-intensive AI applications more efficiently than legacy cloud architectures. DeepInfra’s focus on economics, performance, and distributed deployment aligns closely with the market trends around AI factories, inference-at-scale, and decentralized GPU infrastructure noted in a recent report by Structure Research.

Article Topics

AI infrastructure | cloud infrastructure | DeepInfra | GPUs
