Edge Infrastructure Review

DeepInfra closes $107M Series B to expand global AI inference cloud

DeepInfra, a cloud platform for AI inference, has raised $107 million in Series B funding.

The investment will be used to expand DeepInfra's global capacity as AI demand shifts from research prototyping to production-scale inference.

“When we launched nearly four years ago, we believed inference would become the dominant driver of enterprise AI workloads and we are now at this inflection point,” says Nikola Borisov, co-founder and CEO, DeepInfra. “What’s happening now is incredibly exciting – open-source models are rapidly reaching parity with proprietary systems, unlocking a new wave of innovation at a fraction of the cost and enabling widespread adoption. At the same time, agent-based systems are driving continuous, high-volume demand. Inference is no longer a thin layer – it’s the system constraint that will define the majority of workloads. Most cloud platforms weren’t built for this always-on, distributed model, so we built DeepInfra from the ground up to deliver better economics, performance, and security.”

DeepInfra runs agent-based and open-source-model AI workloads, processing nearly 5 trillion tokens every week.

The company says the platform is purpose-built for the agentic era, designed to deliver better economics, performance, and security for high-throughput AI workloads. DeepInfra is also an early partner of NVIDIA in the open AI ecosystem.

The investment round was co-led by 500 Global and Georges Harik. DeepInfra aims to be a leading infrastructure provider for the next phase of AI development.

As inference becomes the dominant compute workload, regional and distributed AI infrastructure is gaining importance, and a new generation of specialized “neocloud” providers is emerging to serve GPU-intensive AI applications more efficiently than legacy cloud architectures. DeepInfra’s focus on economics, performance, and distributed deployment aligns closely with current market trends around AI factories, inference-at-scale, and decentralized GPU infrastructure, as noted in a recent report by Structure Research.
