...

Nebius Launches Nebius Token Factory to Deliver Production AI Inference at Scale

Nebius Token Factory

The Amsterdam-based company Nebius today launched its new platform, Nebius Token Factory, to enable production-grade AI inference at scale. In this announcement, Nebius emphasises that the solution addresses enterprise AI infrastructure needs and open-source model deployment. AI inference is moving fast, and enterprises need reliable, cost-efficient systems.

Meanwhile, Token Factory brings together post-training, fine-tuning, and high-performance inference endpoints in a single governed platform. The platform supports major open-source models such as NVIDIA Nemotron, DeepSeek, GPT-OSS by OpenAI, Llama, Qwen, and more. In addition, it allows customers to host their own models. 

Moreover, open-source models unlock innovation and cost efficiency, but managing them in production has been complex. Nebius Token Factory addresses that by delivering sub-second latency, autoscaling throughput, and 99.9% uptime even for hundreds of millions of inference requests per minute. 

According to Roman Chernin, Co-founder and Chief Business Officer at Nebius, “We built Nebius Token Factory to help customers solve real challenges and engineer for scale.” 

Enterprise Use Cases and Infrastructure Highlights

Early adopters include Prosus, which reported up to 26× cost reductions compared to proprietary models. The company handled up to 200 billion tokens per day using dedicated endpoints with guaranteed performance. 

Then there is Higgsfield AI, a video-AI platform, which chose Nebius because it simplified management, reduced overhead, and delivered faster inference at scale. 

In addition, Nebius has partnered with Hugging Face to improve developer access and scalability for open-model inference. 

From a technical standpoint, Nebius Token Factory is built atop Nebius’s full-stack AI Cloud 3.0 “Aether”. The infrastructure supports enterprise-grade security certifications such as SOC 2 Type II, HIPAA, ISO 27001, and ISO 27799. 

Furthermore, the platform features: dedicated endpoints, zero-retention inference in selected datacenters, support for 40+ open-source models, seamless one-click deployment, SSO, team access management, unified billing, and OpenAI-compatible APIs. 

In short, Nebius Token Factory positions Nebius as a key player in the enterprise AI deployment market. As companies move from testing to building AI infrastructure, tools like Token Factory provide the efficiency, governance, and scale they require.

Therefore, companies seeking a production-ready AI inference platform should evaluate Nebius Token Factory now. 

Explore IT Tech News for the latest advancements in Information Technology & insightful updates from industry experts!

News Source: Businesswire.com

Share with friends

Latest News