Cloud computing provider Vultr today launched a new serverless Inference-as-a-Service platform with AI model deployment and inference capabilities.

Vultr Cloud Inference offers customers scalability, reduced latency, and cost efficiencies, according to the company's announcement.

For the uninitiated, AI inference is the process of using a trained AI model to make predictions on new data. When an AI model is trained, it learns patterns and relationships that allow it to generalize to data it has never seen. Inference is when the model applies that learned knowledge, enabling organizations to make personalized, data-driven decisions from those predictions, as well as to generate text and images.
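The training/inference distinction above can be sketched in a few lines of plain Python. This is a toy linear model for illustration only; it is not specific to Vultr's platform or to any particular AI framework.

```python
# Toy illustration of training vs. inference:
# fit a one-variable linear model, then use it to predict on new data.

def train(xs, ys):
    """Training phase: learn slope and intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
            sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return slope, intercept  # the "trained model" is just these parameters

def infer(model, new_x):
    """Inference phase: apply the learned parameters to unseen input."""
    slope, intercept = model
    return slope * new_x + intercept

# Training: the model learns the pattern y = 2x from example data.
model = train([1, 2, 3, 4], [2, 4, 6, 8])

# Inference: predict for an input the model never saw during training.
prediction = infer(model, 10)  # -> 20.0
```

In production systems the "model" is of course far larger (e.g. a neural network with billions of parameters), but the shape is the same: an expensive, one-time training step produces parameters, and inference repeatedly applies those fixed parameters to fresh inputs, which is the workload a serverless inference platform is built to serve.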

The pace of innovation and the rapidly evolving digital landscape have challenged businesses worldwide to deploy and manage AI models efficiently. Organizations struggle with complex infrastructure management and the need for seamless, scalable deployment across different geographies. This has left AI product managers and CTOs in constant search of solutions that simplify the deployment process.

“With Vultr Cloud Inference … we have designed a pivotal solution to these challenges, offering a global, self-optimizing platform for the deployment and serving of AI models,” Kevin Cochrane, chief marketing officer at Vultr, told SD Times. “In essence, Vultr Cloud Inference provides a technological foundation that empowers organizations to deploy AI models globally, ensuring low-latency access and consistent user experiences worldwide, thereby transforming the way businesses innovate and scale with AI.”

This is important for organizations that need to optimize AI models for different regions while maintaining high availability and low latency across distributed server infrastructure. With Vultr Cloud Inference, users can have their own models, regardless of the platforms they were trained on, integrated and deployed on Vultr's infrastructure, powered by NVIDIA GPUs.

According to Vultr’s Cochrane, “This means that AI models are served intelligently on the most optimized NVIDIA hardware available, ensuring peak performance without the hassle of manual scale. With a serverless architecture, businesses can concentrate on innovation and creating value through their AI initiatives rather than focusing on infrastructure management.” 

Vultr’s infrastructure is global, spanning six continents and 32 locations, and, according to the company’s announcement, Vultr Cloud Inference “ensures that businesses can comply with local data sovereignty, data residency and privacy regulations by deploying their AI applications in regions that align with legal requirements and business objectives.”