Monster API today announces the world’s first GPT-based deployment agent, MonsterGPT, which simplifies and speeds up the fine-tuning and deployment of open source generative AI models. It cuts implementation time from what could take a full day down to 10 minutes and significantly reduces the engineering resources required.
With simple commands like “fine tune Llama 3,” developers can use Monster API’s chat interface to fine-tune and deploy a model without having to deal with GPUs, ML environments, Kubernetes, and other infrastructure.
To customize and run AI models, developers frequently face the challenge of adjusting and controlling as many as 30 variables. This involves not only mastering the nuances of the latest machine learning optimization frameworks but also navigating the complexities of the underlying infrastructure, such as GPU cloud setups, containerization, and Kubernetes.
If any one of these variables does not behave as expected, the whole deployment process can fail. Startups commonly allocate four to 10 engineers to such projects. With MonsterGPT, this requirement can be scaled down to just one or two engineers.
Saurabh Vij, CEO of Monster API, explained, “For the first time, we’re offering a solution based on an agent-driven approach, for Generative AI. The ease and speed of this process is like flying in a Mach 4 supersonic jet from New York to London in 90 minutes. At the end of this blazing fast process, MonsterGPT provides developers with an API endpoint for their custom fine-tuned models.”
Said Vij, “As Vinod Khosla, the top VC investor, said recently, ‘There will be a billion+ programmers in the future, all programming in ‘human language.’ Computers will adapt to humans, not humans to computers.’ This quote represents what Monster API’s new technology is enabling: all our research and design is driven to accelerate towards this future faster.”
How Monster API’s Approach Mirrors Past Technology Advances
Throughout history, powerful interfaces have acted as portals, enabling rapid innovation by putting accessible, user-friendly tools in more people's hands. For example, the first Macintosh computer revolutionized personal computing in the 1980s, while the Mosaic browser democratized access to the web in the 1990s.
Vij shared, “In today’s AI ecosystem, the open source versus closed source battle mirrors the classic Android versus iPhone rivalry. Just as Android offers a flexible alternative to Apple’s tightly controlled ecosystem, there’s a concerted effort to enhance open source AI models to rival proprietary giants like OpenAI’s GPT-4.”
“Furthermore, the Android vs. iPhone battle has proven that the open source can match and beat the closed source systems,” Vij continued. “Similarly, Monster API believes that the open source models like Llama, Mistral and many others will soon surpass benchmarks set by GPT-4 and other proprietary leaders. This requires easier, faster, more affordable fine-tuning and inference solutions deeply integrated with state of the art quantization methods and algorithms like PagedAttention for boosting the throughput of models.”
“With MonsterGPT, we hope to trigger/initiate a similar portal opening for over 30 million developers who cannot participate in generative AI today because of the inherent complex infrastructure challenges,” Vij added. “By leveraging familiar chat-driven interfaces, we are aligning with the natural evolution of user experience.”
Behind the simple-to-use chat interface, the technology builds on some of the most advanced and powerful frameworks available, such as QLoRA for fine-tuning and vLLM for deployment, which together deliver massive gains in efficiency.
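As a rough illustration of where such efficiency gains can come from, the sketch below counts trainable parameters for a single weight matrix under full fine-tuning versus a LoRA-style low-rank adapter, the technique QLoRA builds on (QLoRA additionally keeps the frozen base weights in 4-bit precision). The matrix size and rank are illustrative assumptions, not MonsterGPT settings, and the helper names are hypothetical.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_in * d_out


def lora_adapter_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a rank-`rank` adapter: B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    # Assumed, illustrative values: one 4096 x 4096 projection matrix, adapter rank 16.
    d_in, d_out, rank = 4096, 4096, 16

    full = full_finetune_params(d_in, d_out)       # 16,777,216
    lora = lora_adapter_params(d_in, d_out, rank)  # 131,072

    print(f"full fine-tuning: {full:,} trainable parameters")
    print(f"low-rank adapter: {lora:,} trainable parameters")
    print(f"adapter share:    {lora / full:.2%}")  # 0.78%
```

Under these assumptions the adapter trains less than 1% of the parameters in that matrix, which is the kind of saving that lets fine-tuning fit on far smaller GPU budgets.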
Advantages of the Monster API Agent-Driven Approach vs. a Code-Oriented Process
- A unified interface for the full development cycle: From tuning to deployment.
- Great flexibility: Use commands like ‘terminate’ and ‘deploy’ to summon the Agent, and manage projects from your smartphone on the go.
- Significantly easier and faster than a code-oriented approach.
- No need to learn different cloud setups and configurations.
- A use-case-driven rather than UI-driven workflow: Instead of requiring developers to set up models manually in a UI, MonsterGPT automatically suggests and deploys the right model for tasks like sentiment analysis or code generation.
These capabilities are already helping customers save precious developer time. Here’s a quote from one of our early design partners and customers:
“Using MonsterAPI to quickly spin up API endpoints has been game-changing for Sanas and a few of our portfolio companies at Carya,” said Sharath Keshava Narayana, co-founder and COO at Sanas. “Saving developer time by not having to worry about cloud config and scaling has been an unlock for our MLOps team, and we can manage the jobs and consumption easily so we do not have to worry about sudden huge AWS bills.”
Vij added, “A developer should just focus on innovation vs. the grunt work they are forced to do today that not just wastes their time but causes massive frustration.”