After unveiling its new multimodal AI model, Gemini, last week, Google is making several announcements today to help developers build with it.
When it first announced Gemini, Google said the model would come in three versions, each tailored to a different size or complexity requirement. From largest to smallest, they are Ultra, Pro, and Nano. Gemini Nano is already in use on Android in the Pixel 8 Pro, and Google Bard already uses a specialized version of Gemini Pro.
Today, Google is announcing that developers can use Gemini Pro through the Gemini API. Initial features that developers can leverage include function calling, embeddings, semantic retrieval, custom knowledge grounding, and chat functionality, the company explained.
There are two main ways to work with Gemini Pro: Google AI Studio and Vertex AI on Google Cloud. Google AI Studio is a web-based developer tool designed to be easy to get started with. It has a free quota of up to 60 requests per minute and offers quickstart templates.
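For developers planning around the 60-requests-per-minute free quota, a client-side throttle can help avoid hitting the limit. The following is a minimal sketch, not part of any Google SDK: the class name `SlidingWindowLimiter` and its methods are hypothetical, and it assumes the quota behaves like a sliding 60-second window, which the announcement does not specify.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side throttle (not a Google API).

    Assumes the quota is a sliding window: at most `max_requests`
    calls in any `window_seconds` span. How the service actually
    enforces its limit is not stated in the announcement.
    """

    def __init__(self, max_requests=60, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._timestamps = deque()  # times of requests still inside the window

    def wait_time(self):
        """Return seconds to wait before the next request is allowed (0.0 if none)."""
        now = self.clock()
        # Drop requests that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            return 0.0
        # The window is full: wait until the oldest request expires.
        return self.window_seconds - (now - self._timestamps[0])

    def try_acquire(self):
        """Record a request and return True if it fits the quota, else return False."""
        if self.wait_time() > 0:
            return False
        self._timestamps.append(self.clock())
        return True
```

A caller would check `try_acquire()` before each request and, when it returns `False`, sleep for `wait_time()` seconds before retrying.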
Vertex AI on Google Cloud is a machine learning platform that Google positions as a step up from Google AI Studio in terms of complexity. There, developers can fully customize Gemini and access benefits like full data control and integration with other Google Cloud features that support security, safety, privacy, governance, and compliance.
Gemini is currently free to use in Vertex AI, at the same rate limit as the free quota of Google AI Studio, until it reaches general availability next year. Once it is generally available, inputs will cost $0.00025 per 1,000 characters and $0.0025 per image.
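To put those input prices in perspective, the arithmetic can be sketched in a few lines of Python. This is an illustrative estimate only: the function name `estimate_input_cost` is hypothetical, it assumes text is billed pro rata per character with no rounding to whole blocks of 1,000, and it covers input charges only.

```python
from decimal import Decimal

# Input prices quoted for Gemini in Vertex AI at general availability.
PRICE_PER_1K_INPUT_CHARS = Decimal("0.00025")  # USD per 1,000 input characters
PRICE_PER_IMAGE = Decimal("0.0025")            # USD per input image


def estimate_input_cost(num_chars, num_images=0):
    """Estimate input cost in USD.

    Assumption: characters are billed pro rata (no rounding up to the
    next 1,000), which the announcement does not confirm.
    """
    if num_chars < 0 or num_images < 0:
        raise ValueError("counts must be non-negative")
    text_cost = PRICE_PER_1K_INPUT_CHARS * Decimal(num_chars) / Decimal(1000)
    image_cost = PRICE_PER_IMAGE * Decimal(num_images)
    return text_cost + image_cost


# Example: 2 million input characters plus 100 images.
# Text: 2,000 blocks x $0.00025 = $0.50; images: 100 x $0.0025 = $0.25.
print(estimate_input_cost(2_000_000, 100))
```

Under these assumptions, 2 million input characters and 100 images would come to $0.75 in total.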
According to Google, some of the more complex capabilities enabled by working in Vertex AI include the ability to augment Gemini with company data and build search and conversational agents in a low-code environment.
Currently, Gemini Pro accepts text as input and outputs text. For developers who want to experiment with images, there is a dedicated Gemini Pro Vision endpoint that accepts images along with text as input and outputs text.
Looking ahead, developers can expect Google to launch Gemini Ultra, a larger model suited for complex tasks, early next year. The company is also working to bring Gemini to the Chrome and Firebase developer platforms.
The company also announced today the release of Imagen 2, the next generation of Google’s image-generation model. It is now available to all Vertex AI customers on Google’s allowlist.
Imagen 2 enables the creation of “high-quality, photorealistic, high-resolution, aesthetically pleasing” images using natural language prompts. New features in this iteration include text rendering to create text overlays on images, logo generation, and visual question answering for caption generation.