Google has announced that developers now have access to a 2 million token context window for Gemini 1.5 Pro. For comparison, GPT-4o has a 128k token context window.

This context window length was first announced at Google I/O, where it was accessible only through a waitlist, but it is now available to everyone.

Longer context windows can lead to higher costs, so Google also announced support for context caching in the Gemini API for Gemini 1.5 Pro and 1.5 Flash. This allows context to be stored for use in later queries, which reduces costs for tasks that reuse tokens across prompts. 
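To see why caching helps when tokens are reused, here is a rough back-of-the-envelope sketch in Python. The billing model and every number in it are assumptions made up for illustration, not Google's actual pricing: it supposes that cached tokens are billed at a lower per-token rate than regular input tokens and that keeping the cache alive carries a flat storage fee.

```python
# Hypothetical cost model for context caching. All prices are invented
# illustrative units per token, NOT real Gemini API pricing.

def cost_without_cache(shared_tokens, question_tokens, num_queries, input_price):
    """Every query resends the full shared context plus its own question."""
    return num_queries * (shared_tokens + question_tokens) * input_price


def cost_with_cache(shared_tokens, question_tokens, num_queries,
                    input_price, cached_price, storage_cost):
    """The shared context is sent once at full price, then reused at a cached rate."""
    first_write = shared_tokens * input_price
    per_query = shared_tokens * cached_price + question_tokens * input_price
    return first_write + storage_cost + num_queries * per_query


# Assumed scenario: a 1,000,000-token document and 500-token questions.
SHARED, QUESTION = 1_000_000, 500
INPUT_PRICE, CACHED_PRICE, STORAGE = 4, 1, 1_000_000  # hypothetical units

ten_plain = cost_without_cache(SHARED, QUESTION, 10, INPUT_PRICE)
ten_cached = cost_with_cache(SHARED, QUESTION, 10, INPUT_PRICE, CACHED_PRICE, STORAGE)
one_plain = cost_without_cache(SHARED, QUESTION, 1, INPUT_PRICE)
one_cached = cost_with_cache(SHARED, QUESTION, 1, INPUT_PRICE, CACHED_PRICE, STORAGE)

print(ten_plain, ten_cached)  # 40020000 15020000
print(one_plain, one_cached)  # 4002000 6002000
```

Under these assumed numbers, caching cuts the bill by more than half across ten queries but costs more for a single query, which is why the savings apply to workloads that reuse the same context repeatedly.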

Additionally, Google has announced that code execution is now enabled for both Gemini 1.5 Pro and 1.5 Flash. This feature allows the model to generate and run Python code and then iterate on it until the desired result is achieved.

According to Google, the execution sandbox isn’t connected to the internet, comes with a few numerical libraries pre-installed, and bills developers based on the output tokens from the model.
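The generate, run, and iterate loop described above can be pictured with a toy sketch in Python. This is not Google's implementation and does not call the Gemini API: the `CANDIDATES` list is a hypothetical stand-in for successive model outputs, the helper names are invented, and the sketch assumes the desired result is known in advance so that success can be checked.

```python
# Toy generate-run-iterate loop. CANDIDATES stands in for code a model
# might produce on successive attempts; nothing here talks to a real model.

def run_candidate(source, func_name, arg):
    """Execute candidate source in a fresh namespace and call one function."""
    namespace = {}
    try:
        exec(source, namespace)  # NOTE: exec is not a sandbox
        return namespace[func_name](arg), None
    except Exception as exc:
        return None, f"{type(exc).__name__}: {exc}"


def iterate_until_correct(candidates, func_name, arg, expected):
    """Try candidates in order and stop at the first matching result."""
    history = []
    for attempt, source in enumerate(candidates, start=1):
        result, error = run_candidate(source, func_name, arg)
        history.append((attempt, result, error))
        if error is None and result == expected:
            return attempt, result, history
    return None, None, history


CANDIDATES = [
    "def total(n):\n    return sum(range(n)) / 0\n",  # crashes at runtime
    "def total(n):\n    return sum(range(n))\n",      # runs, but off by one
    "def total(n):\n    return sum(range(n + 1))\n",  # correct
]

attempt, result, history = iterate_until_correct(CANDIDATES, "total", 10, 55)
print(attempt, result)  # 3 55
```

The first candidate raises an error, the second returns 45, and the third returns the expected 55, so the loop stops on the third attempt. Unlike the isolated sandbox Google describes, Python's `exec` offers no isolation, so this pattern should only be run on trusted code.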

Finally, Gemma 2 is now available in Google AI Studio, and tuning for Gemini 1.5 Flash will be available via the Gemini API or Google AI Studio sometime next month.

