Google has announced that it is extending the Gemma family of AI models with two new variants: one for code generation and one designed for more efficient inference.

For code generation, it is releasing CodeGemma, which provides intelligent code completion and generation. It is capable of producing entire blocks of code at a time, Google claims. 

According to Google, CodeGemma was trained on 500 billion tokens from web documents, mathematics, and code, and can be used with multiple popular programming languages. 

CodeGemma itself comes in several variants: a 7B pretrained version that specializes in code completion and generation, a 7B instruction-tuned version suited to code chat and instruction following, and a 2B pretrained variant for fast code completion on local devices.
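Code-completion models of this kind are typically prompted in a fill-in-the-middle (FIM) style, where the code before and after the cursor is wrapped in special sentinel tokens and the model generates the missing middle. As a minimal sketch, assuming the sentinel token names published in CodeGemma's public model card, a FIM prompt could be assembled like this:

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a prefix-suffix-middle (PSM) fill-in-the-middle prompt.

    The sentinel token names below are taken from CodeGemma's public
    model card; the model is expected to generate the code that belongs
    between the prefix and the suffix.
    """
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"


# Example: ask the model to complete the body of a function,
# given the code before the cursor and the code after it.
prompt = build_fim_prompt(
    prefix="def reverse(s):\n    return ",
    suffix="\n\nprint(reverse('abc'))\n",
)
print(prompt)
```

The resulting string would be passed to the tokenizer and model as-is; the model's output is the completion that fits between the two code fragments.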

The second model, RecurrentGemma, is designed to improve inference efficiency at higher batch sizes, which is useful for researchers.

It offers lower memory requirements, allowing it to generate samples on devices with limited memory. That smaller memory footprint also lets it handle larger batch sizes, producing more tokens per second. 
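The memory advantage comes from the shape of the model's per-sequence state: a standard transformer's key-value cache grows linearly with the number of tokens generated, while a recurrent model carries a fixed-size state regardless of sequence length. The arithmetic below illustrates that difference with hypothetical dimensions chosen for illustration, not RecurrentGemma's actual configuration:

```python
def kv_cache_bytes(seq_len, layers=26, kv_heads=8, head_dim=256, bytes_per=2):
    # A transformer's KV cache grows linearly with sequence length:
    # two tensors (K and V) per layer, each seq_len x kv_heads x head_dim.
    return 2 * layers * seq_len * kv_heads * head_dim * bytes_per


def recurrent_state_bytes(layers=26, state_dim=2560, bytes_per=2):
    # A recurrent model keeps a fixed-size state per layer,
    # independent of how many tokens have been processed.
    return layers * state_dim * bytes_per


for seq_len in (1_024, 8_192, 65_536):
    print(
        f"{seq_len:>6} tokens: "
        f"KV cache {kv_cache_bytes(seq_len) / 2**20:8.1f} MiB vs "
        f"recurrent state {recurrent_state_bytes() / 2**20:.2f} MiB"
    )
```

Because per-sequence memory stays constant, more sequences fit in the same device memory, which is what allows the larger batch sizes and higher throughput described above.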

The two models are now available to try out on Kaggle, Hugging Face, and Vertex AI Model Garden.