Google has announced three new products that are part of the Gemma 2 family, a series of open AI models that were introduced in June. The new offerings include Gemma 2 2B, ShieldGemma, and Gemma Scope.
Gemma 2 2B is a 2-billion-parameter option, joining the existing 9-billion- and 27-billion-parameter sizes. According to Google, the new size balances performance with efficiency and can outperform other models in its category, including all GPT-3.5 models.
It is optimized with the NVIDIA TensorRT-LLM library and is available as an NVIDIA NIM, making it ideal for a variety of deployment types, such as data centers, cloud, local workstations, PCs, and edge devices. Gemma 2 2B also integrates with Keras, JAX, Hugging Face, NVIDIA NeMo, Ollama, and Gemma.cpp, and will soon integrate with MediaPipe as well.
Because of its small size, it can also run on the T4 GPUs in Google Colab’s free tier, which Google believes will make “experimentation and development easier than ever.”
It is available now via Kaggle, Hugging Face, or Vertex AI Model Garden, and can be used within Google AI Studio.
Next, ShieldGemma is a series of safety classifiers for detecting harmful content in model inputs and outputs. It specifically targets hate speech, harassment, sexually explicit content, and dangerous content. The ShieldGemma models are open and designed to enable collaboration and transparency in the AI development community, and add to the existing suite of safety classifiers in the company’s Responsible AI Toolkit.
It is available in different model sizes to meet different needs. For example, the 2B model is ideal for online classification, whereas the 9B and 27B can provide better performance for offline scenarios where latency isn’t a concern. According to Google, all model sizes use NVIDIA speed optimizations to improve performance.
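To illustrate what checking both model inputs and outputs can look like in practice, here is a minimal sketch in Python. It does not use the ShieldGemma API: `guarded_generate`, `stub_score`, `stub_generate`, and the 0.5 threshold are all hypothetical names and values chosen for illustration, and the stub scorer simply stands in for a real safety classifier.

```python
from typing import Callable

# The four harm types the article says ShieldGemma targets.
HARM_TYPES = (
    "hate speech",
    "harassment",
    "sexually explicit content",
    "dangerous content",
)


def guarded_generate(
    prompt: str,
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    threshold: float = 0.5,  # assumed cutoff, not a documented default
) -> str:
    """Screen the prompt, call the model, then screen the response."""
    if any(score(prompt, harm) >= threshold for harm in HARM_TYPES):
        return "[input blocked]"
    response = generate(prompt)
    if any(score(response, harm) >= threshold for harm in HARM_TYPES):
        return "[output blocked]"
    return response


def stub_score(text: str, harm_type: str) -> float:
    """Hypothetical stand-in for a classifier: flags text containing 'UNSAFE'."""
    return 1.0 if "UNSAFE" in text else 0.0


def stub_generate(prompt: str) -> str:
    """Hypothetical stand-in for a language model: echoes the prompt."""
    return "echo: " + prompt
```

In a setup like this, the scoring function runs on every request, which is why a smaller classifier is the more natural fit for online use, while larger ones can be reserved for offline batches where latency matters less.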
Finally, Gemma Scope gives researchers more transparency into how Gemma 2 models reach their decisions, helping them understand how Gemma 2 identifies patterns, processes information, and makes predictions. It uses sparse autoencoders (SAEs) to look at specific points in the model and “unpack the dense, complex information processed by Gemma 2, expanding it into a form that’s easier to analyze and understand,” Google explained in a blog post.
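For readers unfamiliar with sparse autoencoders, the following is a minimal sketch of the general idea in plain Python. It is a textbook-style ReLU autoencoder with an L1 sparsity penalty, not necessarily the variant Gemma Scope uses; the class name, dimensions, random weights, and penalty coefficient are all assumptions made for illustration. The encoder expands a small activation vector into a larger set of features, most of which are zero, and the decoder tries to rebuild the original vector from them.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SparseAutoencoder:
    """Toy SAE: expands d_model activations into d_features sparse features."""

    def __init__(self, d_model, d_features, seed=0):
        rng = random.Random(seed)
        # Untrained random weights; a real SAE learns these from model activations.
        self.w_enc = [[rng.gauss(0, 1) for _ in range(d_model)]
                      for _ in range(d_features)]
        # A negative encoder bias pushes many features to exactly zero.
        self.b_enc = [-1.0] * d_features
        self.w_dec = [[rng.gauss(0, 1) for _ in range(d_features)]
                      for _ in range(d_model)]
        self.b_dec = [0.0] * d_model

    def encode(self, x):
        pre = [a + b for a, b in zip(matvec(self.w_enc, x), self.b_enc)]
        return [max(0.0, v) for v in pre]  # ReLU keeps only positive features

    def decode(self, features):
        return [a + b for a, b in zip(matvec(self.w_dec, features), self.b_dec)]


def sae_loss(sae, x, l1_coeff=1e-3):
    """Reconstruction error plus an L1 penalty that rewards sparse features."""
    features = sae.encode(x)
    x_hat = sae.decode(features)
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    sparsity = sum(abs(v) for v in features)
    return reconstruction + l1_coeff * sparsity


sae = SparseAutoencoder(d_model=4, d_features=16)
features = sae.encode([0.5, -1.0, 0.25, 2.0])
active = sum(1 for v in features if v > 0)  # number of features that fired
```

Training would minimize `sae_loss` over many activation vectors so that each active feature tends to correspond to one recognizable pattern, which is what makes the expanded form easier to inspect than the dense original.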
“These releases represent our ongoing commitment to providing the AI community with the tools and resources needed to build a future where AI benefits everyone. We believe that open access, transparency, and collaboration are essential for developing safe and beneficial AI,” Google wrote.