trisfromgoogle (Tris Warkentin)

Posts 2

Post

1832

Very excited to share the first two official Gemma variants from Google! Today at Google Cloud Next, we announced cutting-edge models for code and research!

First, google/codegemma-release-66152ac7b683e2667abdee11 - a new set of code-focused Gemma models at 2B and 7B, in both pretrained and instruction-tuned variants. These exhibit outstanding performance on academic benchmarks and (in my experience) real-life usage. Read more in the excellent HuggingFace blog: https://huggingface.co/blog/codegemma

Second, ( google/recurrentgemma-release-66152cbdd2d6619cb1665b7a), which is based on the outstanding Google DeepMind research in Griffin: https://arxiv.org/abs/2402.19427. RecurrentGemma is a research variant that enables higher throughput and vastly improved memory usage. We are excited about new architectures, especially in the lightweight Gemma sizes, where innovations like RecurrentGemma can scale modern AI to many more use cases.

For details on the launches of these models, check out our launch blog -- and please do not hesitate to send us feedback. We are excited to see what you build with CodeGemma and RecurrentGemma!

Huge thanks to the Hugging Face team for helping ensure that these models work flawlessly in the Hugging Face ecosystem at launch!

Post

I am thrilled to announce Gemma, new 2B and 7B models from Google, based on the same research and technology used to train the Gemini models! These models achieve state-of-the-art performance for their size, and are launched across Transformers, Google Cloud, and many other surfaces worldwide starting today.

Get started using and adapting Gemma in the model Collection: google/gemma-release-65d5efbccdbb8c4202ec078b

These launches are the product of an outstanding collaboration between the Google DeepMind and Hugging Face teams over the last few months -- very proud of the work both teams have done, from integration with Vertex AI to optimization across the stack. Read more about the partnership in the main launch by @philschmid @osanseviero @pcuenq on the launch blog: https://huggingface.co/blog/gemma

More information below if you are curious about training details, eval results, and safety characteristics!

Gemma Tech Report: https://goo.gle/GemmaReport
Launch announcement: https://blog.google/technology/developers/gemma-open-models/

Papers 4

models

None public yet

datasets

None public yet

Tris Warkentin

AI & ML interests

Organizations

Posts 2

Papers 4

models

datasets