From 45d5322d62d59a2c9be5fe8c642d0fa56fbb73b1 Mon Sep 17 00:00:00 2001
From: Laurent Mazare <laurent.mazare@gmail.com>
Date: Wed, 21 Feb 2024 22:02:50 +0100
Subject: Add the Gemma models. (#1741)

* Add the Gemma models.

* Add the gemma example.

* Adapt the RmsNorm.

* Get the 2b model to work.

* 7b support.

* Use the config head dim.

* Yet another fix.

* Make the matrixes contiguous.

* Also get the 7b model to work.

* And add to the readme.
---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index 5c65ef68..0119684e 100644
--- a/README.md
+++ b/README.md
@@ -63,6 +63,8 @@ We also provide a some command line based examples using state of the art models
 - [LLaMA and LLaMA-v2](./candle-examples/examples/llama/): general LLM, includes
   the SOLAR-10.7B variant.
 - [Falcon](./candle-examples/examples/falcon/): general LLM.
+- [Gemma](./candle-examples/examples/gemma/): 2b and 7b general LLMs from Google
+  Deepmind.
 - [Phi-1, Phi-1.5, and Phi-2](./candle-examples/examples/phi/): 1.3b and 2.7b general LLMs with performance on par with LLaMA-v2 7b.
 - [StableLM-3B-4E1T](./candle-examples/examples/stable-lm/): a 3b general LLM
   pre-trained on 1T tokens of English and code datasets. Also supports
@@ -190,6 +192,7 @@ If you have an addition to this list, please submit a pull request.
     - StarCoder.
     - Phi 1, 1.5, and 2.
     - Mamba, Minimal Mamba
+    - Gemma 2b and 7b.
     - Mistral 7b v0.1.
     - Mixtral 8x7b v0.1.
     - StableLM-3B-4E1T, StableLM-2-1.6B, Stable-Code-3B.
-- 
cgit v1.2.3
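
The new example added by this commit should be runnable like the other candle text-generation examples; the invocation below is only a sketch, and the `--which` and `--prompt` flags are assumed by analogy with those examples rather than taken from this patch.

```bash
# Hypothetical invocation of the new Gemma example; the --which and --prompt
# flags are assumed to mirror the other candle text-generation examples.
cargo run --example gemma --release -- --which 2b --prompt "fn fibonacci(n: u64) -> u64 {"
```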