author    Laurent Mazare <laurent.mazare@gmail.com> 2023-08-20 14:33:21 +0100
committer GitHub <noreply@github.com> 2023-08-20 14:33:21 +0100
commit    372f8912c5e69ba9ffbe0856de09cf1667364ddb (patch)
tree      7653b463a2f96442ea3beb50ae42479c3e826307
parent    d2622a8160a429835ab0a3aa2145ab8c9de4cdd9 (diff)
Minor readme tweaks. (#526)
 README.md | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/README.md b/README.md
index ef1e55dd..7b98dca8 100644
--- a/README.md
+++ b/README.md
@@ -7,7 +7,7 @@
Candle is a minimalist ML framework for Rust with a focus on performance (including GPU support)
and ease of use. Try our online demos:
[whisper](https://huggingface.co/spaces/lmz/candle-whisper),
-[llama2](https://huggingface.co/spaces/lmz/candle-llama2).
+[LLaMA2](https://huggingface.co/spaces/lmz/candle-llama2).
```rust
let a = Tensor::randn(0f32, 1., (2, 3), &Device::Cpu)?;
@@ -22,7 +22,7 @@ println!("{c}");
Check out our [examples](./candle-examples/examples/):
- [Whisper](./candle-examples/examples/whisper/): speech recognition model.
-- [Llama and Llama-v2](./candle-examples/examples/llama/): general LLM.
+- [LLaMA and LLaMA-v2](./candle-examples/examples/llama/): general LLM.
- [Falcon](./candle-examples/examples/falcon/): general LLM.
- [Bert](./candle-examples/examples/bert/): useful for sentence embeddings.
- [StarCoder](./candle-examples/examples/bigcode/): LLM specialized to code
@@ -32,6 +32,9 @@ Check out our [examples](./candle-examples/examples/):
- [DINOv2](./candle-examples/examples/dinov2/): computer vision model trained
using self-supervision (can be used for imagenet classification, depth
evaluation, segmentation).
+- [Quantized LLaMA](./candle-examples/examples/quantized/): quantized version of
+ the LLaMA model using the same quantization techniques as
+ [llama.cpp](https://github.com/ggerganov/llama.cpp).
Run them using the following commands:
```
@@ -42,6 +45,7 @@ cargo run --example bert --release
cargo run --example bigcode --release
cargo run --example stable-diffusion --release -- --prompt "a rusty robot holding a fire torch"
cargo run --example dinov2 --release -- --image path/to/myinput.jpg
+cargo run --example quantized --release
```
In order to use **CUDA** add `--features cuda` to the example command line. If
@@ -53,7 +57,7 @@ There are also some wasm examples for whisper and
[whisper](https://huggingface.co/spaces/lmz/candle-whisper),
[llama2](https://huggingface.co/spaces/lmz/candle-llama2).
-For llama2, run the following command to retrieve the weight files and start a
+For LLaMA2, run the following command to retrieve the weight files and start a
test server:
```bash
cd candle-wasm-examples/llama2-c
@@ -76,7 +80,7 @@ And then head over to
- CUDA backend for efficiently running on GPUs, multiple GPU distribution via NCCL.
- WASM support, run your models in a browser.
- Included models.
- - LLMs: Llama v1 and v2, Falcon, StarCoder.
+ - LLMs: LLaMA v1 and v2, Falcon, StarCoder.
- Whisper (multi-lingual support).
- Stable Diffusion.
- Computer Vision: DINOv2.
@@ -180,14 +184,14 @@ or for accelerate:
extern crate accelerate_src;
```
-#### Cannot run llama example : access to source requires login credentials
+#### Cannot run the LLaMA examples: access to source requires login credentials
```
Error: request error: https://huggingface.co/meta-llama/Llama-2-7b-hf/resolve/main/tokenizer.json: status code 401
```
-This is likely because you're not permissioned for the llama-v2 model. To fix
-this, you have to register on the huggingface-hub, accept the [llama-v2 model
+This is likely because you're not permissioned for the LLaMA-v2 model. To fix
+this, you have to register on the huggingface-hub, accept the [LLaMA-v2 model
conditions](https://huggingface.co/meta-llama/Llama-2-7b-hf), and set up your
authentication token. See issue
[#350](https://github.com/huggingface/candle/issues/350) for more details.
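The Rust snippet in the first hunk is cut off at the hunk boundary; from the visible lines (a 2x3 random tensor `a` and a printed result `c`), the README example appears to compute a matrix product. As a rough plain-Rust illustration of that computation, without the candle dependency (the 3x4 shape of `b` is an assumption, not taken from the diff):

```rust
// Plain-Rust sketch of the matmul the README example performs on candle
// Tensors. `a` is 2x3 as in the diff; `b` being 3x4 is an assumption.
fn matmul(a: &[Vec<f32>], b: &[Vec<f32>]) -> Vec<Vec<f32>> {
    let (n, k, m) = (a.len(), b.len(), b[0].len());
    let mut c = vec![vec![0f32; m]; n];
    for i in 0..n {
        for j in 0..m {
            for l in 0..k {
                c[i][j] += a[i][l] * b[l][j];
            }
        }
    }
    c
}

fn main() {
    // Stand-ins for Tensor::randn(0f32, 1., (2, 3), &Device::Cpu)?; fixed
    // values are used here so the result is deterministic.
    let a = vec![vec![1f32, 2., 3.], vec![4., 5., 6.]];
    let b = vec![vec![1f32; 4]; 3];
    let c = matmul(&a, &b);
    println!("{c:?}"); // a 2x4 result, analogous to println!("{c}") above
}
```

In candle itself the same computation goes through `Tensor` operations on a `Device` and returns `Result`s (hence the `?` in the README snippet), rather than nested `Vec`s.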