path: root/candle-examples/examples/llama/main.rs
Commit message | Author | Date | Files | Lines
* Add the SmolLM2 models. (#2595) | Laurent Mazare | 2024-11-03 | 1 | -14/+43
* Fix the repo name for llama 3.1. (#2576) | Laurent Mazare | 2024-10-26 | 1 | -2/+2
* Add some llama-3.2 examples. (#2508) | Laurent Mazare | 2024-09-26 | 1 | -1/+13
* Add support for Llama 3.1 (#2359) | Eric Buehler | 2024-07-26 | 1 | -6/+24
* Support top-k in the llama example. (#2150) | Laurent Mazare | 2024-05-01 | 1 | -3/+21
* Better time measurement for the llama example. (#2106) | Laurent Mazare | 2024-04-22 | 1 | -2/+5
* Use llama v3 by default + add to readme. (#2094) | Laurent Mazare | 2024-04-20 | 1 | -1/+1
* Also enable llama-v3 8b instruct. (#2088) | Laurent Mazare | 2024-04-19 | 1 | -1/+3
* Llama v3. (#2085) | Laurent Mazare | 2024-04-18 | 1 | -9/+13
* Make the cache for the llama model explicit too. (#1745) | Laurent Mazare | 2024-02-22 | 1 | -3/+3
* Use the tokenizer-output-stream in the llama example. (#1715) | Laurent Mazare | 2024-02-15 | 1 | -11/+9
* fix index_pos bug when kv cache is disabled. (#1517) | optman | 2024-01-06 | 1 | -4/+4
* Add support for tiny-llama-1.1b. (#1512) | Laurent Mazare | 2023-12-31 | 1 | -2/+9
* Rework the llama example config, add the solar model. (#1485) | Laurent Mazare | 2023-12-26 | 1 | -72/+36
* Adapt more examples to the updated safetensor api. (#947) | Laurent Mazare | 2023-09-23 | 1 | -9/+1
* Implement top_p / nucleus sampling (#819) | Juarez Bochi | 2023-09-12 | 1 | -1/+5
* Move some models to candle-transformers so that it's easier to re-use. (#794) | Laurent Mazare | 2023-09-10 | 1 | -2/+1
* Add some optional repeat penalty. (#623) | Laurent Mazare | 2023-08-27 | 1 | -0/+18
* s/panic/bail/ | Nicolas Patry | 2023-08-25 | 1 | -2/+2
* Adding support for codellama in examples. | Nicolas Patry | 2023-08-25 | 1 | -5/+15
* Add some tracing to the quantized example. (#473) | Laurent Mazare | 2023-08-16 | 1 | -1/+0
* Using the real config from the hub when available. | Nicolas Patry | 2023-08-16 | 1 | -10/+18
* Tweak the llama example. (#450) | Laurent Mazare | 2023-08-15 | 1 | -63/+14
* Support local weights & dynamic outputs (#447) | Guoqing Bao | 2023-08-15 | 1 | -15/+39
* Add a cuda kernel for upsampling. (#441) | Laurent Mazare | 2023-08-14 | 1 | -2/+2
* Remove the checkpoint conversion script. (#405) | Laurent Mazare | 2023-08-11 | 1 | -3/+0
* Support the Accelerate BLAS on macOS. (#325) | Laurent Mazare | 2023-08-05 | 1 | -0/+3
* Add some tracing to llama. (#318) | Laurent Mazare | 2023-08-03 | 1 | -0/+14
* Support both llama v1 and llama v2. (#272) | Laurent Mazare | 2023-07-28 | 1 | -1/+5
* Upgrading hf-hub to `0.2.0` (Modified API to not pass the Repo around | Nicolas Patry | 2023-07-27 | 1 | -4/+4
* Switch to using llama-v2 by default. (#251) | Laurent Mazare | 2023-07-26 | 1 | -4/+4
* Better handling of dtypes in llama. (#243) | Laurent Mazare | 2023-07-26 | 1 | -1/+1
* Add flash attention (#241) | Laurent Mazare | 2023-07-26 | 1 | -1/+4
* Support for MQA for llama v2. (#205) | Laurent Mazare | 2023-07-20 | 1 | -29/+18
* Removing `candle-hub` internal to extract into `hf-hub` standalone. | Nicolas Patry | 2023-07-19 | 1 | -1/+1
* Add some 'cuda-if-available' helper function. (#172) | Laurent Mazare | 2023-07-15 | 1 | -14/+1
* Removing cuda default. | Nicolas Patry | 2023-07-14 | 1 | -1/+11
* Add a cli argument to easily switch the dtype. (#161) | Laurent Mazare | 2023-07-13 | 1 | -6/+7
* Sketch the candle-transformers crate. (#147) | Laurent Mazare | 2023-07-12 | 1 | -17/+3
* Use arange in the examples. (#146) | Laurent Mazare | 2023-07-12 | 1 | -4/+3
* Add from_iter and arange, use it in the doctests. (#145) | Laurent Mazare | 2023-07-12 | 1 | -1/+0
* Llama batch (#144) | Laurent Mazare | 2023-07-12 | 1 | -3/+2
* Allow for lazy loading of npz files, use it in llama to reduce memory usage i... | Laurent Mazare | 2023-07-11 | 1 | -5/+1
* Resurrect the llama npy support. (#140) | Laurent Mazare | 2023-07-11 | 1 | -2/+8
* Refactor the llama example to make it more in sync with the other ones. (#139) | Laurent Mazare | 2023-07-11 | 1 | -349/+19
* Add a KV cache to falcon. (#104) | Laurent Mazare | 2023-07-07 | 1 | -2/+1
* Creating new sync Api for `candle-hub`. | Nicolas Patry | 2023-07-06 | 1 | -5/+4
* MKL adjustments. (#87) | Laurent Mazare | 2023-07-06 | 1 | -0/+3
* Add mkl support for matrix multiply. (#86) | Laurent Mazare | 2023-07-06 | 1 | -1/+4
* Support dim indexes in cat. | laurent | 2023-07-05 | 1 | -11/+10