path: root/candle-transformers
Commit message | Author | Date | Files | Lines
* Avoid the attention mask where possible. (#1933) | Laurent Mazare | 2024-03-25 | 3 | -16/+32
* Fast kernels for rotary embeddings. (#1928) | Laurent Mazare | 2024-03-24 | 1 | -26/+5
* Also avoid the mask in the llama example. | laurent | 2024-03-24 | 1 | -2/+6
* Avoid using the attn mask when not necessary. | laurent | 2024-03-24 | 1 | -5/+19
* Support more mistral models. (#1927) | Laurent Mazare | 2024-03-24 | 2 | -24/+31
* Allow for arbitrary temperature modifications. | laurent | 2024-03-23 | 1 | -1/+7
* Add topk sampling. (#1923) | Laurent Mazare | 2024-03-23 | 2 | -24/+88
* Avoid broadcasting on the batch dimension for the attention mask. (#1920) | Laurent Mazare | 2024-03-23 | 2 | -8/+6
* Fix loading the gguf files. (#1913) | Laurent Mazare | 2024-03-22 | 1 | -1/+1
* Fix for the llama model. (#1906) | Laurent Mazare | 2024-03-21 | 1 | -1/+1
* Use the fast RmsNorm in the quantized model. (#1904) | Laurent Mazare | 2024-03-21 | 3 | -35/+21
* Avoid copying the data on squeeze and unsqueeze. (#1884) | Laurent Mazare | 2024-03-20 | 2 | -2/+2
* Use a common with_tracing::RmsNorm in a few models. (#1871) | Jani Monoses | 2024-03-18 | 6 | -111/+29
* Expose some helper functions to create quantized models. (#1837) | Laurent Mazare | 2024-03-12 | 3 | -0/+15
* Add some tracing to metavoice. (#1826) | Laurent Mazare | 2024-03-09 | 2 | -8/+82
* Quantized version of the metavoice model. (#1824) | Laurent Mazare | 2024-03-09 | 4 | -4/+241
* Add a flag to select the dtype used in metavoice. (#1805) | Laurent Mazare | 2024-03-05 | 2 | -5/+13
* Speaker embeddings computation for metavoice. (#1800) | Laurent Mazare | 2024-03-04 | 2 | -23/+109
* Add an initial Segformer implementation (#1617) | Jiayu Liu | 2024-03-03 | 2 | -0/+706
* More metavoice tweaks. (#1796) | Laurent Mazare | 2024-03-03 | 1 | -1/+1
* Metavoice - first cut (#1717) | Laurent Mazare | 2024-03-02 | 3 | -0/+880
* Rustfmt fix. (#1788) | Laurent Mazare | 2024-03-02 | 2 | -3/+10
* Update StableLM config (#1787) | Frkri | 2024-03-02 | 2 | -12/+12
* EfficientVit (MSRA) model (#1783) | Jani Monoses | 2024-03-01 | 2 | -0/+461
* add models of rwkv v6 and quantized rwkv v6 (#1781) | Jack Shih | 2024-03-01 | 3 | -0/+629
* Add the StarCoder2 model. (#1779) | Laurent Mazare | 2024-02-28 | 2 | -0/+348
* Encodec encoding demo. (#1775) | Laurent Mazare | 2024-02-28 | 1 | -1/+2
* Apply dilations in the encodec model. (#1772) | Laurent Mazare | 2024-02-27 | 1 | -19/+69
* Encodec model. (#1771) | Laurent Mazare | 2024-02-27 | 2 | -0/+719
* Avoid tensor copying in the quantized example. (#1770) | Laurent Mazare | 2024-02-27 | 1 | -4/+8
* add quantized rwkv v5 model (#1743) | Jack Shih | 2024-02-25 | 3 | -2/+288
* Tweak the VarMap set type. (#1758) | Laurent Mazare | 2024-02-25 | 2 | -9/+9
* Make the cache for the llama model explicit too. (#1745) | Laurent Mazare | 2024-02-22 | 1 | -32/+38
* Explicit caching in llama2.c. | laurent | 2024-02-22 | 2 | -55/+78
* Support for attention bias in gemma + refactor things a bit. (#1744) | Laurent Mazare | 2024-02-22 | 5 | -39/+18
* Add the Gemma models. (#1741) | Laurent Mazare | 2024-02-21 | 2 | -0/+381
* Make the r, k, v tensors contiguous. (#1719) | Laurent Mazare | 2024-02-16 | 1 | -3/+3
* Custom tokenizer for rwkv. (#1711) | Laurent Mazare | 2024-02-14 | 1 | -0/+92
* Add the RWKV model (v5). (#1707) | Laurent Mazare | 2024-02-14 | 3 | -2/+319
* Add ConvNeXt-V2 and smaller model variants. (#1709) | Jani Monoses | 2024-02-14 | 1 | -36/+174
* Fixing quantized llama demo on metal. (#1703) | Nicolas Patry | 2024-02-13 | 1 | -13/+15
* feat: support microphone whisper streaming (#1678) | drbh | 2024-02-12 | 2 | -0/+53
* Improved mamba model optimized for inference (#1694) | Laurent Mazare | 2024-02-11 | 2 | -0/+212
* Support sinusoidal embeddings in trocr. (#1690) | Laurent Mazare | 2024-02-10 | 1 | -12/+56
* Use the repo config for trocr rather than hardcoding it + small tweaks. (#1689) | Laurent Mazare | 2024-02-10 | 2 | -13/+16
* Remove the unused pragma in vit + handle the final layernorm. (#1688) | Laurent Mazare | 2024-02-10 | 1 | -7/+9
* Add the Qwen2 model (#1684) | Laurent Mazare | 2024-02-09 | 2 | -0/+378
* Add the ChatGLM model. (#1237) | Laurent Mazare | 2024-02-09 | 2 | -0/+594
* feat: support multithread spectrogram and small perf tweaks (#1674) | drbh | 2024-02-08 | 3 | -28/+150
* Quantized support for stable-lm2. (#1654) | Laurent Mazare | 2024-02-04 | 1 | -4/+9