path: root/candle-transformers
Commit message | Author | Age | Files | Lines
* Add the new gemma models. (#2023) | Laurent Mazare | 2024-04-06 | 1 | -0/+1
* Fix the final rmsnorm for quantized-metavoice. (#2021) | Laurent Mazare | 2024-04-06 | 1 | -0/+1
* Faster mask implementation for mixformers. (#2017) | Laurent Mazare | 2024-04-05 | 1 | -21/+6
* Moondream tracing. (#2016) | Laurent Mazare | 2024-04-05 | 2 | -13/+48
* Add the rope THD kernel. (#2014) | Laurent Mazare | 2024-04-05 | 1 | -22/+6
* Use F16 for moondream on cuda. (#2013) | Laurent Mazare | 2024-04-04 | 1 | -5/+8
* Include topk sampling in the quantized example. (#2005) | Laurent Mazare | 2024-04-04 | 1 | -1/+25
* Relax the contiguous check for cuda kernels. (#2000) | Laurent Mazare | 2024-04-03 | 1 | -1/+2
* Improve the handling of matmul with squeezed layouts. (#1998) | Laurent Mazare | 2024-04-02 | 1 | -1/+1
* Match Moondream's latest release (#1997) | Santiago Medina | 2024-04-02 | 1 | -1/+1
* first commit (#1994) | Jorge António | 2024-04-02 | 1 | -1/+2
* Stable diffusion fix. (#1993) | Laurent Mazare | 2024-04-02 | 1 | -1/+3
* Expose the t5 config fields + allow t5-large. (#1987) | Laurent Mazare | 2024-04-01 | 1 | -16/+16
* Quantized moondream implementation and BOS token (#1980) | Santiago Medina | 2024-04-01 | 5 | -16/+316
* Add options to use local files + specify a custom repo or branch. (#1973) | Laurent Mazare | 2024-03-31 | 1 | -13/+15
* Add Moondream transformer implementation and example (#1970) | Santiago Medina | 2024-03-31 | 3 | -0/+329
* Remove some unnecessary calls to contiguous. (#1968) | Laurent Mazare | 2024-03-30 | 1 | -4/+10
* Qwen MoE model. (#1960) | Laurent Mazare | 2024-03-28 | 2 | -0/+489
* Fix clippy lints + minor cleanups. (#1957) | Laurent Mazare | 2024-03-28 | 4 | -100/+41
* CLIP model implementation with example (#1950) | Tigran Zhampeissov | 2024-03-28 | 4 | -0/+694
* add send and sync trait bounds for scheduler config in stable diffusion model... | Jorge António | 2024-03-28 | 1 | -1/+1
* add config for mamba 2.8b model parameter (#1946) | Jorge António | 2024-03-27 | 1 | -4/+4
* Another fix for squeezing. (#1943) | Laurent Mazare | 2024-03-26 | 1 | -2/+2
* Faster repeat penalty (#1940) | Laurent Mazare | 2024-03-26 | 1 | -3/+7
* Use the new rope kernel in mistral. (#1937) | Laurent Mazare | 2024-03-25 | 2 | -28/+12
* Avoid the attention mask where possible. (#1933) | Laurent Mazare | 2024-03-25 | 3 | -16/+32
* Fast kernels for rotary embeddings. (#1928) | Laurent Mazare | 2024-03-24 | 1 | -26/+5
* Also avoid the mask in the llama example. | laurent | 2024-03-24 | 1 | -2/+6
* Avoid using the attn mask when not necessary. | laurent | 2024-03-24 | 1 | -5/+19
* Support more mistral models. (#1927) | Laurent Mazare | 2024-03-24 | 2 | -24/+31
* Allow for arbitrary temperature modifications. | laurent | 2024-03-23 | 1 | -1/+7
* Add topk sampling. (#1923) | Laurent Mazare | 2024-03-23 | 2 | -24/+88
* Avoid broadcasting on the batch dimension for the attention mask. (#1920) | Laurent Mazare | 2024-03-23 | 2 | -8/+6
* Fix loading the gguf files. (#1913) | Laurent Mazare | 2024-03-22 | 1 | -1/+1
* Fix for the llama model. (#1906) | Laurent Mazare | 2024-03-21 | 1 | -1/+1
* Use the fast RmsNorm in the quantized model. (#1904) | Laurent Mazare | 2024-03-21 | 3 | -35/+21
* Avoid copying the data on squeeze and unsqueeze. (#1884) | Laurent Mazare | 2024-03-20 | 2 | -2/+2
* Use a common with_tracing::RmsNorm in a few models. (#1871) | Jani Monoses | 2024-03-18 | 6 | -111/+29
* Expose some helper functions to create quantized models. (#1837) | Laurent Mazare | 2024-03-12 | 3 | -0/+15
* Add some tracing to metavoice. (#1826) | Laurent Mazare | 2024-03-09 | 2 | -8/+82
* Quantized version of the metavoice model. (#1824) | Laurent Mazare | 2024-03-09 | 4 | -4/+241
* Add a flag to select the dtype used in metavoice. (#1805) | Laurent Mazare | 2024-03-05 | 2 | -5/+13
* Speaker embeddings computation for metavoice. (#1800) | Laurent Mazare | 2024-03-04 | 2 | -23/+109
* Add an initial Segformer implementation (#1617) | Jiayu Liu | 2024-03-03 | 2 | -0/+706
* More metavoice tweaks. (#1796) | Laurent Mazare | 2024-03-03 | 1 | -1/+1
* Metavoice - first cut (#1717) | Laurent Mazare | 2024-03-02 | 3 | -0/+880
* Rustfmt fix. (#1788) | Laurent Mazare | 2024-03-02 | 2 | -3/+10
* Update StableLM config (#1787) | Frkri | 2024-03-02 | 2 | -12/+12
* EfficientVit (MSRA) model (#1783) | Jani Monoses | 2024-03-01 | 2 | -0/+461
* add models of rwkv v6 and quantized rwkv v6 (#1781) | Jack Shih | 2024-03-01 | 3 | -0/+629