path: root/candle-transformers
Commit message | Author | Age | Files | Lines
...
* Fix cargo fmt. (#2383) | Laurent Mazare | 2024-08-01 | 1 | -0/+1
* Jina Bert Example fix and more configuration (#2191) | Joan Fontanals | 2024-08-01 | 1 | -0/+30
* Add Hiera vision model. (#2382) | Jani Monoses | 2024-08-01 | 2 | -0/+303
* bert attention mask (#1934) | Zheng Li | 2024-08-01 | 1 | -17/+32
* Add support for Llama 3.1 (#2359) | Eric Buehler | 2024-07-26 | 14 | -50/+125
* feat(candle-transformers/models/codegeex4-9b): add codegeex4-9 (#2334) | donjuanplatinum | 2024-07-21 | 2 | -0/+597
* add quantized qwen2 (#2329) | Zhuo Jinggang | 2024-07-12 | 2 | -0/+324
* Add Mobilenet v4 (#2325) | Jani Monoses | 2024-07-09 | 2 | -0/+801
* Add EVA-02 model ( https://arxiv.org/abs/2303.11331 ) (#2311) | v-espitalier | 2024-07-07 | 2 | -0/+419
* Beit: Add the gen_relative_position_index() function (#2306) | v-espitalier | 2024-07-04 | 1 | -26/+63
* Add Beit model ( https://arxiv.org/abs/2106.08254 ) (#2305) | v-espitalier | 2024-07-01 | 2 | -0/+368
* Add DINOv2Reg4 + PlantCLEF2024 (#2293) | v-espitalier | 2024-06-29 | 2 | -0/+282
* Depth Anything v2 (#2279) | Jeroen Vlek | 2024-06-24 | 3 | -0/+632
* Fix the fast bf16 gemm cublas kernels. (#2274) | Laurent Mazare | 2024-06-18 | 1 | -2/+1
* Support for the new Qwen2 models. (#2257) | Laurent Mazare | 2024-06-07 | 1 | -2/+6
* Add LLaVA support (#2234) | chenwanqq | 2024-06-03 | 7 | -0/+776
* Add Debug, Clone, Deserialize to moondream config (#2222) | Dave Lage | 2024-05-28 | 1 | -0/+1
* Enable the new layer-norm. (#2213) | Laurent Mazare | 2024-05-24 | 1 | -8/+4
* Avoid a contiguous call in the quantized phi 3 model. (#2209) | Laurent Mazare | 2024-05-23 | 1 | -1/+1
* Simplify the KvCache api. (#2207) | Laurent Mazare | 2024-05-23 | 1 | -7/+1
* Use flash-attn in gemma. (#2195) | Laurent Mazare | 2024-05-18 | 1 | -18/+44
* Support flash-attn in quantized phi3. (#2194) | Laurent Mazare | 2024-05-18 | 1 | -10/+40
* Add a slice_set op. (#2193) | Laurent Mazare | 2024-05-18 | 1 | -22/+19
* Support embedding model gte-Qwen1.5-7B-instruct (#2190) | Yin Guobing | 2024-05-16 | 1 | -15/+62
* Separate quantized phi-3 implementation. (#2157) | Laurent Mazare | 2024-05-04 | 3 | -4/+306
* Bump the version number to 0.5.1. (#2155) | Laurent Mazare | 2024-05-03 | 1 | -1/+1
* Add argsort. (#2132) | Laurent Mazare | 2024-04-27 | 2 | -43/+21
* Add Olmo models (#2127) | Isotr0py | 2024-04-26 | 2 | -0/+338
* Add the phi-3 model. (#2120) | Laurent Mazare | 2024-04-24 | 2 | -0/+330
* Use the faster rms-norm kernel for llama. (#2107) | Laurent Mazare | 2024-04-22 | 1 | -0/+5
* Updated quantized phi model (#2099) | Laurent Mazare | 2024-04-21 | 2 | -0/+289
* Derive clone and debug traits for Moondream model (#2100) | Santiago Medina | 2024-04-21 | 1 | -0/+1
* Small cleanups to the llama multi-process example. (#2098) | Laurent Mazare | 2024-04-20 | 1 | -1/+7
* Fix for gemma MQA. (#2091) | Laurent Mazare | 2024-04-19 | 1 | -2/+3
* Use faster rotary embeddings for llama like models. (#2087) | Laurent Mazare | 2024-04-18 | 1 | -11/+6
* Llama v3. (#2085) | Laurent Mazare | 2024-04-18 | 1 | -0/+10
* Make the falcon model cloneable. (#2067) | Laurent Mazare | 2024-04-15 | 1 | -5/+5
* Add a function to clear the KV cache in falcon. (#2066) | Laurent Mazare | 2024-04-15 | 1 | -0/+14
* Add a quantized version of recurrent-gemma. (#2054) | Laurent Mazare | 2024-04-13 | 4 | -61/+477
* Avoid crashes when running T5 models with F16 tensors on CPU (#2047) | Victor-Mihaila | 2024-04-13 | 1 | -1/+1
* Change for the encoder-only ProstT5 model (#2045) | Victor-Mihaila | 2024-04-13 | 1 | -1/+3
* Add the recurrent-gemma model. (#2039) | Laurent Mazare | 2024-04-13 | 2 | -0/+641
* Use cat for faster MQA computation. (#2043) | Laurent Mazare | 2024-04-12 | 16 | -195/+47
* Add the code-gemma models. (#2038) | Laurent Mazare | 2024-04-10 | 1 | -4/+15
* Support alternative dtypes for mamba (#2036) | Laurent Mazare | 2024-04-10 | 3 | -8/+15
* Add the new gemma models. (#2023) | Laurent Mazare | 2024-04-06 | 1 | -0/+1
* Fix the final rmsnorm for quantized-metavoice. (#2021) | Laurent Mazare | 2024-04-06 | 1 | -0/+1
* Faster mask implementation for mixformers. (#2017) | Laurent Mazare | 2024-04-05 | 1 | -21/+6
* Moondream tracing. (#2016) | Laurent Mazare | 2024-04-05 | 2 | -13/+48
* Add the rope THD kernel. (#2014) | Laurent Mazare | 2024-04-05 | 1 | -22/+6