path: root/candle-transformers
| Commit message | Author | Date | Files | Lines |
|---|---|---|---|---|
| Share the layer-norm implementation. (#1248) | Laurent Mazare | 2023-11-03 | 2 | -56/+32 |
| Remove the unused pragma for marian. (#1236) | Laurent Mazare | 2023-11-01 | 1 | -4/+32 |
| Consolidate the with-tracing usage. (#1234) | Laurent Mazare | 2023-11-01 | 4 | -102/+8 |
| Preliminary support for ssd1b. (#1233) | Laurent Mazare | 2023-11-01 | 2 | -0/+73 |
| Add a KV cache to marian decoding. (#1226) | Laurent Mazare | 2023-10-31 | 1 | -14/+40 |
| Add support for the marian base model. (#1221) | Laurent Mazare | 2023-10-30 | 1 | -0/+25 |
| Use the hub files for the marian example. (#1220) | Laurent Mazare | 2023-10-30 | 1 | -10/+29 |
| Bugfixes for marian-mt. (#1219) | Laurent Mazare | 2023-10-30 | 1 | -9/+18 |
| Marian MT model (#1210) | Laurent Mazare | 2023-10-29 | 3 | -0/+421 |
| Allow for different behavior between training and eval (#1213) | Laurent Mazare | 2023-10-29 | 1 | -16/+19 |
| feat: implement VGG13, VGG16 and VGG19 (#1211) | drbh | 2023-10-29 | 2 | -0/+255 |
| Infer the config for llama2-c. (#1208) | Laurent Mazare | 2023-10-28 | 2 | -1/+50 |
| Move the llama2-c model in transformers. (#1205) | Laurent Mazare | 2023-10-28 | 5 | -0/+712 |
| Make more models cloneable. (#1203) | Laurent Mazare | 2023-10-28 | 3 | -26/+26 |
| Add the relu2 and relu6 activations. (#1201) | Laurent Mazare | 2023-10-27 | 2 | -0/+57 |
| Make the whisper model cloneable (#1200) | Laurent Mazare | 2023-10-27 | 2 | -1/+11 |
| Add support for the phi-hermes finetuned model. (#1192) | Laurent Mazare | 2023-10-27 | 1 | -0/+17 |
| Fixes for jina-bert. (#1189) | Laurent Mazare | 2023-10-26 | 1 | -2/+2 |
| Add the jina-bert embeddings model. (#1187) | Laurent Mazare | 2023-10-26 | 3 | -2/+388 |
| [Wasm] BLIP Example (#1183) | Radamés Ajna | 2023-10-26 | 3 | -3/+12 |
| [Wasm] Add puffin phi model to wasm (#1166) | Radamés Ajna | 2023-10-25 | 1 | -1/+2 |
| Add a quantized blip model. (#1155) | Laurent Mazare | 2023-10-22 | 4 | -0/+742 |
| Add some KV cache to blip. (#1150) | Laurent Mazare | 2023-10-22 | 2 | -17/+60 |
| Remove the unused pragma and properly apply the bias. (#1147) | Laurent Mazare | 2023-10-22 | 3 | -22/+15 |
| Blip attention mask + readme (#1146) | Laurent Mazare | 2023-10-21 | 1 | -13/+49 |
| Blip fixes (#1145) | Laurent Mazare | 2023-10-21 | 1 | -27/+0 |
| Add the blip example. (#1144) | Laurent Mazare | 2023-10-21 | 2 | -45/+169 |
| Blip forward pass (#1141) | Laurent Mazare | 2023-10-21 | 1 | -5/+42 |
| Add the blip image captioning model (#1140) | Laurent Mazare | 2023-10-20 | 4 | -2/+595 |
| Add some vision transformers models (#1132) | Laurent Mazare | 2023-10-19 | 2 | -0/+383 |
| Support ResNet 50/101/152. (#1130) | Laurent Mazare | 2023-10-19 | 1 | -0/+118 |
| Experiment with resnet (#1128) | Laurent Mazare | 2023-10-19 | 2 | -0/+132 |
| More model cloning. (#1126) | Laurent Mazare | 2023-10-18 | 4 | -19/+19 |
| Make some model cloneable. (#1125) | Laurent Mazare | 2023-10-18 | 5 | -20/+25 |
| Add the quantized mpt model. (#1123) | Laurent Mazare | 2023-10-18 | 4 | -4/+211 |
| Remove the unused pragma in mpt. (#1122) | Laurent Mazare | 2023-10-18 | 1 | -5/+4 |
| MPT alibi fixes. (#1120) | Laurent Mazare | 2023-10-18 | 1 | -12/+18 |
| MPT fixes. (#1117) | Laurent Mazare | 2023-10-17 | 1 | -12/+21 |
| Build alibi bias. (#1115) | Laurent Mazare | 2023-10-17 | 1 | -6/+94 |
| Add the MPT model. (#1114) | Laurent Mazare | 2023-10-17 | 2 | -0/+203 |
| Add support for Puffin-Phi-v2. (#1110) | Laurent Mazare | 2023-10-16 | 1 | -0/+17 |
| Convmixer (#1073) | Laurent Mazare | 2023-10-11 | 2 | -0/+83 |
| Tracing for StableLM and quantized StableLM. (#1068) | Laurent Mazare | 2023-10-10 | 2 | -0/+24 |
| Move the common quantized-nn code to a shared module. (#1063) | Laurent Mazare | 2023-10-09 | 7 | -166/+100 |
| Quantized version of StableLM. (#1058) | Laurent Mazare | 2023-10-08 | 3 | -7/+306 |
| Use softmax-last-dim where possible. (#1057) | Laurent Mazare | 2023-10-08 | 5 | -5/+5 |
| Do not use the kv-cache on external key-value states. (#1054) | Laurent Mazare | 2023-10-07 | 2 | -14/+14 |
| Add flash-attn support for stable-lm. (#1052) | Laurent Mazare | 2023-10-07 | 1 | -2/+29 |
| Better control on the optional dequantization in QMatMul (#1049) | Laurent Mazare | 2023-10-07 | 1 | -6/+5 |
| Add the stable-lm example. (#1046) | Laurent Mazare | 2023-10-06 | 1 | -4/+13 |