Commit message | Author | Age | Files | Lines
* Add some preliminary ONNX support (#1260) | Laurent Mazare | 2023-11-04 | 10 | -1/+1033
|   * Add the onnx protos.
|   * Move the reading bits.
|   * Install protoc on the CI.
|   * Install protoc on the cuda CI too.
|   * Use clap for the onnx tool.
|   * Tweak the CI protoc install.
|   * Add some simple evaluation function.
|   * Add some binary operator support.
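A sketch of how the new ONNX path can be exercised, assuming the `read_file` and `simple_eval` helpers described in the PR are exposed by the `candle-onnx` crate; the names and the input-tensor key here are illustrative, not confirmed against this exact commit:

```rust
use std::collections::HashMap;
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    // Parse the protobuf graph, then evaluate it with the simple interpreter.
    let model = candle_onnx::read_file("model.onnx")?;
    // "input" is a placeholder; the key must match the graph's input name.
    let x = Tensor::randn(0f32, 1., (1, 3, 224, 224), &Device::Cpu)?;
    let mut inputs = HashMap::new();
    inputs.insert("input".to_string(), x);
    let outputs = candle_onnx::simple_eval(&model, inputs)?;
    for (name, value) in outputs.iter() {
        println!("{name}: {:?}", value.shape());
    }
    Ok(())
}
```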
* Update README.md (#1264) | Yuchao Zhang | 2023-11-04 | 1 | -1/+1
|
* Allow using gguf-v3 files. (#1262) | Laurent Mazare | 2023-11-03 | 1 | -5/+15
|
* add distil-whisper link (#1261) | Radamés Ajna | 2023-11-03 | 1 | -35/+49
|
* feat: impl backprop for erf and gelu-erf (#1258) | drbh | 2023-11-03 | 3 | -3/+59
|   * impl backprop for erf and gelu-erf
|   * feat: unary tests added for erf and gelu-erf
|   * fix: (clippy) remove immediately dereferenced ref
|   * fix: improve comments with pytorch code snippet
|   * fix: adjust comment typo in backprop impl
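Gradients now flow through `erf` and `gelu_erf`. A minimal sketch of exercising the backward pass through candle's `Var`/`backward` API:

```rust
use candle_core::{Device, Tensor, Var};

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    // Track gradients through the new unary op.
    let x = Var::new(&[-1f32, 0., 1., 2.], &dev)?;
    // gelu_erf(x) = 0.5 * x * (1 + erf(x / sqrt(2)))
    let y = x.gelu_erf()?;
    let grads = y.sum_all()?.backward()?;
    let dx = grads.get(&x).expect("no gradient for x");
    println!("{dx}");
    Ok(())
}
```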
* add Kalosm to the list of external resources (#1257) | ealmloff | 2023-11-03 | 1 | -0/+1
|
* Backprop support for conv1d (cpu only for now). (#1255) | Laurent Mazare | 2023-11-03 | 1 | -1/+38
|
* Test for the transposed conv1d. (#1254) | Laurent Mazare | 2023-11-03 | 2 | -1/+17
|
* Add vllm external resource (#1253) | Eric Buehler | 2023-11-03 | 1 | -0/+2
|
* Transposed conv1d in candle-nn. (#1252) | Laurent Mazare | 2023-11-03 | 1 | -0/+94
|
* Add the conv-transpose1d op. (#1251) | Laurent Mazare | 2023-11-03 | 8 | -0/+221
|   * Skeleton structure for conv-transpose1d.
|   * CPU implementation for conv-transpose1d.
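A quick sketch of the new op, assuming a PyTorch-style argument order (padding, output_padding, stride, dilation); the signature may have evolved since this commit:

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    // Input is (batch, c_in, length); the transposed kernel is (c_in, c_out, k).
    let x = Tensor::randn(0f32, 1., (1, 4, 10), &dev)?;
    let k = Tensor::randn(0f32, 1., (4, 2, 3), &dev)?;
    // padding = 0, output_padding = 0, stride = 2, dilation = 1.
    let y = x.conv_transpose1d(&k, 0, 0, 2, 1)?;
    // l_out = (l_in - 1) * stride - 2 * padding + dilation * (k - 1) + output_padding + 1
    assert_eq!(y.dims(), &[1, 2, 21]);
    Ok(())
}
```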
* Share the layer-norm implementation. (#1248) | Laurent Mazare | 2023-11-03 | 2 | -56/+32
|
* Add the swiglu activation from the chatglm PR. (#1246) | Laurent Mazare | 2023-11-02 | 2 | -0/+7
|
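swiglu chunks its input in two along the last dimension and returns silu(a) * b, halving that dimension. A sketch assuming the op is exposed as `candle_nn::ops::swiglu`:

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    let x = Tensor::randn(0f32, 1., (2, 8), &dev)?;
    // The halves a, b come from chunking the last dimension,
    // so the output's last dimension is half the input's.
    let y = candle_nn::ops::swiglu(&x)?;
    assert_eq!(y.dims(), &[2, 4]);
    Ok(())
}
```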
* Add support for distil whisper (#1245) | Laurent Mazare | 2023-11-02 | 1 | -3/+15
|   * Add support for distil-whisper.
|   * Add distil-large.
|   * Rename the large model.
* Add hard-sigmoid and hard-swish activations (#1244) | jamjamjon | 2023-11-02 | 2 | -0/+9
|   * Add hard-sigmoid and hard-swish activations
|   * Update ops.rs
|   * Use / rather than div.
|   ---------
|   Co-authored-by: Laurent <laurent.mazare@gmail.com>
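The usual definitions are hard_sigmoid(x) = clamp((x + 3) / 6, 0, 1) and hard_swish(x) = x * hard_sigmoid(x). A standalone sketch built from existing tensor ops, not necessarily the exact code added to ops.rs:

```rust
use candle_core::{Device, Result, Tensor};

fn hard_sigmoid(x: &Tensor) -> Result<Tensor> {
    // clamp((x + 3) / 6, 0, 1)
    ((x + 3.0)? / 6.0)?.clamp(0f32, 1f32)
}

fn hard_swish(x: &Tensor) -> Result<Tensor> {
    // x * hard_sigmoid(x)
    x * hard_sigmoid(x)?
}

fn main() -> Result<()> {
    let x = Tensor::new(&[-4f32, -1., 0., 1., 4.], &Device::Cpu)?;
    println!("{}", hard_swish(&x)?);
    Ok(())
}
```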
* llama2-c wasm fix. | Laurent | 2023-11-02 | 1 | -1/+3
|
* Lazy detach. (#1242) | Laurent Mazare | 2023-11-02 | 2 | -10/+20
|
* Remove the unused pragma for marian. (#1236) | Laurent Mazare | 2023-11-01 | 1 | -4/+32
|
* Consolidate the with-tracing usage. (#1234) | Laurent Mazare | 2023-11-01 | 4 | -102/+8
|
* Preliminary support for ssd1b. (#1233) | Laurent Mazare | 2023-11-01 | 2 | -0/+73
|
* Add a hack for generating random uniform/normal for f16/bf16. (#1228) | Laurent Mazare | 2023-10-31 | 1 | -4/+16
|
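The workaround samples in f32 and casts down, since the RNG paths don't cover the half-precision dtypes directly. Roughly equivalent user-level code (the in-kernel details may differ):

```rust
use candle_core::{DType, Device, Tensor};

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    // Sample in f32, then convert to the low-precision dtype.
    let x = Tensor::randn(0f32, 1., (2, 3), &dev)?.to_dtype(DType::BF16)?;
    assert_eq!(x.dtype(), DType::BF16);
    Ok(())
}
```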
* Add a KV cache to marian decoding. (#1226) | Laurent Mazare | 2023-10-31 | 3 | -24/+55
|
* Instructions for generating the tokenizer configs for marian-mt. (#1225) | Laurent Mazare | 2023-10-31 | 2 | -0/+1404
|
* Add support for the marian base model. (#1221) | Laurent Mazare | 2023-10-30 | 3 | -11/+72
|
* Use the hub files for the marian example. (#1220) | Laurent Mazare | 2023-10-30 | 4 | -27/+93
|   * Use the hub files for the marian example.
|   * Use the secondary decoder.
|   * Add a readme.
|   * More readme.
* PyO3: Add `equal` and `__richcmp__` to `candle.Tensor` (#1099) | Lukas Kreussel | 2023-10-30 | 7 | -3/+325
|   * add `equal` to tensor
|   * add `__richcmp__` support for tensors and scalars
|   * typo
|   * more typos
|   * Add `abs` + `candle.testing`
|   * remove duplicated `broadcast_shape_binary_op`
|   * `candle.i16` => `candle.i64`
|   * `tensor.nelements` -> `tensor.nelement`
|   * Cleanup `abs`
* Bugfixes for marian-mt. (#1219) | Laurent Mazare | 2023-10-30 | 2 | -13/+21
|   * Bugfixes for marian-mt.
|   * Apply the final decoding head.
|   * More fixes.
* Support negative steps in arange. (#1218) | Laurent Mazare | 2023-10-30 | 2 | -3/+33
|
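With this change a descending range works as expected. A small example using `Tensor::arange_step`:

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    // A negative step now counts down from start (end is exclusive).
    let t = Tensor::arange_step(5f32, 0f32, -1f32, &Device::Cpu)?;
    assert_eq!(t.to_vec1::<f32>()?, [5., 4., 3., 2., 1.]);
    Ok(())
}
```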
* PyO3: Better shape handling (#1143) | Lukas Kreussel | 2023-10-29 | 10 | -50/+181
|   * Negative and `*args` shape handling
|   * Rename to `PyShapeWithHole` + validate that only one hole exists
|   * Regenerate stubs
|   ---------
|   Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
* Add i64-abs. (#1216) | Laurent Mazare | 2023-10-29 | 2 | -1/+42
|
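`abs` previously applied only to the float dtypes; this extends it to i64 tensors:

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    let t = Tensor::new(&[-3i64, 0, 7, -1], &Device::Cpu)?;
    assert_eq!(t.abs()?.to_vec1::<i64>()?, [3, 0, 7, 1]);
    Ok(())
}
```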
* Marian MT model (#1210) | Laurent Mazare | 2023-10-29 | 5 | -10/+521
|   * Skeleton files for the marian MT model.
|   * Marian initialization.
|   * Implement the attention forward method.
|   * Forward pass for the encoder side.
|   * Expose the encoder and decoder.
|   * Start plugging the decoder.
|   * Forward pass for the decoder layer.
|   * Set up the marian example.
|   * Add some missing backtraces.
|   * Bugfix.
* PyO3: Add CI to build & upload wheels as artifacts. (#1215) | Lukas Kreussel | 2023-10-29 | 2 | -1/+1
|   * Add maturin ci
|   * fix paths
|   * Change sdist path
* Fix the conv2d gradient computation. (#1214) | Laurent Mazare | 2023-10-29 | 2 | -0/+72
|
* Allow for different behavior between training and eval (#1213) | Laurent Mazare | 2023-10-29 | 8 | -22/+83
|   * Forward with training.
|   * Do not use dropout on vgg evaluation.
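This adds a train/eval distinction at the module level via an explicit `train` flag (candle-nn's `ModuleT` trait). A sketch with dropout, the motivating case:

```rust
use candle_core::{DType, Device, Tensor};
use candle_nn::{Dropout, ModuleT};

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    let x = Tensor::ones((2, 4), DType::F32, &dev)?;
    let drop = Dropout::new(0.5);
    // Training: random elements are zeroed (and the rest rescaled).
    let y_train = drop.forward_t(&x, true)?;
    // Evaluation: dropout is a no-op.
    let y_eval = drop.forward_t(&x, false)?;
    println!("{y_train}\n{y_eval}");
    Ok(())
}
```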
* feat: implement VGG13, VGG16 and VGG19 (#1211) | drbh | 2023-10-29 | 4 | -0/+345
|   * feat: implement VGG13, VGG16 and VGG19
|   * Cosmetic fixes.
|   * More cosmetic tweaks + avoid re-loading the weights on each final layer.
|   ---------
|   Co-authored-by: Laurent <laurent.mazare@gmail.com>
* Add DDPG and fix Gym wrapper (#1207) | Travis Hammond | 2023-10-28 | 3 | -25/+549
|   * Fix Gym wrapper
|     - It was returning things in the wrong order
|     - Gym now differentiates between terminated and truncated
|   * Add DDPG
|   * Apply fixes
|   * Remove Result annotations
|   * Also remove Vec annotation
|   * rustfmt
|   * Various small improvements (avoid cloning, mutability, get clippy to pass, ...)
|   ---------
|   Co-authored-by: Travis Hammond <travis.hammond@alexanderthamm.com>
|   Co-authored-by: Laurent <laurent.mazare@gmail.com>
* Infer the config for llama2-c. (#1208) | Laurent Mazare | 2023-10-28 | 4 | -4/+63
|
* Move the llama2-c model in transformers. (#1205) | Laurent Mazare | 2023-10-28 | 6 | -9/+12
|
* Make more models cloneable. (#1203) | Laurent Mazare | 2023-10-28 | 3 | -26/+26
|
* No need for the even constraint on vecdot-q40-q80. (#1202) | Laurent Mazare | 2023-10-28 | 4 | -41/+2
|
* Add the relu2 and relu6 activations. (#1201) | Laurent Mazare | 2023-10-27 | 3 | -0/+61
|
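relu2 squares the relu output and relu6 clips it at 6. Expressed with existing tensor ops (a sketch of the semantics rather than the exact code added here):

```rust
use candle_core::{Device, Result, Tensor};

fn main() -> Result<()> {
    let x = Tensor::new(&[-2f32, 1., 3., 8.], &Device::Cpu)?;
    let relu2 = x.relu()?.sqr()?;        // max(x, 0)^2
    let relu6 = x.clamp(0f32, 6f32)?;    // min(max(x, 0), 6)
    assert_eq!(relu2.to_vec1::<f32>()?, [0., 1., 9., 64.]);
    assert_eq!(relu6.to_vec1::<f32>()?, [0., 1., 3., 6.]);
    Ok(())
}
```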
* Make the whisper model cloneable (#1200) | Laurent Mazare | 2023-10-27 | 2 | -1/+11
|   * Add a quantized variant of llama2.c
|   * Clippy fixes.
|   * Make the whisper model cloneable.
* Add fuse-conv-bn method for Conv2d (#1196) | jamjamjon | 2023-10-27 | 3 | -7/+27
|   * Add fuse-conv-bn method for Conv2d
|   * no unwrap
|   * run rustfmt and clippy
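Conv/batch-norm fusion folds the BN affine transform into the conv weights so inference runs a single conv: w' = w * gamma / sqrt(var + eps) and b' = (b - mean) * gamma / sqrt(var + eps) + beta. A hypothetical helper showing the math; the method actually added to Conv2d may differ in name and plumbing:

```rust
use candle_core::{Result, Tensor};

// Shapes: w (c_out, c_in, k, k); b, gamma, beta, mean, var all (c_out).
fn fuse_conv_bn(
    w: &Tensor, b: &Tensor,
    gamma: &Tensor, beta: &Tensor,
    mean: &Tensor, var: &Tensor, eps: f64,
) -> Result<(Tensor, Tensor)> {
    // Per-channel scale: gamma / sqrt(var + eps).
    let scale = gamma.div(&(var + eps)?.sqrt()?)?;
    let c_out = scale.elem_count();
    // Scale each output-channel slice of the conv weights.
    let w = w.broadcast_mul(&scale.reshape((c_out, 1, 1, 1))?)?;
    // Fold the BN shift into the bias.
    let b = b.sub(mean)?.mul(&scale)?.add(beta)?;
    Ok((w, b))
}
```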
* Add a quantized variant of llama2.c (#1197) | Laurent Mazare | 2023-10-27 | 5 | -38/+287
|   * Add a quantized variant of llama2.c
|   * Clippy fixes.
* Minor cleanup (#1194) | Laurent Mazare | 2023-10-27 | 1 | -17/+3
|   * Add some missing backtraces.
|   * Small cleanup.
* Add some missing backtraces. (#1193) | Laurent Mazare | 2023-10-27 | 1 | -6/+12
|
* Add support for the phi-hermes finetuned model. (#1192) | Laurent Mazare | 2023-10-27 | 2 | -3/+28
|
* Use the hub model file when possible. (#1190) | Laurent Mazare | 2023-10-26 | 3 | -7/+71
|   * Use the hub model file when possible.
|   * And add a mention in the main readme.
* Fixes for jina-bert. (#1189) | Laurent Mazare | 2023-10-26 | 1 | -2/+2
|
* Add the jina-bert embeddings model. (#1187) | Laurent Mazare | 2023-10-26 | 4 | -2/+550
|   * Add the jina-bert model.
|   * Use alibi.
|   * Remove the unused pragma.
|   * Recompute the alibi embeddings.
|   * Generate the token type ids.
|   * Use the module trait.
|   * Add the jina-bert example.
|   * DType fix.
|   * Get the inference to work.
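jina-bert uses ALiBi rather than learned position embeddings: head h of H gets slope 2^(-8h/H), and the attention scores are biased by slope times the key-query distance. A sketch of the slope computation (exact for power-of-two head counts):

```rust
// ALiBi head slopes: 2^(-8 * h / num_heads) for h = 1..=num_heads,
// matching the ALiBi paper when num_heads is a power of two.
fn alibi_slopes(num_heads: usize) -> Vec<f32> {
    (1..=num_heads)
        .map(|h| 2f32.powf(-8.0 * h as f32 / num_heads as f32))
        .collect()
}

fn main() {
    // For 8 heads: [2^-1, 2^-2, ..., 2^-8].
    println!("{:?}", alibi_slopes(8));
}
```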