Commit message | Author | Age | Files | Lines
* Add some preliminary ONNX support (#1260) | Laurent Mazare | 2023-11-04 | 10 | -1/+1033
|   * Add the onnx protos.
|   * Move the reading bits.
|   * Install protoc on the CI.
|   * Install protoc on the cuda CI too.
|   * Use clap for the onnx tool.
|   * Tweak the CI protoc install.
|   * Add some simple evaluation function.
|   * Add some binary operator support.
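A sketch of how the new ONNX path can be exercised, assuming the `read_file` and `simple_eval` helpers described in the PR are exposed by the `candle-onnx` crate; the names and the input-tensor key here are illustrative, not confirmed against this exact commit:

```rust
use std::collections::HashMap;
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    // Parse the protobuf graph, then evaluate it with the simple interpreter.
    let model = candle_onnx::read_file("model.onnx")?;
    // "input" is a placeholder; the key must match the graph's input name.
    let x = Tensor::randn(0f32, 1., (1, 3, 224, 224), &Device::Cpu)?;
    let mut inputs = HashMap::new();
    inputs.insert("input".to_string(), x);
    let outputs = candle_onnx::simple_eval(&model, inputs)?;
    for (name, value) in outputs.iter() {
        println!("{name}: {:?}", value.shape());
    }
    Ok(())
}
```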
* Update README.md (#1264) | Yuchao Zhang | 2023-11-04 | 1 | -1/+1
|
* Allow using gguf-v3 files. (#1262) | Laurent Mazare | 2023-11-03 | 1 | -5/+15
|
* add distil-whisper link (#1261) | Radamés Ajna | 2023-11-03 | 1 | -35/+49
|
* feat: impl backprop for erf and gelu-erf (#1258) | drbh | 2023-11-03 | 3 | -3/+59
|   * impl backprop for erf and gelu-erf
|   * feat: unary tests added for erf and gelu-erf
|   * fix: (clippy) remove immediately dereferenced ref
|   * fix: improve comments with pytorch code snippet
|   * fix: adjust comment typo in backprop impl
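Gradients now flow through `erf` and `gelu_erf`. A minimal sketch of exercising the backward pass through candle's `Var`/`backward` API:

```rust
use candle_core::{Device, Tensor, Var};

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    // Track gradients through the new unary op.
    let x = Var::new(&[-1f32, 0., 1., 2.], &dev)?;
    // gelu_erf(x) = 0.5 * x * (1 + erf(x / sqrt(2)))
    let y = x.gelu_erf()?;
    let grads = y.sum_all()?.backward()?;
    let dx = grads.get(&x).expect("no gradient for x");
    println!("{dx}");
    Ok(())
}
```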
* add Kalosm to the list of external resources (#1257) | ealmloff | 2023-11-03 | 1 | -0/+1
|
* Backprop support for conv1d (cpu only for now). (#1255) | Laurent Mazare | 2023-11-03 | 1 | -1/+38
|
* Test for the transposed conv1d. (#1254) | Laurent Mazare | 2023-11-03 | 2 | -1/+17
|
* Add vllm external resource (#1253) | Eric Buehler | 2023-11-03 | 1 | -0/+2
|
* Transposed conv1d in candle-nn. (#1252) | Laurent Mazare | 2023-11-03 | 1 | -0/+94
|
* Add the conv-transpose1d op. (#1251) | Laurent Mazare | 2023-11-03 | 8 | -0/+221
|   * Skeleton structure for conv-transpose1d.
|   * CPU implementation for conv-transpose1d.
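A quick sketch of the new op, assuming a PyTorch-style argument order (padding, output_padding, stride, dilation); the signature may have evolved since this commit:

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    // Input is (batch, c_in, length); the transposed kernel is (c_in, c_out, k).
    let x = Tensor::randn(0f32, 1., (1, 4, 10), &dev)?;
    let k = Tensor::randn(0f32, 1., (4, 2, 3), &dev)?;
    // padding = 0, output_padding = 0, stride = 2, dilation = 1.
    let y = x.conv_transpose1d(&k, 0, 0, 2, 1)?;
    // l_out = (l_in - 1) * stride - 2 * padding + dilation * (k - 1) + output_padding + 1
    assert_eq!(y.dims(), &[1, 2, 21]);
    Ok(())
}
```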
* Share the layer-norm implementation. (#1248) | Laurent Mazare | 2023-11-03 | 2 | -56/+32
|
* Add the swiglu activation from the chatglm PR. (#1246) | Laurent Mazare | 2023-11-02 | 2 | -0/+7
|
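swiglu chunks its input in two along the last dimension and returns silu(a) * b, halving that dimension. A sketch assuming the op is exposed as `candle_nn::ops::swiglu`:

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    let x = Tensor::randn(0f32, 1., (2, 8), &dev)?;
    // The halves a, b come from chunking the last dimension,
    // so the output's last dimension is half the input's.
    let y = candle_nn::ops::swiglu(&x)?;
    assert_eq!(y.dims(), &[2, 4]);
    Ok(())
}
```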
* Add support for distil whisper (#1245) | Laurent Mazare | 2023-11-02 | 1 | -3/+15
|   * Add support for distil-whisper.
|   * Add distil-large.
|   * Rename the large model.
* Add hard-sigmoid and hard-swish activations (#1244) | jamjamjon | 2023-11-02 | 2 | -0/+9
|   * Add hard-sigmoid and hard-swish activations
|   * Update ops.rs
|   * Use / rather than div.
|   ---------
|   Co-authored-by: Laurent <laurent.mazare@gmail.com>
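The usual definitions are hard_sigmoid(x) = clamp((x + 3) / 6, 0, 1) and hard_swish(x) = x * hard_sigmoid(x). A standalone sketch built from existing tensor ops, not necessarily the exact code added to ops.rs:

```rust
use candle_core::{Device, Result, Tensor};

fn hard_sigmoid(x: &Tensor) -> Result<Tensor> {
    // clamp((x + 3) / 6, 0, 1)
    ((x + 3.0)? / 6.0)?.clamp(0f32, 1f32)
}

fn hard_swish(x: &Tensor) -> Result<Tensor> {
    // x * hard_sigmoid(x)
    x * hard_sigmoid(x)?
}

fn main() -> Result<()> {
    let x = Tensor::new(&[-4f32, -1., 0., 1., 4.], &Device::Cpu)?;
    println!("{}", hard_swish(&x)?);
    Ok(())
}
```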
* llama2-c wasm fix. | Laurent | 2023-11-02 | 1 | -1/+3
|
* Lazy detach. (#1242) | Laurent Mazare | 2023-11-02 | 2 | -10/+20
|
* Remove the unused pragma for marian. (#1236) | Laurent Mazare | 2023-11-01 | 1 | -4/+32
|
* Consolidate the with-tracing usage. (#1234) | Laurent Mazare | 2023-11-01 | 4 | -102/+8
|
* Preliminary support for ssd1b. (#1233) | Laurent Mazare | 2023-11-01 | 2 | -0/+73
|
* Add a hack for generating random uniform/normal for f16/bf16. (#1228) | Laurent Mazare | 2023-10-31 | 1 | -4/+16
|
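The workaround samples in f32 and casts down, since the RNG paths don't cover the half-precision dtypes directly. Roughly equivalent user-level code (the in-kernel details may differ):

```rust
use candle_core::{DType, Device, Tensor};

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    // Sample in f32, then convert to the low-precision dtype.
    let x = Tensor::randn(0f32, 1., (2, 3), &dev)?.to_dtype(DType::BF16)?;
    assert_eq!(x.dtype(), DType::BF16);
    Ok(())
}
```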
* Add a KV cache to marian decoding. (#1226) | Laurent Mazare | 2023-10-31 | 3 | -24/+55
|
* Instructions for generating the tokenizer configs for marian-mt. (#1225) | Laurent Mazare | 2023-10-31 | 2 | -0/+1404
|
* Add support for the marian base model. (#1221) | Laurent Mazare | 2023-10-30 | 3 | -11/+72
|
* Use the hub files for the marian example. (#1220) | Laurent Mazare | 2023-10-30 | 4 | -27/+93
|   * Use the hub files for the marian example.
|   * Use the secondary decoder.
|   * Add a readme.
|   * More readme.
* PyO3: Add `equal` and `__richcmp__` to `candle.Tensor` (#1099) | Lukas Kreussel | 2023-10-30 | 7 | -3/+325
|   * add `equal` to tensor
|   * add `__richcmp__` support for tensors and scalars
|   * typo
|   * more typos
|   * Add `abs` + `candle.testing`
|   * remove duplicated `broadcast_shape_binary_op`
|   * `candle.i16` => `candle.i64`
|   * `tensor.nelements` -> `tensor.nelement`
|   * Cleanup `abs`
* Bugfixes for marian-mt. (#1219) | Laurent Mazare | 2023-10-30 | 2 | -13/+21
|   * Bugfixes for marian-mt.
|   * Apply the final decoding head.
|   * More fixes.
* Support negative steps in arange. (#1218) | Laurent Mazare | 2023-10-30 | 2 | -3/+33
|
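With this change a descending range works as expected. A small example using `Tensor::arange_step`:

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    // A negative step now counts down from start (end is exclusive).
    let t = Tensor::arange_step(5f32, 0f32, -1f32, &Device::Cpu)?;
    assert_eq!(t.to_vec1::<f32>()?, [5., 4., 3., 2., 1.]);
    Ok(())
}
```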
* PyO3: Better shape handling (#1143) | Lukas Kreussel | 2023-10-29 | 10 | -50/+181
|   * Negative and `*args` shape handling
|   * Rename to `PyShapeWithHole` + validate that only one hole exists
|   * Regenerate stubs
|   ---------
|   Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
* Add i64-abs. (#1216) | Laurent Mazare | 2023-10-29 | 2 | -1/+42
|
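`abs` previously applied only to the float dtypes; this extends it to i64 tensors:

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    let t = Tensor::new(&[-3i64, 0, 7, -1], &Device::Cpu)?;
    assert_eq!(t.abs()?.to_vec1::<i64>()?, [3, 0, 7, 1]);
    Ok(())
}
```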
* Marian MT model (#1210) | Laurent Mazare | 2023-10-29 | 5 | -10/+521
|   * Skeleton files for the marian MT model.
|   * Marian initialization.
|   * Implement the attention forward method.
|   * Forward pass for the encoder side.
|   * Expose the encoder and decoder.
|   * Start plugging the decoder.
|   * Forward pass for the decoder layer.
|   * Set up the marian example.
|   * Add some missing backtraces.
|   * Bugfix.
* PyO3: Add CI to build & upload wheels as artifacts. (#1215) | Lukas Kreussel | 2023-10-29 | 2 | -1/+1
|   * Add maturin ci
|   * fix paths
|   * Change sdist path
* Fix the conv2d gradient computation. (#1214) | Laurent Mazare | 2023-10-29 | 2 | -0/+72
|
* Allow for different behavior between training and eval (#1213) | Laurent Mazare | 2023-10-29 | 8 | -22/+83
|   * Forward with training.
|   * Do not use dropout on vgg evaluation.
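This adds a train/eval distinction at the module level via an explicit `train` flag (candle-nn's `ModuleT` trait). A sketch with dropout, the motivating case:

```rust
use candle_core::{DType, Device, Tensor};
use candle_nn::{Dropout, ModuleT};

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    let x = Tensor::ones((2, 4), DType::F32, &dev)?;
    let drop = Dropout::new(0.5);
    // Training: random elements are zeroed (and the rest rescaled).
    let y_train = drop.forward_t(&x, true)?;
    // Evaluation: dropout is a no-op.
    let y_eval = drop.forward_t(&x, false)?;
    println!("{y_train}\n{y_eval}");
    Ok(())
}
```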
* feat: implement VGG13, VGG16 and VGG19 (#1211) | drbh | 2023-10-29 | 4 | -0/+345
|   * feat: implement VGG13, VGG16 and VGG19
|   * Cosmetic fixes.
|   * More cosmetic tweaks + avoid re-loading the weights on each final layer.
|   ---------
|   Co-authored-by: Laurent <laurent.mazare@gmail.com>
* Add DDPG and fix Gym wrapper (#1207) | Travis Hammond | 2023-10-28 | 3 | -25/+549
|   * Fix Gym wrapper
|     - It was returning things in the wrong order
|     - Gym now differentiates between terminated and truncated
|   * Add DDPG
|   * Apply fixes
|   * Remove Result annotations
|   * Also remove Vec annotation
|   * rustfmt
|   * Various small improvements (avoid cloning, mutability, get clippy to pass, ...)
|   ---------
|   Co-authored-by: Travis Hammond <travis.hammond@alexanderthamm.com>
|   Co-authored-by: Laurent <laurent.mazare@gmail.com>
* Infer the config for llama2-c. (#1208) | Laurent Mazare | 2023-10-28 | 4 | -4/+63
|
* Move the llama2-c model in transformers. (#1205) | Laurent Mazare | 2023-10-28 | 6 | -9/+12
|
* Make more models cloneable. (#1203) | Laurent Mazare | 2023-10-28 | 3 | -26/+26
|
* No need for the even constraint on vecdot-q40-q80. (#1202) | Laurent Mazare | 2023-10-28 | 4 | -41/+2
|
* Add the relu2 and relu6 activations. (#1201) | Laurent Mazare | 2023-10-27 | 3 | -0/+61
|
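relu2 squares the relu output and relu6 clips it at 6. Expressed with existing tensor ops (a sketch of the semantics rather than the exact code added here):

```rust
use candle_core::{Device, Result, Tensor};

fn main() -> Result<()> {
    let x = Tensor::new(&[-2f32, 1., 3., 8.], &Device::Cpu)?;
    let relu2 = x.relu()?.sqr()?;        // max(x, 0)^2
    let relu6 = x.clamp(0f32, 6f32)?;    // min(max(x, 0), 6)
    assert_eq!(relu2.to_vec1::<f32>()?, [0., 1., 9., 64.]);
    assert_eq!(relu6.to_vec1::<f32>()?, [0., 1., 3., 6.]);
    Ok(())
}
```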
* Make the whisper model cloneable (#1200) | Laurent Mazare | 2023-10-27 | 2 | -1/+11
|   * Add a quantized variant of llama2.c
|   * Clippy fixes.
|   * Make the whisper model cloneable.
* Add fuse-conv-bn method for Conv2d (#1196) | jamjamjon | 2023-10-27 | 3 | -7/+27
|   * Add fuse-conv-bn method for Conv2d
|   * no unwrap
|   * run rustfmt and clippy
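Conv/batch-norm fusion folds the BN affine transform into the conv weights so inference runs a single conv: w' = w * gamma / sqrt(var + eps) and b' = (b - mean) * gamma / sqrt(var + eps) + beta. A hypothetical helper showing the math; the method actually added to Conv2d may differ in name and plumbing:

```rust
use candle_core::{Result, Tensor};

// Shapes: w (c_out, c_in, k, k); b, gamma, beta, mean, var all (c_out).
fn fuse_conv_bn(
    w: &Tensor, b: &Tensor,
    gamma: &Tensor, beta: &Tensor,
    mean: &Tensor, var: &Tensor, eps: f64,
) -> Result<(Tensor, Tensor)> {
    // Per-channel scale: gamma / sqrt(var + eps).
    let scale = gamma.div(&(var + eps)?.sqrt()?)?;
    let c_out = scale.elem_count();
    // Scale each output-channel slice of the conv weights.
    let w = w.broadcast_mul(&scale.reshape((c_out, 1, 1, 1))?)?;
    // Fold the BN shift into the bias.
    let b = b.sub(mean)?.mul(&scale)?.add(beta)?;
    Ok((w, b))
}
```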
* Add a quantized variant of llama2.c (#1197) | Laurent Mazare | 2023-10-27 | 5 | -38/+287
|   * Add a quantized variant of llama2.c
|   * Clippy fixes.
* Minor cleanup (#1194) | Laurent Mazare | 2023-10-27 | 1 | -17/+3
|   * Add some missing backtraces.
|   * Small cleanup.
* Add some missing backtraces. (#1193) | Laurent Mazare | 2023-10-27 | 1 | -6/+12
|
* Add support for the phi-hermes finetuned model. (#1192) | Laurent Mazare | 2023-10-27 | 2 | -3/+28
|
* Use the hub model file when possible. (#1190) | Laurent Mazare | 2023-10-26 | 3 | -7/+71
|   * Use the hub model file when possible.
|   * And add a mention in the main readme.
* Fixes for jina-bert. (#1189) | Laurent Mazare | 2023-10-26 | 1 | -2/+2
|
* Add the jina-bert embeddings model. (#1187) | Laurent Mazare | 2023-10-26 | 4 | -2/+550
|   * Add the jina-bert model.
|   * Use alibi.
|   * Remove the unused pragma.
|   * Recompute the alibi embeddings.
|   * Generate the token type ids.
|   * Use the module trait.
|   * Add the jina-bert example.
|   * DType fix.
|   * Get the inference to work.
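jina-bert uses ALiBi rather than learned position embeddings: head h of H gets slope 2^(-8h/H), and the attention scores are biased by slope times the key-query distance. A sketch of the slope computation (exact for power-of-two head counts):

```rust
// ALiBi head slopes: 2^(-8 * h / num_heads) for h = 1..=num_heads,
// matching the ALiBi paper when num_heads is a power of two.
fn alibi_slopes(num_heads: usize) -> Vec<f32> {
    (1..=num_heads)
        .map(|h| 2f32.powf(-8.0 * h as f32 / num_heads as f32))
        .collect()
}

fn main() {
    // For 8 heads: [2^-1, 2^-2, ..., 2^-8].
    println!("{:?}", alibi_slopes(8));
}
```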