path: root/candle-core
Commit message (PR) · Author · Date · Files · Lines changed
* Detach all grads during backprop. (#1243) [Laurent Mazare, 2023-11-05; 1 file, -4/+21]
  - Add an environment variable to select the backprop behavior.
  - Update the comment.
* feat: add backprop for elu (#1269) [drbh, 2023-11-04; 2 files, -1/+34]
  - Cosmetic tweaks.
  - Co-authored-by: Laurent <laurent.mazare@gmail.com>
* Allow using gguf-v3 files. (#1262) [Laurent Mazare, 2023-11-03; 1 file, -5/+15]
* feat: impl backprop for erf and gelu-erf (#1258) [drbh, 2023-11-03; 3 files, -3/+59]
  - feat: unary tests added for erf and gelu-erf
  - fix: (clippy) remove immediately dereferenced ref
  - fix: improve comments with pytorch code snippet
  - fix: adjust comment typo in backprop impl
* Backprop support for conv1d (cpu only for now). (#1255) [Laurent Mazare, 2023-11-03; 1 file, -1/+38]
* Test for the transposed conv1d. (#1254) [Laurent Mazare, 2023-11-03; 2 files, -1/+17]
* Add the conv-transpose1d op. (#1251) [Laurent Mazare, 2023-11-03; 8 files, -0/+221]
  - Skeleton structure for conv-transpose1d.
  - CPU implementation for conv-transpose1d.
* Lazy detach. (#1242) [Laurent Mazare, 2023-11-02; 2 files, -10/+20]
* Add a hack for generating random uniform/normal for f16/bf16. (#1228) [Laurent Mazare, 2023-10-31; 1 file, -4/+16]
* PyO3: Add `equal` and `__richcmp__` to `candle.Tensor` (#1099) [Lukas Kreussel, 2023-10-30; 1 file, -1/+1]
  - add `equal` to tensor
  - add `__richcmp__` support for tensors and scalars
  - Add `abs` + `candle.testing`
  - remove duplicated `broadcast_shape_binary_op`
  - `candle.i16` => `candle.i64`
  - `tensor.nelements` -> `tensor.nelement`
  - Cleanup `abs`
* Support negative steps in arange. (#1218) [Laurent Mazare, 2023-10-30; 2 files, -3/+33]
* Add i64-abs. (#1216) [Laurent Mazare, 2023-10-29; 2 files, -1/+42]
* Marian MT model (#1210) [Laurent Mazare, 2023-10-29; 1 file, -10/+10]
  - Skeleton files for the marian MT model.
  - Marian initialization.
  - Implement the attention forward method.
  - Forward pass for the encoder side.
  - Expose the encoder and decoder.
  - Start plugging the decoder.
  - Forward pass for the decoder layer.
  - Set up the marian example.
  - Add some missing backtraces.
  - Bugfix.
* Fix the conv2d gradient computation. (#1214) [Laurent Mazare, 2023-10-29; 2 files, -0/+72]
* Allow for different behavior between training and eval (#1213) [Laurent Mazare, 2023-10-29; 2 files, -0/+17]
  - Forward with training.
  - Do not use dropout on vgg evaluation.
* No need for the even constraint on vecdot-q40-q80. (#1202) [Laurent Mazare, 2023-10-28; 4 files, -41/+2]
* Add a quantized variant of llama2.c (#1197) [Laurent Mazare, 2023-10-27; 2 files, -28/+2]
  - Clippy fixes.
* Add some missing backtraces. (#1193) [Laurent Mazare, 2023-10-27; 1 file, -6/+12]
* Enable the test for meshgrid + fix the implementation. (#1175) [Laurent Mazare, 2023-10-25; 1 file, -24/+28]
* Implemented meshgrid (#1174) [Wouter Doppenberg, 2023-10-25; 2 files, -0/+66]
  - Resolved feedback from LaurentMazare
  - Rustfmt
  - Updated docstring
  - Removed outdated error mode from docstring
* fix ucopy for `f64` tensors (#1170) [Ibiyemi Abiodun, 2023-10-24; 1 file, -1/+1]
* derivative for GELU (#1160) [KGrewal1, 2023-10-23; 2 files, -1/+22]
  - add tests
* Handle LongStorage in pytorch checkpoints. (#1152) [Laurent Mazare, 2023-10-22; 1 file, -0/+1]
* Expose the track-op method. (#1148) [Laurent Mazare, 2023-10-22; 1 file, -1/+1]
* Add get_on_dim. (#1142) [Laurent Mazare, 2023-10-21; 1 file, -0/+18]
* Add pad_with_same. (#1127) [Laurent Mazare, 2023-10-18; 2 files, -0/+56]
  - More model cloning.
  - More cloning on quantized models.
  - Add some tests.
* Better error message when overflowing in narrow. (#1119) [Laurent Mazare, 2023-10-18; 1 file, -9/+17]
* Refactor the pth tensor extraction. (#1109) [Laurent Mazare, 2023-10-16; 1 file, -44/+48]
* Read all the tensors in a PyTorch pth file. (#1106) [Laurent Mazare, 2023-10-16; 1 file, -0/+13]
* Improve the reshape error messages. (#1096) [Laurent Mazare, 2023-10-15; 1 file, -68/+33]
  - Add the verbose-prompt flag to the phi example.
* Avoid trying to backprop through non-differentiable layers. (#1094) [Laurent Mazare, 2023-10-14; 1 file, -2/+12]
* Create a new curand instead of reseeding. (#1089) [Laurent Mazare, 2023-10-14; 2 files, -1/+10]
* Fix the npy read function and add some testing. (#1080) [Laurent Mazare, 2023-10-12; 5 files, -2/+33]
* Use full tensors for zeros and ones (#1071) [Laurent Mazare, 2023-10-11; 1 file, -16/+6]
  - Only optimize float tensors.
* Only optimize float tensors. (#1069) [Laurent Mazare, 2023-10-10; 1 file, -0/+14]
* Remove some unused bits. (#1067) [Laurent Mazare, 2023-10-09; 1 file, -2/+1]
* Make the cuda rng seedable. (#1056) [Laurent Mazare, 2023-10-08; 4 files, -0/+16]
* Better control on the optional dequantization in QMatMul (#1049) [Laurent Mazare, 2023-10-07; 1 file, -7/+28]
  - Cosmetic change to the quantized whisper model.
  - Fix the dequantization.
  - Add the dequantize all variable.
* Add the round-to function. (#1039) [Laurent Mazare, 2023-10-05; 2 files, -0/+18]
* fix: fix index_select cuda kernel for src target dim different than ids dim when selecting dim > 0 (#1037) [Gonzalo, 2023-10-05; 2 files, -2/+13]
  - cargo fmt
* Add the rounding operators. (#1030) [Laurent Mazare, 2023-10-04; 4 files, -0/+133]
  - Avoid tracking gradients for the rounding operations.
  - Add some rounding tests.
* Simd128 optimized q8k vecdot. (#1026) [Laurent Mazare, 2023-10-03; 2 files, -0/+33]
* AVX optimized q8k vecdot. (#1024) [Laurent Mazare, 2023-10-03; 2 files, -0/+35]
* Fix for the index-select cuda setup. (#1022) [Laurent Mazare, 2023-10-03; 2 files, -1/+16]
  - Better fix + add some testing.
* neon optimized q8k multiplication. (#1021) [Laurent Mazare, 2023-10-02; 2 files, -3/+36]
  - Bugfixes.
  - simdification.
* Add the q8k vec-dot multiplication. (#1019) [Laurent Mazare, 2023-10-02; 2 files, -2/+46]
* Improve the quantized whisper setup. (#1018) [Laurent Mazare, 2023-10-02; 2 files, -17/+26]
  - Fix the config file paths.
  - Use the standard matmul where possible.
* Improve the testing of the optimized quantized vec-dot ops (#1016) [Laurent Mazare, 2023-10-02; 2 files, -5/+68]
  - Expose the unopt functions for testing.
  - Better testing of the optimized quantized computations.
* Simd128 version of q6k vec-dot. (#1015) [Laurent Mazare, 2023-10-01; 2 files, -1/+127]
  - Add a specific function for the simd128 q6k vec-dot.
  - Simdification.
  - More simdification.
* Bump the version to 0.3.0. (#1014) [Laurent Mazare, 2023-10-01; 1 file, -1/+1]
  - Changelog update.