path: root/candle-core
Commit message | Author | Age | Files | Lines
* Add i64-abs. (#1216) | Laurent Mazare | 2023-10-29 | 2 | -1/+42
* Marian MT model (#1210) | Laurent Mazare | 2023-10-29 | 1 | -10/+10
* Fix the conv2d gradient computation. (#1214) | Laurent Mazare | 2023-10-29 | 2 | -0/+72
* Allow for different behavior between training and eval (#1213) | Laurent Mazare | 2023-10-29 | 2 | -0/+17
* No need for the even constraint on vecdot-q40-q80. (#1202) | Laurent Mazare | 2023-10-28 | 4 | -41/+2
* Add a quantized variant of llama2.c (#1197) | Laurent Mazare | 2023-10-27 | 2 | -28/+2
* Add some missing backtraces. (#1193) | Laurent Mazare | 2023-10-27 | 1 | -6/+12
* Enable the test for meshgrid + fix the implementation. (#1175) | Laurent Mazare | 2023-10-25 | 1 | -24/+28
* Implemented meshgrid (#1174) | Wouter Doppenberg | 2023-10-25 | 2 | -0/+66
* fix ucopy for `f64` tensors (#1170) | Ibiyemi Abiodun | 2023-10-24 | 1 | -1/+1
* derivative for GELU (#1160) | KGrewal1 | 2023-10-23 | 2 | -1/+22
* Handle LongStorage in pytorch checkpoints. (#1152) | Laurent Mazare | 2023-10-22 | 1 | -0/+1
* Expose the track-op method. (#1148) | Laurent Mazare | 2023-10-22 | 1 | -1/+1
* Add get_on_dim. (#1142) | Laurent Mazare | 2023-10-21 | 1 | -0/+18
* Add pad_with_same. (#1127) | Laurent Mazare | 2023-10-18 | 2 | -0/+56
* Better error message when overflowing in narrow. (#1119) | Laurent Mazare | 2023-10-18 | 1 | -9/+17
* Refactor the pth tensor exctraction. (#1109) | Laurent Mazare | 2023-10-16 | 1 | -44/+48
* Read all the tensors in a PyTorch pth file. (#1106) | Laurent Mazare | 2023-10-16 | 1 | -0/+13
* Improve the reshape error messages. (#1096) | Laurent Mazare | 2023-10-15 | 1 | -68/+33
* Avoid trying to backprop through non-differentiable layers. (#1094) | Laurent Mazare | 2023-10-14 | 1 | -2/+12
* Create a new curand instead of reseeding. (#1089) | Laurent Mazare | 2023-10-14 | 2 | -1/+10
* Fix the npy read function and add some testing. (#1080) | Laurent Mazare | 2023-10-12 | 5 | -2/+33
* Use full tensors for zeros and ones (#1071) | Laurent Mazare | 2023-10-11 | 1 | -16/+6
* Only optimize float tensors. (#1069) | Laurent Mazare | 2023-10-10 | 1 | -0/+14
* Remove some unusued bits. (#1067) | Laurent Mazare | 2023-10-09 | 1 | -2/+1
* Make the cuda rng seedable. (#1056) | Laurent Mazare | 2023-10-08 | 4 | -0/+16
* Better control on the optional dequantization in QMatMul (#1049) | Laurent Mazare | 2023-10-07 | 1 | -7/+28
* Add the round-to function. (#1039) | Laurent Mazare | 2023-10-05 | 2 | -0/+18
* fix: fix index_select cuda kernel for src target dim different than ids dim w... | Gonzalo | 2023-10-05 | 2 | -2/+13
* Add the rounding operators. (#1030) | Laurent Mazare | 2023-10-04 | 4 | -0/+133
* Simd128 optimized q8k vecdot. (#1026) | Laurent Mazare | 2023-10-03 | 2 | -0/+33
* AVX optimized q8k vecdot. (#1024) | Laurent Mazare | 2023-10-03 | 2 | -0/+35
* Fix for the index-select cuda setup. (#1022) | Laurent Mazare | 2023-10-03 | 2 | -1/+16
* neon optimized q8k multiplication. (#1021) | Laurent Mazare | 2023-10-02 | 2 | -3/+36
* Add the q8k vec-dot multiplication. (#1019) | Laurent Mazare | 2023-10-02 | 2 | -2/+46
* Improve the quantized whisper setup. (#1018) | Laurent Mazare | 2023-10-02 | 2 | -17/+26
* Improve the testing of the optimized quantized vec-dot ops (#1016) | Laurent Mazare | 2023-10-02 | 2 | -5/+68
* Simd128 version of q6k vec-dot. (#1015) | Laurent Mazare | 2023-10-01 | 2 | -1/+127
* Bump the version to 0.3.0. (#1014) | Laurent Mazare | 2023-10-01 | 1 | -1/+1
* Simd128 version of the q2k-q8k vecdot product. (#1011) | Laurent Mazare | 2023-09-30 | 3 | -46/+76
* Quantized version of mistral. (#1009) | Laurent Mazare | 2023-09-30 | 1 | -9/+27
* fix: add missing gpu fill_* (#996) | Gonzalo | 2023-09-29 | 1 | -0/+26
* fixes slice_scatter dim type (#988) | Gonzalo | 2023-09-29 | 1 | -1/+1
* Simd128 q2k vecdot (#982) | Laurent Mazare | 2023-09-28 | 2 | -4/+57
* Optimize the index-select cuda kernel. (#976) | Laurent Mazare | 2023-09-28 | 1 | -4/+4
* Sketch a simd128 optimized q4k vecdot. (#977) | Laurent Mazare | 2023-09-27 | 2 | -1/+97
* Simd128 vec-dot for q4_0. (#974) | Laurent Mazare | 2023-09-27 | 2 | -1/+54
* simd128 optimized q8_0 vecdot (#972) | Laurent Mazare | 2023-09-27 | 3 | -0/+54
* Use the gelu-erf activation. (#969) | Laurent Mazare | 2023-09-26 | 1 | -3/+3
* Avoid some overflows on wasm32. (#968) | Laurent Mazare | 2023-09-26 | 2 | -3/+14