path: root/candle-core
Commit message | Author | Age | Files | Lines
* Bump the ug dependency. (#2720) | Laurent Mazare | 2025-01-16 | 1 | -1/+1
* Clippy fixes for 1.84. (#2710) | Laurent Mazare | 2025-01-10 | 1 | -4/+1
* Fix a cuda warning. (#2693) | Laurent Mazare | 2024-12-31 | 1 | -39/+44
* Add a Context trait similar to anyhow::Context. (#2676) | Laurent Mazare | 2024-12-22 | 6 | -16/+76
* add scatter add (#2656) | zachcp | 2024-12-01 | 1 | -0/+1
* add u32 - U32 gather (#2653) | zachcp | 2024-11-30 | 1 | -0/+1
* Clippy fixes for the cuda feature. (#2650) | Laurent Mazare | 2024-11-29 | 2 | -11/+11
* Lint fixes introduced with Rust 1.83 (#2646) | Anubhab Bandyopadhyay | 2024-11-28 | 5 | -16/+16
* fix typo (#2606) | Andrei Fajardo | 2024-11-23 | 1 | -1/+1
* 20241118 docs (#2629) | zachcp | 2024-11-19 | 22 | -12/+47
* Add max-all/min-all. (#2616) | Laurent Mazare | 2024-11-14 | 1 | -0/+36
* Add some missing index-select metal kernels. (#2613) | Laurent Mazare | 2024-11-12 | 1 | -1/+10
* Update docs (#2553) | zachcp | 2024-11-11 | 1 | -0/+14
* Add some fast Metal MLX SDPA kernels (#2584) | Eric Buehler | 2024-11-05 | 2 | -0/+12
* UG metal integration. (#2580) | Laurent Mazare | 2024-10-27 | 5 | -10/+87
* Support for UG kernels. (#2579) | Laurent Mazare | 2024-10-27 | 7 | -2/+137
* Testcases (#2567) | Anubhab Bandyopadhyay | 2024-10-17 | 2 | -3/+278
* Switch to using the MLX matmul by default. (#2547) | Laurent Mazare | 2024-10-06 | 1 | -3/+3
* Fix for cudnn bf16 conv2d. (#2535) | Laurent Mazare | 2024-10-02 | 2 | -10/+14
* Add support for cuda streams. (#2532) | Laurent Mazare | 2024-10-02 | 3 | -0/+24
* Efficient implementation of `Tensor::ones()` for `metal` (#2512) | Anubhab Bandyopadhyay | 2024-10-01 | 2 | -4/+62
* Cuda quantized mmv bugfix. (#2526) | Laurent Mazare | 2024-10-01 | 1 | -1/+25
* Yet another cuda qmm padding fix. (#2509) | Laurent Mazare | 2024-09-30 | 1 | -25/+55
* Bugfix for the metal elu kernel. (#2490) | Laurent Mazare | 2024-09-21 | 1 | -0/+13
* Metal commands refactoring (#2489) | Laurent Mazare | 2024-09-21 | 2 | -99/+113
* Improve error message (#2485) | ivnsch | 2024-09-20 | 1 | -1/+5
* Add a couple cast metal kernels. (#2479) | Laurent Mazare | 2024-09-15 | 1 | -8/+31
* Export TensorIndexer public to candle users (#2477) | Shengtuo Hu | 2024-09-13 | 1 | -1/+1
* Missing metal kernels. (#2474) | Laurent Mazare | 2024-09-12 | 1 | -0/+2
* Hook the MLX matmul kernels in candle-core. (#2473) | Laurent Mazare | 2024-09-12 | 2 | -0/+38
* Use the new MLX kernels to handle the BF16 matmul. (#2470) | Laurent Mazare | 2024-09-11 | 2 | -26/+46
* Complete the missing backticks in the comments (#2469) | hongmengning | 2024-09-11 | 1 | -0/+3
* Update cudarc to 0.12. (#2451) | Laurent Mazare | 2024-08-27 | 2 | -2/+4
* Stream tensor (#2429) | Laurent Mazare | 2024-08-17 | 2 | -0/+208
* Support Minus(u) for arbitrary values of u, e.g. Minus(3). (#2428) | Laurent Mazare | 2024-08-17 | 1 | -0/+4
* Add documentation examples for `Tensor::i` and `Tensor::narrow` methods (#2308) | Carsten Csiky | 2024-08-10 | 2 | -8/+169
* optimize gradient for silu a bit (#2393) | MilkFather | 2024-08-04 | 1 | -2/+2
* Revert the bf16 gemm metal changes for now. (#2386) | Laurent Mazare | 2024-08-01 | 1 | -2/+2
* Add a minimal test for the metal bf16 matmul. (#2381) | Laurent Mazare | 2024-08-01 | 1 | -0/+20
* Enable BF16 on metal. (#2380) | Laurent Mazare | 2024-08-01 | 1 | -0/+1
* Add get_ids to GradStore (#2379) | Takanori MAEHARA | 2024-08-01 | 1 | -0/+5
* Use BF16 on metal when possible. (#2378) | Laurent Mazare | 2024-08-01 | 1 | -0/+16
* Fix log_sum_exp to handle large positive/negative inputs (#2367) | Yun-Jhong Wu | 2024-08-01 | 2 | -6/+34
* Enable the affine kernel for u8/u32. (#2376) | Laurent Mazare | 2024-08-01 | 1 | -0/+2
* Add support for Llama 3.1 (#2359) | Eric Buehler | 2024-07-26 | 5 | -10/+10
* Fix for backprop in ConvTranspose2D with stride of 2 (#2337) | Ivor Wanders | 2024-07-17 | 2 | -2/+99
* Fix Elu gradient NaN on large input (#2328) | Alexey Gerasev | 2024-07-16 | 1 | -1/+2
* Add a basic metal example with capture (#2324) | Laurent Mazare | 2024-07-09 | 3 | -1/+39
* Fix a bug in the metal implemtation of col2im1d. (#2284) | Laurent Mazare | 2024-06-22 | 1 | -1/+6
* Fix the fast bf16 gemm cublas kernels. (#2274) | Laurent Mazare | 2024-06-18 | 4 | -12/+24