path: root/candle-core/src/cuda_backend
Commit message | Author | Date | Files | Lines (-/+)
20241118 docs (#2629) | zachcp | 2024-11-19 | 1 | -0/+2
Support for UG kernels. (#2579) | Laurent Mazare | 2024-10-27 | 1 | -0/+21
Fix for cudnn bf16 conv2d. (#2535) | Laurent Mazare | 2024-10-02 | 2 | -10/+14
Add support for cuda streams. (#2532) | Laurent Mazare | 2024-10-02 | 1 | -0/+14
Update cudarc to 0.12. (#2451) | Laurent Mazare | 2024-08-27 | 2 | -2/+4
Fix the fast bf16 gemm cublas kernels. (#2274) | Laurent Mazare | 2024-06-18 | 1 | -5/+3
Add the layernorm specialized op. (#2212) | Laurent Mazare | 2024-05-24 | 2 | -1/+39
More efficient cuda implementation for ConvTranspose1d. (#2211) | Laurent Mazare | 2024-05-24 | 1 | -2/+73
Make it possible to use TF32 accumulation in F32 matmuls. (#2178) | Laurent Mazare | 2024-05-11 | 1 | -6/+61
Bump the version number to 0.5.1. (#2155) | Laurent Mazare | 2024-05-03 | 1 | -38/+0
F16/BF16 bugfix (bis). (#2143) | Laurent Mazare | 2024-04-29 | 1 | -14/+36
Bugfix the recent f16/bf16 changes. (#2142) | Laurent Mazare | 2024-04-29 | 1 | -8/+8
Fix sigmoid gradient calculation and move sigmoid into a specialized op (#2114) | MilkFather | 2024-04-29 | 1 | -2/+2
Add a toggle for F16/BF16 accumulation in gemm. (#2141) | Laurent Mazare | 2024-04-29 | 1 | -12/+125
Add StorageRef. (#2113) | Laurent Mazare | 2024-04-23 | 1 | -1/+38
Add a synchronize method to devices. (#2055) | Laurent Mazare | 2024-04-14 | 1 | -0/+5
Split the cuda error file. (#2003) | Laurent Mazare | 2024-04-04 | 2 | -65/+67
Relax the contiguous check for cuda kernels. (#2000) | Laurent Mazare | 2024-04-03 | 1 | -1/+6
Improve the handling of matmul with squeezed layouts. (#1998) | Laurent Mazare | 2024-04-02 | 1 | -0/+4
Backend refactoring. (#1966) | Laurent Mazare | 2024-03-29 | 4 | -0/+2576