forks/candle.git

	Commit message	Author	Age	Files	Lines
*	Clippy fixes for the cuda feature. (#2650)	Laurent Mazare	2024-11-29	1	-1/+1
*	Cuda quantized mmv bugfix. (#2526)	Laurent Mazare	2024-10-01	1	-1/+25
*	Yet another cuda qmm padding fix. (#2509)	Laurent Mazare	2024-09-30	1	-25/+55
*	Add the cuda dequantize f16 kernels. (#2137)	Laurent Mazare	2024-04-28	1	-13/+75
*	Add more QMMV cuda kernels. (#2077)	Laurent Mazare	2024-04-18	1	-8/+10
*	Add the mmv kernels for small batch sizes. (#2075)	Laurent Mazare	2024-04-16	1	-18/+46
*	Fix for the batch dim in the quantized matmul example. (#2073)	Laurent Mazare	2024-04-15	1	-1/+1
*	Add a function to clear the KV cache in falcon. (#2066)	Laurent Mazare	2024-04-15	1	-0/+1
*	Faster kernels for quantized matmul on cuda (#2060)	Laurent Mazare	2024-04-15	1	-6/+137
*	Quantized cuda tweaks. (#1981)	Laurent Mazare	2024-04-01	1	-89/+62
*	Switch the default to using the faster kernels. (#1978)	Laurent Mazare	2024-04-01	1	-1/+1
*	More ggml cuda kernels (#1977)	Laurent Mazare	2024-04-01	1	-7/+147
*	Properly handle the batch dimension in cuda quantized matmul. (#1832)	Laurent Mazare	2024-03-10	1	-1/+1
*	Handle Q5_0 and Q5_1 quants in cuda.	laurent	2024-02-29	1	-16/+38
*	Fix the block size for some cuda kernels. (#1767)	Laurent Mazare	2024-02-27	1	-13/+15
*	Cuda kernel for dequantizing q8k. (#1760)	Laurent Mazare	2024-02-26	1	-18/+16
*	Cuda acceleration for quantized model. (#1754)	Laurent Mazare	2024-02-25	1	-0/+321