index: forks/candle.git (branch: main)
path: root/candle-core/src/quantized/cuda.rs
| Commit message | Author | Age | Files | Lines |
|---|---|---|---|---|
| Clippy fixes for the cuda feature. (#2650) | Laurent Mazare | 2024-11-29 | 1 | -1/+1 |
| Cuda quantized mmv bugfix. (#2526) | Laurent Mazare | 2024-10-01 | 1 | -1/+25 |
| Yet another cuda qmm padding fix. (#2509) | Laurent Mazare | 2024-09-30 | 1 | -25/+55 |
| Add the cuda dequantize f16 kernels. (#2137) | Laurent Mazare | 2024-04-28 | 1 | -13/+75 |
| Add more QMMV cuda kernels. (#2077) | Laurent Mazare | 2024-04-18 | 1 | -8/+10 |
| Add the mmv kernels for small batch sizes. (#2075) | Laurent Mazare | 2024-04-16 | 1 | -18/+46 |
| Fix for the batch dim in the quantized matmul example. (#2073) | Laurent Mazare | 2024-04-15 | 1 | -1/+1 |
| Add a function to clear the KV cache in falcon. (#2066) | Laurent Mazare | 2024-04-15 | 1 | -0/+1 |
| Faster kernels for quantized matmul on cuda (#2060) | Laurent Mazare | 2024-04-15 | 1 | -6/+137 |
| Quantized cuda tweaks. (#1981) | Laurent Mazare | 2024-04-01 | 1 | -89/+62 |
| Switch the default to using the faster kernels. (#1978) | Laurent Mazare | 2024-04-01 | 1 | -1/+1 |
| More ggml cuda kernels (#1977) | Laurent Mazare | 2024-04-01 | 1 | -7/+147 |
| Properly handle the batch dimension in cuda quantized matmul. (#1832) | Laurent Mazare | 2024-03-10 | 1 | -1/+1 |
| Handle Q5_0 and Q5_1 quants in cuda. | laurent | 2024-02-29 | 1 | -16/+38 |
| Fix the block size for some cuda kernels. (#1767) | Laurent Mazare | 2024-02-27 | 1 | -13/+15 |
| Cuda kernel for dequantizing q8k. (#1760) | Laurent Mazare | 2024-02-26 | 1 | -18/+16 |
| Cuda acceleration for quantized model. (#1754) | Laurent Mazare | 2024-02-25 | 1 | -0/+321 |