Commit log for forks/candle.git (branch: main), path: candle-core/src/quantized
Commit message | Author | Date | Files | Lines (-/+)
Add a Context trait similar to anyhow::Context. (#2676) | Laurent Mazare | 2024-12-22 | 2 | -4/+4
Clippy fixes for the cuda feature. (#2650) | Laurent Mazare | 2024-11-29 | 1 | -1/+1
Lint fixes introduced with Rust 1.83 (#2646) | Anubhab Bandyopadhyay | 2024-11-28 | 2 | -3/+3
20241118 docs (#2629) | zachcp | 2024-11-19 | 3 | -3/+3
Cuda quantized mmv bugfix. (#2526) | Laurent Mazare | 2024-10-01 | 1 | -1/+25
Yet another cuda qmm padding fix. (#2509) | Laurent Mazare | 2024-09-30 | 1 | -25/+55
Automatically upcast for to_u64 (#2244) | Eric Buehler | 2024-06-04 | 1 | -1/+7
Bump the version number to 0.5.1. (#2155) | Laurent Mazare | 2024-05-03 | 1 | -1/+0
Add a forward_via_f16 method to the qmatmul op. (#2138) | Laurent Mazare | 2024-04-28 | 1 | -0/+19
Add the cuda dequantize f16 kernels. (#2137) | Laurent Mazare | 2024-04-28 | 3 | -17/+122
Handle multiple dimensions in metal QMM + two fixes. (#2097) | Laurent Mazare | 2024-04-20 | 1 | -15/+20
Add more QMMV cuda kernels. (#2077) | Laurent Mazare | 2024-04-18 | 1 | -8/+10
Add the mmv kernels for small batch sizes. (#2075) | Laurent Mazare | 2024-04-16 | 1 | -18/+46
Fix for the batch dim in the quantized matmul example. (#2073) | Laurent Mazare | 2024-04-15 | 1 | -1/+1
Add a function to clear the KV cache in falcon. (#2066) | Laurent Mazare | 2024-04-15 | 1 | -0/+1
Faster kernels for quantized matmul on cuda (#2060) | Laurent Mazare | 2024-04-15 | 1 | -6/+137
Handle the batch dimension in quantized MMV on metal. (#2022) | Laurent Mazare | 2024-04-06 | 1 | -1/+4
Quantized cuda tweaks. (#1981) | Laurent Mazare | 2024-04-01 | 1 | -89/+62
Switch the default to using the faster kernels. (#1978) | Laurent Mazare | 2024-04-01 | 1 | -1/+1
More ggml cuda kernels (#1977) | Laurent Mazare | 2024-04-01 | 1 | -7/+147
Properly handle the batch dimension in cuda quantized matmul. (#1832) | Laurent Mazare | 2024-03-10 | 1 | -1/+1
Fix dequantization. (#1823) | Laurent Mazare | 2024-03-08 | 1 | -1/+1
Improve metal buffer usage (#1807) | ivarflakstad | 2024-03-07 | 1 | -2/+7
Handle Q5_0 and Q5_1 quants in cuda. | laurent | 2024-02-29 | 1 | -16/+38
Fix the block size for some cuda kernels. (#1767) | Laurent Mazare | 2024-02-27 | 1 | -13/+15
Cuda kernel for dequantizing q8k. (#1760) | Laurent Mazare | 2024-02-26 | 1 | -18/+16
Cuda acceleration for quantized model. (#1754) | Laurent Mazare | 2024-02-25 | 6 | -48/+430
Qmetal tweaks (#1704) | Laurent Mazare | 2024-02-13 | 3 | -100/+141
Fixing quantized llama demo on metal. (#1703) | Nicolas Patry | 2024-02-13 | 3 | -0/+19
Quantized GGUF style (#1523) | Nicolas Patry | 2024-01-17 | 4 | -82/+485
Bugfix for dequantizing q5k layers. (#1569) | Laurent Mazare | 2024-01-11 | 1 | -4/+4
Simpler repro for the neon optimization issue + bugfix (#1544) | Laurent Mazare | 2024-01-07 | 1 | -152/+56
Fix the quantized mistral example. (#1478) | Laurent Mazare | 2023-12-25 | 1 | -1/+1
Fix a couple typos (#1451) | Laurent Mazare | 2023-12-17 | 2 | -3/+3
Implement the module trait directly for QMatMul. (#1372) | Laurent Mazare | 2023-11-25 | 1 | -2/+2
Allow using gguf-v3 files. (#1262) | Laurent Mazare | 2023-11-03 | 1 | -5/+15
No need for the even constraint on vecdot-q40-q80. (#1202) | Laurent Mazare | 2023-10-28 | 4 | -41/+2
Add a quantized variant of llama2.c (#1197) | Laurent Mazare | 2023-10-27 | 2 | -28/+2
Better control on the optional dequantization in QMatMul (#1049) | Laurent Mazare | 2023-10-07 | 1 | -7/+28
Simd128 optimized q8k vecdot. (#1026) | Laurent Mazare | 2023-10-03 | 2 | -0/+33
AVX optimized q8k vecdot. (#1024) | Laurent Mazare | 2023-10-03 | 2 | -0/+35
neon optimized q8k multiplication. (#1021) | Laurent Mazare | 2023-10-02 | 2 | -3/+36
Add the q8k vec-dot multiplication. (#1019) | Laurent Mazare | 2023-10-02 | 1 | -2/+18
Improve the quantized whisper setup. (#1018) | Laurent Mazare | 2023-10-02 | 1 | -10/+19
Improve the testing of the optimized quantized vec-dot ops (#1016) | Laurent Mazare | 2023-10-02 | 1 | -2/+60
Simd128 version of q6k vec-dot. (#1015) | Laurent Mazare | 2023-10-01 | 2 | -1/+127
Simd128 version of the q2k-q8k vecdot product. (#1011) | Laurent Mazare | 2023-09-30 | 2 | -45/+75
Simd128 q2k vecdot (#982) | Laurent Mazare | 2023-09-28 | 2 | -4/+57
Sketch a simd128 optimized q4k vecdot. (#977) | Laurent Mazare | 2023-09-27 | 2 | -1/+97
Simd128 vec-dot for q4_0. (#974) | Laurent Mazare | 2023-09-27 | 2 | -1/+54
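
The most recent entry above, #2676, adds a Context trait similar to anyhow::Context, i.e. an extension trait that attaches a human-readable message to an error as it propagates. Purely as a hedged, self-contained sketch of that general pattern (this is not candle's actual Context trait; the type and method names here are illustrative assumptions):

```rust
// Hypothetical sketch of an anyhow-style Context trait: an extension trait on
// Result that wraps the underlying error together with a descriptive message.
// Illustrative only; not candle's real API.
use std::fmt;

#[derive(Debug)]
struct ContextError {
    context: String,
    source: Box<dyn std::error::Error + Send + Sync>,
}

impl fmt::Display for ContextError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.context, self.source)
    }
}

impl std::error::Error for ContextError {}

/// Extension trait adding `.context(...)` to any Result whose error is boxable.
trait Context<T> {
    fn context<C: fmt::Display>(self, ctx: C) -> Result<T, ContextError>;
}

impl<T, E> Context<T> for Result<T, E>
where
    E: std::error::Error + Send + Sync + 'static,
{
    fn context<C: fmt::Display>(self, ctx: C) -> Result<T, ContextError> {
        self.map_err(|e| ContextError {
            context: ctx.to_string(),
            source: Box::new(e),
        })
    }
}

fn main() {
    // Attach a message describing what was being attempted when the error occurred.
    // "model.gguf" is a placeholder path used only for this example.
    let res = std::fs::read("model.gguf").context("reading the quantized weights");
    if let Err(e) = res {
        println!("{e}");
    }
}
```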
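Several other entries revolve around the QMatMul op: #1372 implements the Module trait directly for it, and #1049 adds finer control over its optional dequantization. As a rough illustration of that pattern only (a hypothetical, self-contained sketch; the tensor types, the quantization scheme, and the trait shape are assumptions, not candle's actual types or signatures):

```rust
// Hypothetical QMatMul-style op: the weight is stored either quantized or as
// plain f32s, and `forward` dispatches between the two representations.
// Illustrative only; not candle's real API.

/// Minimal stand-in for a 2D tensor: a flat buffer plus (rows, cols).
#[derive(Clone, Debug)]
struct Tensor {
    data: Vec<f32>,
    rows: usize,
    cols: usize,
}

/// 8-bit symmetric quantization with a single per-tensor scale.
#[derive(Debug)]
struct QTensor {
    data: Vec<i8>,
    scale: f32,
    rows: usize,
    cols: usize,
}

impl QTensor {
    fn quantize(t: &Tensor) -> Self {
        let max = t.data.iter().fold(0f32, |m, v| m.max(v.abs()));
        let scale = if max == 0.0 { 1.0 } else { max / 127.0 };
        let data = t.data.iter().map(|v| (v / scale).round() as i8).collect();
        QTensor { data, scale, rows: t.rows, cols: t.cols }
    }

    fn dequantize(&self) -> Tensor {
        let data = self.data.iter().map(|&v| v as f32 * self.scale).collect();
        Tensor { data, rows: self.rows, cols: self.cols }
    }
}

/// Module trait in the spirit of candle's: a single forward pass over a tensor.
trait Module {
    fn forward(&self, xs: &Tensor) -> Tensor;
}

/// The weight is either kept quantized or eagerly dequantized, mirroring the
/// "optional dequantization" control mentioned in #1049.
enum QMatMul {
    Quantized(QTensor),
    Dequantized(Tensor),
}

fn matmul(a: &Tensor, b: &Tensor) -> Tensor {
    assert_eq!(a.cols, b.rows);
    let mut out = vec![0f32; a.rows * b.cols];
    for i in 0..a.rows {
        for k in 0..a.cols {
            let av = a.data[i * a.cols + k];
            for j in 0..b.cols {
                out[i * b.cols + j] += av * b.data[k * b.cols + j];
            }
        }
    }
    Tensor { data: out, rows: a.rows, cols: b.cols }
}

impl Module for QMatMul {
    fn forward(&self, xs: &Tensor) -> Tensor {
        match self {
            // Quantized path: here it simply dequantizes on the fly before the
            // f32 matmul; a real kernel would accumulate in the quantized domain.
            QMatMul::Quantized(w) => matmul(xs, &w.dequantize()),
            // Pre-dequantized path: a plain f32 matmul.
            QMatMul::Dequantized(w) => matmul(xs, w),
        }
    }
}

fn main() {
    let w = Tensor { data: vec![1.0, -2.0, 0.5, 3.0], rows: 2, cols: 2 };
    let x = Tensor { data: vec![1.0, 1.0], rows: 1, cols: 2 };
    let qmm = QMatMul::Quantized(QTensor::quantize(&w));
    println!("{:?}", qmm.forward(&x).data);
    let qmm_deq = QMatMul::Dequantized(w.clone());
    println!("{:?}", qmm_deq.forward(&x).data);
}
```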