Commit log for `candle-core/src/quantized/mod.rs` in `forks/candle.git` (branch: `main`)
| Commit message | Author | Date | Files | Lines |
|---|---|---|---|---|
| Add a forward_via_f16 method to the qmatmul op. (#2138) | Laurent Mazare | 2024-04-28 | 1 | -0/+19 |
| Add the cuda dequantize f16 kernels. (#2137) | Laurent Mazare | 2024-04-28 | 1 | -4/+43 |
| Fix dequantization. (#1823) | Laurent Mazare | 2024-03-08 | 1 | -1/+1 |
| Cuda acceleration for quantized model. (#1754) | Laurent Mazare | 2024-02-25 | 1 | -4/+32 |
| Qmetal tweaks (#1704) | Laurent Mazare | 2024-02-13 | 1 | -91/+12 |
| Fixing quantized llama demo on metal. (#1703) | Nicolas Patry | 2024-02-13 | 1 | -0/+12 |
| Quantized GGUF style (#1523) | Nicolas Patry | 2024-01-17 | 1 | -48/+254 |
| Implement the module trait directly for QMatMul. (#1372) | Laurent Mazare | 2023-11-25 | 1 | -2/+2 |
| Better control on the optional dequantization in QMatMul (#1049) | Laurent Mazare | 2023-10-07 | 1 | -7/+28 |
| Improve the quantized whisper setup. (#1018) | Laurent Mazare | 2023-10-02 | 1 | -10/+19 |
| simd128 optimized q8_0 vecdot (#972) | Laurent Mazare | 2023-09-27 | 1 | -0/+2 |
| Add a quantized version of the t5 model. (#921) | Laurent Mazare | 2023-09-21 | 1 | -1/+1 |
| Support for quantized tensors in the python api. (#706) | Laurent Mazare | 2023-09-01 | 1 | -3/+11 |
| Llama quantization. (#625) | Laurent Mazare | 2023-08-27 | 1 | -0/+4 |
| Add a function to write gguf files. (#585) | Laurent Mazare | 2023-08-24 | 1 | -1/+38 |
| Preliminary GGUF support. (#557) | Laurent Mazare | 2023-08-23 | 1 | -0/+1 |
| Add quantization support for `q2k`, `q3k`, `q4k` and `q5k` (#524) | Lukas Kreussel | 2023-08-22 | 1 | -0/+1 |
| Neon support for quantization. (#519) | Laurent Mazare | 2023-08-19 | 1 | -0/+2 |
| Add a simple Module trait and implement it for the various nn layers (#500) | Laurent Mazare | 2023-08-18 | 1 | -0/+1 |
| Tensor -> QTensor conversion (#496) | Laurent Mazare | 2023-08-18 | 1 | -3/+40 |
| Relax the requirements on CustomOp. (#486) | Laurent Mazare | 2023-08-17 | 1 | -3/+3 |
| Move the avx specific bits to a separate file. (#481) | Laurent Mazare | 2023-08-17 | 1 | -0/+2 |
| Get the ggml based llama to generate some text. (#464) | Laurent Mazare | 2023-08-16 | 1 | -13/+18 |
| Add quantized tensors. (#458) | Laurent Mazare | 2023-08-15 | 1 | -1/+113 |
| Split out the quantized file. (#456) | Laurent Mazare | 2023-08-15 | 1 | -0/+82 |