forks/candle.git (branch: main)
path: root/candle-core/src/quantized
Commit message | Author | Date | Files | Lines (-/+)
simd128 optimized q8_0 vecdot (#972) | Laurent Mazare | 2023-09-27 | 3 | -0/+54
Use the gelu-erf activation. (#969) | Laurent Mazare | 2023-09-26 | 1 | -3/+3
Avoid some overflows on wasm32. (#968) | Laurent Mazare | 2023-09-26 | 2 | -3/+14
Add a quantized version of the t5 model. (#921) | Laurent Mazare | 2023-09-21 | 1 | -1/+1
Fix some errors about BlockQ8_1 (#776) | zmlcc | 2023-09-08 | 1 | -3/+5
Add `ggufv2` support (#725) | Lukas Kreussel | 2023-09-03 | 1 | -21/+97
Support for quantized tensors in the python api. (#706) | Laurent Mazare | 2023-09-01 | 1 | -3/+11
Small cleanups (avoid some possible mutations) (#670) | Laurent Mazare | 2023-08-30 | 1 | -99/+59
Neon optimized vecdot (#666) | Laurent Mazare | 2023-08-29 | 2 | -8/+369
Add `avx` implemenetations of `q2k`, `q3k` and `q5k` vec-dot functions (#654) | Lukas Kreussel | 2023-08-29 | 2 | -8/+403
AVX version of the q4k vecdot. (#651) | Laurent Mazare | 2023-08-29 | 2 | -9/+120
Neon optimized version of the q4k vecdot product. (#632) | Laurent Mazare | 2023-08-27 | 2 | -1/+99
Llama quantization. (#625) | Laurent Mazare | 2023-08-27 | 1 | -0/+4
Add the quantize command. (#624) | Laurent Mazare | 2023-08-27 | 1 | -1/+2
Fix for q5_1 quantization. (#617) | Laurent Mazare | 2023-08-27 | 1 | -1/+1
Quantization tests + fix some issues. (#616) | Laurent Mazare | 2023-08-27 | 1 | -6/+6
More missing quantized bits. (#615) | Laurent Mazare | 2023-08-27 | 1 | -7/+94
Missing quants ops (#611) | Laurent Mazare | 2023-08-26 | 1 | -13/+123
Another transmute tweak. (#610) | Laurent Mazare | 2023-08-26 | 1 | -20/+19
Avoid using tmp values. (#609) | Laurent Mazare | 2023-08-26 | 1 | -20/+8
Add reference implementation for `q4k` and `q5k` (#586) | Lukas Kreussel | 2023-08-26 | 1 | -4/+177
Avoid some transmutes. (#607) | Laurent Mazare | 2023-08-25 | 1 | -10/+5
Neon intrinsics for the q8_0 vecdot. (#604) | Laurent Mazare | 2023-08-25 | 2 | -0/+64
AVX version for the q8-0 multiplications. (#598) | Laurent Mazare | 2023-08-25 | 2 | -1/+23
Generic implementation of vecdot for q80. (#596) | Laurent Mazare | 2023-08-25 | 1 | -2/+18
Add a function to write gguf files. (#585) | Laurent Mazare | 2023-08-24 | 2 | -4/+163
Referenze implementations of `q2k` and `q3k` vec-dot functions (#580) | Lukas Kreussel | 2023-08-24 | 1 | -7/+179
GGUF support in the quantized model. (#559) | Laurent Mazare | 2023-08-23 | 1 | -2/+88
Handle GGUF files in tensor-tools. (#558) | Laurent Mazare | 2023-08-23 | 1 | -2/+10
Preliminary GGUF support. (#557) | Laurent Mazare | 2023-08-23 | 2 | -0/+221
Avoid some mutable variables (take 2). (#554) | Laurent Mazare | 2023-08-22 | 2 | -37/+29
Revert "Avoid some mut in quantized functions. (#550)" (#552) | Laurent Mazare | 2023-08-22 | 2 | -30/+39
Avoid some mut in quantized functions. (#550) | Laurent Mazare | 2023-08-22 | 2 | -39/+30
Add quantization support for `q2k`, `q3k`, `q4k` and `q5k` (#524) | Lukas Kreussel | 2023-08-22 | 3 | -399/+901
Neon support for quantization. (#519) | Laurent Mazare | 2023-08-19 | 3 | -0/+228
Basic `qmatmul` parallelization (#492) | Lukas Kreussel | 2023-08-18 | 1 | -5/+15
Add a simple Module trait and implement it for the various nn layers (#500) | Laurent Mazare | 2023-08-18 | 1 | -0/+1
Tensor -> QTensor conversion (#496) | Laurent Mazare | 2023-08-18 | 2 | -4/+41
Q6K quantization (#495) | Laurent Mazare | 2023-08-17 | 1 | -2/+207
AVX version of the q6k vec-dot. (#493) | Laurent Mazare | 2023-08-17 | 2 | -1/+104
Relax the requirements on CustomOp. (#486) | Laurent Mazare | 2023-08-17 | 1 | -3/+3
Move the avx specific bits to a separate file. (#481) | Laurent Mazare | 2023-08-17 | 3 | -116/+119
AVX version of the vecdot for q4_0. (#474) | Laurent Mazare | 2023-08-17 | 1 | -0/+75
Add vecdot for q6k-q8k. (#476) | Laurent Mazare | 2023-08-16 | 1 | -2/+56
Use a zipped iterator. (#475) | Laurent Mazare | 2023-08-16 | 1 | -11/+54
Add a kv-cache to the quantized llama example. (#466) | Laurent Mazare | 2023-08-16 | 1 | -4/+4
Get the ggml based llama to generate some text. (#464) | Laurent Mazare | 2023-08-16 | 3 | -23/+39
Add quantized tensors. (#458) | Laurent Mazare | 2023-08-15 | 2 | -106/+139
Quantized support for f16 and f32 (#457) | Laurent Mazare | 2023-08-15 | 1 | -0/+74
Split out the quantized file. (#456) | Laurent Mazare | 2023-08-15 | 3 | -0/+1104