path: root/candle-core/src/quantized/k_quants.rs
Commit message | Author | Date | Files/Lines
* Bugfix for dequantizing q5k layers. (#1569) | Laurent Mazare | 2024-01-11 | 1 file, -4/+4
* No need for the even constraint on vecdot-q40-q80. (#1202) | Laurent Mazare | 2023-10-28 | 1 file, -5/+0
* Simd128 optimized q8k vecdot. (#1026) | Laurent Mazare | 2023-10-03 | 1 file, -0/+3
* AVX optimized q8k vecdot. (#1024) | Laurent Mazare | 2023-10-03 | 1 file, -0/+3
* neon optimized q8k multiplication. (#1021) | Laurent Mazare | 2023-10-02 | 1 file, -3/+7
* Add the q8k vec-dot multiplication. (#1019) | Laurent Mazare | 2023-10-02 | 1 file, -2/+18
* Improve the testing of the optimized quantized vec-dot ops (#1016) | Laurent Mazare | 2023-10-02 | 1 file, -2/+60
* Simd128 version of q6k vec-dot. (#1015) | Laurent Mazare | 2023-10-01 | 1 file, -0/+3
* Simd128 version of the q2k-q8k vecdot product. (#1011) | Laurent Mazare | 2023-09-30 | 1 file, -5/+3
* Simd128 q2k vecdot (#982) | Laurent Mazare | 2023-09-28 | 1 file, -0/+3
* Sketch a simd128 optimized q4k vecdot. (#977) | Laurent Mazare | 2023-09-27 | 1 file, -0/+3
* Simd128 vec-dot for q4_0. (#974) | Laurent Mazare | 2023-09-27 | 1 file, -0/+3
* simd128 optimized q8_0 vecdot (#972) | Laurent Mazare | 2023-09-27 | 1 file, -0/+3
* Fix some errors about BlockQ8_1 (#776) | zmlcc | 2023-09-08 | 1 file, -3/+5
* Neon optimized vecdot (#666) | Laurent Mazare | 2023-08-29 | 1 file, -0/+9
* Add `avx` implementations of `q2k`, `q3k` and `q5k` vec-dot functions (#654) | Lukas Kreussel | 2023-08-29 | 1 file, -0/+12
* AVX version of the q4k vecdot. (#651) | Laurent Mazare | 2023-08-29 | 1 file, -0/+3
* Neon optimized version of the q4k vecdot product. (#632) | Laurent Mazare | 2023-08-27 | 1 file, -0/+4
* Fix for q5_1 quantization. (#617) | Laurent Mazare | 2023-08-27 | 1 file, -1/+1
* Quantization tests + fix some issues. (#616) | Laurent Mazare | 2023-08-27 | 1 file, -6/+6
* More missing quantized bits. (#615) | Laurent Mazare | 2023-08-27 | 1 file, -7/+94
* Missing quants ops (#611) | Laurent Mazare | 2023-08-26 | 1 file, -13/+123
* Another transmute tweak. (#610) | Laurent Mazare | 2023-08-26 | 1 file, -20/+19
* Avoid using tmp values. (#609) | Laurent Mazare | 2023-08-26 | 1 file, -20/+8
* Add reference implementation for `q4k` and `q5k` (#586) | Lukas Kreussel | 2023-08-26 | 1 file, -4/+177
* Avoid some transmutes. (#607) | Laurent Mazare | 2023-08-25 | 1 file, -10/+5
* Neon intrinsics for the q8_0 vecdot. (#604) | Laurent Mazare | 2023-08-25 | 1 file, -0/+3
* AVX version for the q8-0 multiplications. (#598) | Laurent Mazare | 2023-08-25 | 1 file, -0/+4
* Generic implementation of vecdot for q80. (#596) | Laurent Mazare | 2023-08-25 | 1 file, -2/+18
* Reference implementations of `q2k` and `q3k` vec-dot functions (#580) | Lukas Kreussel | 2023-08-24 | 1 file, -7/+179
* Avoid some mutable variables (take 2). (#554) | Laurent Mazare | 2023-08-22 | 1 file, -23/+15
* Revert "Avoid some mut in quantized functions. (#550)" (#552) | Laurent Mazare | 2023-08-22 | 1 file, -16/+25
* Avoid some mut in quantized functions. (#550) | Laurent Mazare | 2023-08-22 | 1 file, -25/+16
* Add quantization support for `q2k`, `q3k`, `q4k` and `q5k` (#524) | Lukas Kreussel | 2023-08-22 | 1 file, -399/+574
* Neon support for quantization. (#519) | Laurent Mazare | 2023-08-19 | 1 file, -0/+6
* Basic `qmatmul` parallelization (#492) | Lukas Kreussel | 2023-08-18 | 1 file, -5/+15
* Q6K quantization (#495) | Laurent Mazare | 2023-08-17 | 1 file, -2/+207
* AVX version of the q6k vec-dot. (#493) | Laurent Mazare | 2023-08-17 | 1 file, -0/+4
* Move the avx specific bits to a separate file. (#481) | Laurent Mazare | 2023-08-17 | 1 file, -116/+45
* AVX version of the vecdot for q4_0. (#474) | Laurent Mazare | 2023-08-17 | 1 file, -0/+75
* Add vecdot for q6k-q8k. (#476) | Laurent Mazare | 2023-08-16 | 1 file, -2/+56
* Use a zipped iterator. (#475) | Laurent Mazare | 2023-08-16 | 1 file, -11/+54
* Add a kv-cache to the quantized llama example. (#466) | Laurent Mazare | 2023-08-16 | 1 file, -4/+4
* Get the ggml based llama to generate some text. (#464) | Laurent Mazare | 2023-08-16 | 1 file, -6/+7
* Quantized support for f16 and f32 (#457) | Laurent Mazare | 2023-08-15 | 1 file, -0/+74
* Split out the quantized file. (#456) | Laurent Mazare | 2023-08-15 | 1 file, -0/+728
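Many of the commits above add SIMD (AVX, Neon, Simd128) variants of "vec-dot" kernels; the "Generic implementation of vecdot for q80. (#596)" entry refers to the scalar fallback used when no SIMD path applies. The sketch below illustrates the general shape of such a scalar q8_0 vec-dot, assuming the ggml-style block layout (32 signed 8-bit quants plus one per-block scale). It is a simplified illustration, not the actual `k_quants.rs` code: the real block stores its scale as `f16`, which is flattened to `f32` here.

```rust
// Number of quantized values per q8_0 block (ggml convention).
const QK8_0: usize = 32;

// Simplified q8_0 block: the real candle/ggml struct stores `d` as f16.
#[derive(Clone)]
struct BlockQ8_0 {
    d: f32,          // per-block scale
    qs: [i8; QK8_0], // quantized values
}

// Scalar vec-dot of two q8_0 rows: for each pair of blocks, take the
// integer dot product of the quants, rescale by both block scales, and
// accumulate in f32.
fn vec_dot_q8_0(xs: &[BlockQ8_0], ys: &[BlockQ8_0]) -> f32 {
    xs.iter()
        .zip(ys.iter())
        .map(|(x, y)| {
            let isum: i32 = x
                .qs
                .iter()
                .zip(y.qs.iter())
                .map(|(&a, &b)| a as i32 * b as i32)
                .sum();
            x.d * y.d * isum as f32
        })
        .sum()
}
```

The SIMD variants in the log compute the same per-block integer dot product with vector instructions (e.g. `vdotq_s32` on Neon or `_mm256_maddubs_epi16`-style sequences on AVX) while keeping the scale handling identical.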