path: root/candle-core/tests/quantized_tests.rs
Commit message | Author | Date | Files | Lines
* Add the cuda dequantize f16 kernels. (#2137) | Laurent Mazare | 2024-04-28 | 1 | -1/+120
* Add more QMMV cuda kernels. (#2077) | Laurent Mazare | 2024-04-18 | 1 | -7/+15
* Add the mmv kernels for small batch sizes. (#2075) | Laurent Mazare | 2024-04-16 | 1 | -1/+35
* Fix for the batch dim in the quantized matmul example. (#2073) | Laurent Mazare | 2024-04-15 | 1 | -36/+36
* Add a cuda kernel for dequantizing q8_0. (#1804) | Laurent Mazare | 2024-03-05 | 1 | -4/+0
* Handle Q5_0 and Q5_1 quants in cuda. | laurent | 2024-02-29 | 1 | -8/+0
* Fix the block size for some cuda kernels. (#1767) | Laurent Mazare | 2024-02-27 | 1 | -32/+0
* Quantized GGUF style (#1523) | Nicolas Patry | 2024-01-17 | 1 | -143/+430
* Bugfix for dequantizing q5k layers. (#1569) | Laurent Mazare | 2024-01-11 | 1 | -1/+1
* Simpler repro for the neon optimization issue + bugfix (#1544) | Laurent Mazare | 2024-01-07 | 1 | -16/+41
* Implement the module trait directly for QMatMul. (#1372) | Laurent Mazare | 2023-11-25 | 1 | -1/+1
* Add the q8k vec-dot multiplication. (#1019) | Laurent Mazare | 2023-10-02 | 1 | -0/+28
* Improve the quantized whisper setup. (#1018) | Laurent Mazare | 2023-10-02 | 1 | -7/+7
* Improve the testing of the optimized quantized vec-dot ops (#1016) | Laurent Mazare | 2023-10-02 | 1 | -3/+8
* Simd128 version of the q2k-q8k vecdot product. (#1011) | Laurent Mazare | 2023-09-30 | 1 | -1/+1
* Move the test-utils bits to a shared place. (#619) | Laurent Mazare | 2023-08-27 | 1 | -2/+1
* Fix for q5_1 quantization. (#617) | Laurent Mazare | 2023-08-27 | 1 | -48/+27
* Quantization tests + fix some issues. (#616) | Laurent Mazare | 2023-08-27 | 1 | -0/+93
* Add reference implementation for `q4k` and `q5k` (#586) | Lukas Kreussel | 2023-08-26 | 1 | -1/+93
* Reference implementations of `q2k` and `q3k` vec-dot functions (#580) | Lukas Kreussel | 2023-08-24 | 1 | -0/+54
* Cosmetic tweaks. (#570) | Laurent Mazare | 2023-08-23 | 1 | -29/+24
* Mirror GGML's unit tests (#569) | Lukas Kreussel | 2023-08-23 | 1 | -16/+124
* Add quantization support for `q2k`, `q3k`, `q4k` and `q5k` (#524) | Lukas Kreussel | 2023-08-22 | 1 | -21/+174
* Tensor -> QTensor conversion (#496) | Laurent Mazare | 2023-08-18 | 1 | -2/+45
* Q6K quantization (#495) | Laurent Mazare | 2023-08-17 | 1 | -0/+26
* AVX version of the vecdot for q4_0. (#474) | Laurent Mazare | 2023-08-17 | 1 | -10/+10
* Add vecdot for q6k-q8k. (#476) | Laurent Mazare | 2023-08-16 | 1 | -0/+22
* Add a quantized test that uses negative values. (#470) | Laurent Mazare | 2023-08-16 | 1 | -0/+50
* Get the ggml based llama to generate some text. (#464) | Laurent Mazare | 2023-08-16 | 1 | -3/+32
* Add a test for qmatmul. (#459) | Laurent Mazare | 2023-08-16 | 1 | -0/+13
* Split out the quantized file. (#456) | Laurent Mazare | 2023-08-15 | 1 | -0/+33