| Commit message | Author | Age | Files | Lines |
|---|---|---|---|---|
| Add a Context trait similar to anyhow::Context. (#2676) | Laurent Mazare | 2024-12-22 | 1 | -2/+2 |
| 20241118 docs (#2629) | zachcp | 2024-11-19 | 1 | -0/+1 |
| Add a forward_via_f16 method to the qmatmul op. (#2138) | Laurent Mazare | 2024-04-28 | 1 | -0/+19 |
| Add the cuda dequantize f16 kernels. (#2137) | Laurent Mazare | 2024-04-28 | 1 | -4/+43 |
| Fix dequantization. (#1823) | Laurent Mazare | 2024-03-08 | 1 | -1/+1 |
| Cuda acceleration for quantized model. (#1754) | Laurent Mazare | 2024-02-25 | 1 | -4/+32 |
| Qmetal tweaks (#1704) | Laurent Mazare | 2024-02-13 | 1 | -91/+12 |
| Fixing quantized llama demo on metal. (#1703) | Nicolas Patry | 2024-02-13 | 1 | -0/+12 |
| Quantized GGUF style (#1523) | Nicolas Patry | 2024-01-17 | 1 | -48/+254 |
| Implement the module trait directly for QMatMul. (#1372) | Laurent Mazare | 2023-11-25 | 1 | -2/+2 |
| Better control on the optional dequantization in QMatMul (#1049) | Laurent Mazare | 2023-10-07 | 1 | -7/+28 |
| Improve the quantized whisper setup. (#1018) | Laurent Mazare | 2023-10-02 | 1 | -10/+19 |
| simd128 optimized q8_0 vecdot (#972) | Laurent Mazare | 2023-09-27 | 1 | -0/+2 |
| Add a quantized version of the t5 model. (#921) | Laurent Mazare | 2023-09-21 | 1 | -1/+1 |
| Support for quantized tensors in the python api. (#706) | Laurent Mazare | 2023-09-01 | 1 | -3/+11 |
| Llama quantization. (#625) | Laurent Mazare | 2023-08-27 | 1 | -0/+4 |
| Add a function to write gguf files. (#585) | Laurent Mazare | 2023-08-24 | 1 | -1/+38 |
| Preliminary GGUF support. (#557) | Laurent Mazare | 2023-08-23 | 1 | -0/+1 |
| Add quantization support for `q2k`, `q3k`, `q4k` and `q5k` (#524) | Lukas Kreussel | 2023-08-22 | 1 | -0/+1 |
| Neon support for quantization. (#519) | Laurent Mazare | 2023-08-19 | 1 | -0/+2 |
| Add a simple Module trait and implement it for the various nn layers (#500) | Laurent Mazare | 2023-08-18 | 1 | -0/+1 |
| Tensor -> QTensor conversion (#496) | Laurent Mazare | 2023-08-18 | 1 | -3/+40 |
| Relax the requirements on CustomOp. (#486) | Laurent Mazare | 2023-08-17 | 1 | -3/+3 |
| Move the avx specific bits to a separate file. (#481) | Laurent Mazare | 2023-08-17 | 1 | -0/+2 |
| Get the ggml based llama to generate some text. (#464) | Laurent Mazare | 2023-08-16 | 1 | -13/+18 |
| Add quantized tensors. (#458) | Laurent Mazare | 2023-08-15 | 1 | -1/+113 |
| Split out the quantized file. (#456) | Laurent Mazare | 2023-08-15 | 1 | -0/+82 |