Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Handle multiple dimensions in metal QMM + two fixes. (#2097) | Laurent Mazare | 2024-04-20 | 1 | -15/+20 |
* | Handle the batch dimension in quantized MMV on metal. (#2022) | Laurent Mazare | 2024-04-06 | 1 | -1/+4 |
* | Improve metal buffer usage (#1807) | ivarflakstad | 2024-03-07 | 1 | -2/+7 |
* | Cuda acceleration for quantized model. (#1754) | Laurent Mazare | 2024-02-25 | 1 | -35/+18 |
* | Qmetal tweaks (#1704) | Laurent Mazare | 2024-02-13 | 1 | -9/+86 |
* | Fixing quantized llama demo on metal. (#1703) | Nicolas Patry | 2024-02-13 | 1 | -0/+4 |
* | Quantized GGUF style (#1523) | Nicolas Patry | 2024-01-17 | 1 | -0/+153 |