| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | Quantized moondream implementation and BOS token (#1980) | Santiago Medina | 2024-04-01 | 1 | -0/+24 |
* | Quantized GGUF style (#1523) | Nicolas Patry | 2024-01-17 | 1 | -2/+2 |
* | Quantized version for phi-v2. (#1430) | Laurent Mazare | 2023-12-13 | 1 | -0/+18 |
* | More model cloning. (#1126) | Laurent Mazare | 2023-10-18 | 1 | -7/+7 |
* | Move the common quantized-nn code to a shared module. (#1063) | Laurent Mazare | 2023-10-09 | 1 | -34/+3 |
* | Use softmax-last-dim where possible. (#1057) | Laurent Mazare | 2023-10-08 | 1 | -1/+1 |
* | Expose a function to clear the KV cache on mixformers. (#964) | Laurent Mazare | 2023-09-26 | 1 | -0/+12 |
* | Add the quantized mixformer model. (#953) | Laurent Mazare | 2023-09-24 | 1 | -0/+344 |