path: root/candle-transformers/src/models/quantized_mistral.rs
Commit message | Author | Age | Files | Lines
* Use cat for faster MQA computation. (#2043) | Laurent Mazare | 2024-04-12 | 1 | -14/+2
* Use the new rope kernel in mistral. (#1937) | Laurent Mazare | 2024-03-25 | 1 | -14/+6
* Support more mistral models. (#1927) | Laurent Mazare | 2024-03-24 | 1 | -4/+5
* Avoid broadcasting on the batch dimension for the attention mask. (#1920) | Laurent Mazare | 2024-03-23 | 1 | -4/+3
* Use the fast RmsNorm in the quantized model. (#1904) | Laurent Mazare | 2024-03-21 | 1 | -0/+1
* feat: add clear_kv_cache to mistral and qmistral models (#1464) | drbh | 2023-12-21 | 1 | -0/+14
* More model cloning. (#1126) | Laurent Mazare | 2023-10-18 | 1 | -5/+5
* Move the common quantized-nn code to a shared module. (#1063) | Laurent Mazare | 2023-10-09 | 1 | -40/+1
* Quantized version of mistral. (#1009) | Laurent Mazare | 2023-09-30 | 1 | -0/+364