path: root/candle-examples/src/token_output_stream.rs
Commit message | Author | Age | Files | Lines
* Add the StarCoder2 model. (#1779) | Laurent Mazare | 2024-02-28 | 1 | -1/+1
    * Add the StarCoder2 model.
    * Add the example code and get things to work.
    * And also tweak the readme.
* Fix token generation in bilingual models (non-English outputs) (#1668) | Guoqing Bao | 2024-02-06 | 1 | -1/+1
    Co-authored-by: Guoqing Bao <guoqing.bao@enflame-tech.com>
* Quantized version of mistral. (#1009) | Laurent Mazare | 2023-09-30 | 1 | -2/+14
    * Quantized version of mistral.
    * Integrate the quantized mistral variant.
    * Use the quantized weight files.
    * Tweak the quantization command.
    * Fix the dtype when computing the rotary embeddings.
    * Update the readme with the quantized version.
    * Fix the decoding of the remaining tokens.
* Streaming mode for reporting the generated tokens (#1007) | Laurent Mazare | 2023-09-30 | 1 | -0/+74
    * Token streaming.
    * Use the token output stream.
    * Flush the output.
    * Ensure that the last characters get reported.