path: root/candle-examples/examples/qwen
Commit message | Author | Date | Files | Lines
* Support for the new Qwen2 models. (#2257) | Laurent Mazare | 2024-06-07 | 1 | -10/+26
    * Support for the new Qwen2 models.
    * Add more models.
* Support embedding model gte-Qwen1.5-7B-instruct (#2190) | Yin Guobing | 2024-05-16 | 1 | -1/+1
    * Support embedding model gte-Qwen1.5-7B-instruct

      This is a text embedding model based on Qwen2. The two models share the same architecture except for the last MLP module. This commit brings in a minimal modification of the old Qwen2 implementation to support both models. An example is provided and has been verified against the official PyTorch implementation.

    * Avoid doing the 'last-token filtering' based on the absence of an attention mask.

    Co-authored-by: Laurent <laurent.mazare@gmail.com>
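The "last-token filtering" mentioned above refers to last-token pooling, the usual way a decoder-only model such as gte-Qwen1.5-7B-instruct produces a sentence embedding: the hidden state at the final position of the sequence is taken as the embedding vector. A minimal sketch with candle's tensor API follows; the shapes and the L2-normalization step are illustrative assumptions, not the example's actual code.

    use candle_core::{Device, IndexOp, Result, Tensor};

    fn main() -> Result<()> {
        let device = Device::Cpu;
        // Stand-in for the decoder output: (batch = 1, seq_len = 4, hidden = 8).
        let hidden = Tensor::randn(0f32, 1f32, (1, 4, 8), &device)?;
        let (_batch, seq_len, _hidden) = hidden.dims3()?;
        // Last-token pooling: the hidden state at the final position becomes the embedding.
        let emb = hidden.i((.., seq_len - 1, ..))?;
        // L2-normalize, as is common for text embeddings (assumption for illustration).
        let norm = emb.sqr()?.sum_keepdim(1)?.sqrt()?;
        let emb = emb.broadcast_div(&norm)?;
        println!("embedding shape: {:?}", emb.dims());
        Ok(())
    }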
* Readme fix. (#1961) | Laurent Mazare | 2024-03-28 | 1 | -1/+1
* Qwen MoE model. (#1960) | Laurent Mazare | 2024-03-28 | 2 | -4/+61
    * Qwen MoE model.
    * Add the MoE model to the example.
    * Fix the scaling.
    * Readme updates.
    * Readme tweaks.
* Fixing the qwen tokenizer location. (#1693) | Nicolas Patry | 2024-02-11 | 1 | -3/+1
    Using the chatglm one causes a bug where the "<|endoftext|>" token is not found.
* ChatGLM custom tokenizer. (#1687) | Laurent Mazare | 2024-02-10 | 1 | -1/+3
* Use the proper endoftext token for qwen. (#1685) | Laurent Mazare | 2024-02-09 | 1 | -2/+2
* Add the Qwen2 model (#1684) | Laurent Mazare | 2024-02-09 | 1 | -0/+281
    * Initial check-in for the qwen2 model.
    * More qwen2 inference.
    * Polish the qwen example.
    * Fix the rope basis.
    * Get the inference to work.
    * Support different model sizes.
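"Fix the rope basis" presumably refers to the frequency base (rope_theta) used by the rotary position embeddings: in the standard RoPE formulation, each pair of head dimensions rotates at an inverse frequency derived from that base. A small sketch of that derivation; the function name and the placeholder base value are illustrative assumptions, not code from this example.

    // Per-pair inverse frequencies for rotary position embeddings.
    // `theta` is the frequency base (rope_theta); the model's config supplies the
    // real value, and 10_000.0 below is only a placeholder.
    fn rope_inv_freq(head_dim: usize, theta: f32) -> Vec<f32> {
        (0..head_dim)
            .step_by(2)
            .map(|i| 1.0 / theta.powf(i as f32 / head_dim as f32))
            .collect()
    }

    fn main() {
        let inv_freq = rope_inv_freq(64, 10_000.0);
        println!("first inverse frequencies: {:?}", &inv_freq[..4]);
    }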