Commit message

* Support for the new Qwen2 models.
* Add more models.

* Support embedding model gte-Qwen1.5-7B-instruct
This is a text embedding model based on Qwen2. They share the same
model architecture except for the last MLP module. This commit brings in
a minimal modification of the old Qwen2 implementation to support both
models.
An example is provided and has been verified against the official
PyTorch implementation.
* Avoid doing the 'last-token filtering' based on the absence of an attention mask.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
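The 'last-token filtering' mentioned in the commit above is last-token pooling: the sentence embedding is read from the hidden state of the last real (non-padded) token, which requires an attention mask to locate; when no mask is passed, the filtering is skipped and the final position is used directly. Below is a minimal sketch of that idea in plain Rust; it is not candle's implementation, and the function name and data layout are illustrative only.

```rust
/// Minimal sketch of last-token pooling for an embedding model.
/// `hidden`: per-token hidden states, one Vec<f32> per sequence position.
/// `attention_mask`: Some(mask) with 1 for real tokens and 0 for padding,
/// or None when every position is a real token.
fn last_token_embedding(hidden: &[Vec<f32>], attention_mask: Option<&[u8]>) -> Vec<f32> {
    let last = match attention_mask {
        // With a mask, pick the last position that is actually attended to.
        Some(mask) => mask
            .iter()
            .rposition(|&m| m == 1)
            .unwrap_or(hidden.len() - 1),
        // Without a mask, skip the filtering and just take the final position.
        None => hidden.len() - 1,
    };
    hidden[last].clone()
}

fn main() {
    // Two real tokens followed by two padding positions.
    let hidden = vec![
        vec![0.1, 0.2],
        vec![0.3, 0.4],
        vec![0.0, 0.0],
        vec![0.0, 0.0],
    ];
    let mask = [1u8, 1, 0, 0];
    println!("{:?}", last_token_embedding(&hidden, Some(&mask))); // [0.3, 0.4]
    println!("{:?}", last_token_embedding(&hidden, None)); // [0.0, 0.0]
}
```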
* Qwen MoE model.
* Add the MoE model to the example.
* Fix the scaling.
* Readme updates.
* Readme tweaks.

Using the chatglm one causes a bug where the "<|endoftext|>" is not
found.
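For context on the entry above (presumably about which tokenizer the example loads): Qwen2 uses "<|endoftext|>" as its end-of-sequence token, and if the loaded tokenizer does not contain that token the lookup yields nothing and generation never stops. Below is a small sketch using the Hugging Face tokenizers crate; the file path is a placeholder, and whether the example resolves the token exactly this way is an assumption.

```rust
use tokenizers::Tokenizer;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // The path is a placeholder; a real example would load the tokenizer.json
    // that ships with the Qwen2 checkpoint.
    let tokenizer = Tokenizer::from_file("tokenizer.json").map_err(|e| e.to_string())?;

    // Qwen2 uses "<|endoftext|>" as its end-of-sequence token. A tokenizer
    // taken from a different model may not contain it, in which case the
    // lookup returns None and generation would never terminate.
    let eos_token_id = tokenizer
        .token_to_id("<|endoftext|>")
        .ok_or("the tokenizer has no <|endoftext|> token")?;
    println!("eos token id: {eos_token_id}");
    Ok(())
}
```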
* Initial check-in for the qwen2 model.
* More qwen2 inference.
* Polish the qwen example.
* Fix the rope basis.
* Get the inference to work.
* Support different model sizes.
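On 'Fix the rope basis' in the list above: rotary position embeddings rotate each pair of head dimensions at a frequency derived from a configurable base (rope_theta), so the basis has to be computed from the model configuration rather than hard-coded. Below is a minimal sketch of the standard computation, not candle's code; the head_dim and rope_theta values are illustrative only.

```rust
/// Minimal sketch of the rotary embedding basis.
/// Frequency pair i rotates at theta^(-2i / head_dim); the angle applied at
/// position p on that pair is p * inv_freq[i].
fn rope_inv_freq(head_dim: usize, rope_theta: f64) -> Vec<f64> {
    (0..head_dim / 2)
        .map(|i| 1.0 / rope_theta.powf(2.0 * i as f64 / head_dim as f64))
        .collect()
}

fn main() {
    // head_dim and rope_theta would normally come from the model config;
    // these values are only illustrative.
    let inv_freq = rope_inv_freq(128, 1_000_000.0);
    println!("first frequencies: {:?}", &inv_freq[..4]);
}
```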