Support embedding model gte-Qwen1.5-7B-instruct (#2190)


author	Yin Guobing <yinguobing@gmail.com>	2024-05-17 03:34:10 +0800
committer	GitHub <noreply@github.com>	2024-05-16 21:34:10 +0200
commit	349c3e806a15399df8289c41b2e24c3fa24b6d84 (patch)
tree	c0e0f625c115b3e97c04ab9281122d814ad027db /candle-onnx
parent	bdaa34216a2bb3527b6e248030f434561f9cf620 (diff)
download	candle-349c3e806a15399df8289c41b2e24c3fa24b6d84.tar.gz candle-349c3e806a15399df8289c41b2e24c3fa24b6d84.tar.bz2 candle-349c3e806a15399df8289c41b2e24c3fa24b6d84.zip

* Support embedding model gte-Qwen1.5-7B-instruct

  This is a text embedding model based on Qwen2. The two models share the same architecture except for the last MLP module. This commit makes minimal modifications to the existing Qwen2 implementation so that it supports both models. An example is provided and has been verified against the official PyTorch implementation.

* Avoid doing the 'last-token filtering' based on the absence of attention mask.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
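
For readers unfamiliar with the 'last-token filtering' named in the commit message, here is a minimal sketch of last-token pooling in plain Rust. It is not the code from this commit: the function names `last_token_pool` and `l2_normalize`, the nested-`Vec` layout, and the convention that the mask holds 1 for real tokens and 0 for padding are all assumptions made for illustration. The L2 normalization step is likewise an assumed, common post-processing choice for embeddings.

```rust
/// Hypothetical helper: return the hidden state of the last non-padded token.
/// `hidden` is laid out as [seq_len][hidden_size]; `mask` holds 1 for real
/// tokens and 0 for padding (an assumed convention, not taken from the commit).
fn last_token_pool(hidden: &[Vec<f32>], mask: &[u8]) -> Option<Vec<f32>> {
    if hidden.len() != mask.len() {
        return None;
    }
    // Search from the end for the last position marked as a real token,
    // so the result is the same for left- and right-padded sequences.
    let idx = mask.iter().rposition(|&m| m == 1)?;
    Some(hidden[idx].clone())
}

/// Hypothetical helper: scale a vector to unit L2 norm.
/// A zero vector is returned unchanged to avoid dividing by zero.
fn l2_normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}

fn main() {
    // Three positions with a hidden size of 2; the last position is padding.
    let hidden = vec![vec![1.0, 0.0], vec![3.0, 4.0], vec![9.0, 9.0]];
    let mask = [1u8, 1, 0];

    let pooled = last_token_pool(&hidden, &mask).expect("mask has a real token");
    let embedding = l2_normalize(&pooled);
    println!("{:?}", embedding);
}
```

Returning `Option` keeps the all-padding and length-mismatch cases explicit instead of silently falling back to the final position.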

Diffstat (limited to 'candle-onnx')

0 files changed, 0 insertions, 0 deletions

