author | Yin Guobing <yinguobing@gmail.com> | 2024-05-17 03:34:10 +0800 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-05-16 21:34:10 +0200 |
commit | 349c3e806a15399df8289c41b2e24c3fa24b6d84 (patch) | |
tree | c0e0f625c115b3e97c04ab9281122d814ad027db /candle-onnx | |
parent | bdaa34216a2bb3527b6e248030f434561f9cf620 (diff) | |
Support embedding model gte-Qwen1.5-7B-instruct (#2190)
* Support embedding model gte-Qwen1.5-7B-instruct
This is a text embedding model based on Qwen2. The two share the same
model architecture except for the last MLP module. This commit makes
minimal modifications to the existing Qwen2 implementation to support
both models.
An example is provided and has been verified against the official
PyTorch implementation.
* Avoid doing the 'last-token filtering' based on the absence of an attention mask.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
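
For readers unfamiliar with the term, the following is a minimal sketch of last-token selection, one common way a decoder-style model is reduced to a single vector per sequence. It is not the code from this commit and does not use candle's API: the function names (`last_token_index`, `last_token_pool`, `l2_normalize`), the plain `Vec<f32>` representation, and the right-padding convention are all assumptions made for illustration.

```rust
/// Index of the last real token, assuming right padding
/// (mask entries are 1 for real tokens and 0 for padding).
/// Returns None when the mask contains no real token.
fn last_token_index(attention_mask: &[u8]) -> Option<usize> {
    attention_mask.iter().rposition(|&m| m != 0)
}

/// Selects the hidden state of the last real token as the sequence vector.
/// `hidden_states` is laid out as [seq_len][hidden_size] (hypothetical layout).
fn last_token_pool<'a>(
    hidden_states: &'a [Vec<f32>],
    attention_mask: &[u8],
) -> Option<&'a [f32]> {
    let idx = last_token_index(attention_mask)?;
    hidden_states.get(idx).map(|row| row.as_slice())
}

/// Scales a vector to unit L2 norm; a zero vector is returned unchanged.
fn l2_normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}
```

In this sketch the selection is driven by an explicit mask rather than by whether a mask happens to be present, so a caller that passes no usable mask gets `None` instead of a silently chosen position.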
Diffstat (limited to 'candle-onnx')
0 files changed, 0 insertions, 0 deletions