path: root/candle-transformers/src/models/t5.rs
Commit message | Author | Date | Files | Lines (-/+)
* Module Docs (#2624) | zachcp | 2024-11-18 | 1 | -3/+4
    * update whisper
    * update llama2c
    * update t5
    * update phi and t5
    * add a blip model
    * qlamma doc
    * add two new docs
    * add docs and emoji
    * additional models
    * openclip
    * pixtral
    * edits on the model docs
    * update yu
    * update a few more models
    * add persimmon
    * add model-level doc
    * names
    * update module doc
    * links in hiera
    * remove empty URL
    * update more hyperlinks
    * updated hyperlinks
    * more links
    * Update mod.rs
    Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
* More Model Module Docs (#2623) | zachcp | 2024-11-17 | 1 | -0/+43
    * dinov2
    * add another example
    * add dinov2reg4
    * eva2
    * efficientvit
    * moondream
    * update t5
    * update t5
    * rwkv
    * stable diffusion docs
    * add wasm link
    * add segment_anything
    * adjust for clippy
    * ignore bertdoc
    * dinov2 ignore
    * update block to be text
    * remove the rust blocks for the moment
    * bump python to 3.11
    * add a setup-python step
    * add py311 to test as well
* Documentation Pass for Models (#2617) | zachcp | 2024-11-15 | 1 | -2/+16
    * links in chinese_clip
    * links for clip model
    * add mod docs for flux and llava
    * module doc for MMDIT and MIMI
    * add docs for a few more models
    * mod docs for bert naser and beit
    * add module docs for convmixer colpali codegeex and chatglm
    * add another series of moddocs
    * add fastvit-llama2_c
    * module docs mamba -> mobileone
    * module docs from moondream-phi3
    * mod docs for quantized and qwen
    * update to yi
    * fix long names
    * Update llama2_c.rs
    * Update llama2_c_weights.rs
    * Fix the link for mimi + tweaks
    Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
* Lazy upcasting for t5. (#2589) | Laurent Mazare | 2024-10-30 | 1 | -3/+48
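    The commit title names the technique without showing it: lazy upcasting here presumably means keeping the model in f16/bf16 and only converting to f32 around numerically fragile steps such as the attention softmax, instead of upcasting everything up front. A minimal sketch of that pattern with candle; the helper name softmax_in_f32 is mine, not the function added by this PR.

        use candle::{DType, Result, Tensor};

        /// Upcast half-precision inputs to f32 just for the softmax, then cast back.
        fn softmax_in_f32(xs: &Tensor) -> Result<Tensor> {
            match xs.dtype() {
                DType::F16 | DType::BF16 => {
                    let upcast = xs.to_dtype(DType::F32)?;
                    candle_nn::ops::softmax_last_dim(&upcast)?.to_dtype(xs.dtype())
                }
                _ => candle_nn::ops::softmax_last_dim(xs),
            }
        }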
* Clippy fixes for 1.81.0. (#2461) | Laurent Mazare | 2024-09-05 | 1 | -1/+1
    * Clippy fixes for 1.81.0.
    * Another fix.
* Add support for Llama 3.1 (#2359) | Eric Buehler | 2024-07-26 | 1 | -1/+1
    * Add Llama 3.1 rope
    * Clippy
    * Format
    * Clippy
    * Add support for multiple eos tokens
    * Untagged either
    * Remove either dep and fix settings.json
    * Make the max positional embeddings configurable
* Avoid crashes when running T5 models with F16 tensors on CPU (#2047) | Victor-Mihaila | 2024-04-13 | 1 | -1/+1
    * This change avoids crashes when running T5 models with F16 tensors on CPU.
    * This enables running ProstT5's (https://huggingface.co/Rostlab/ProstT5) encoder-only mode in Candle. This ProstT5 mode stores its embed_tokens weights within the encoder, as its decoding stage was replaced with a CNN. This alone is not sufficient to run ProstT5 within Candle examples. We will develop a ProstT5 runner outside candle for now, but would be willing to upstream it to candle-examples at a later point.
    * Revert the ProstT5 encoder-only change described above. This reverts commit d886d3ce5e3f1504934f4f6f7cf86108b7efd191.
* Change for the encoder-only ProstT5 model (#2045) | Victor-Mihaila | 2024-04-13 | 1 | -1/+3
    * This change avoids crashes when running T5 models with F16 tensors on CPU.
    * This enables running ProstT5's (https://huggingface.co/Rostlab/ProstT5) encoder-only mode in Candle. This ProstT5 mode stores its embed_tokens weights within the encoder, as its decoding stage was replaced with a CNN. This alone is not sufficient to run ProstT5 within Candle examples. We will develop a ProstT5 runner outside candle for now, but would be willing to upstream it to candle-examples at a later point.
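    The commit body explains that ProstT5 keeps its embed_tokens weights inside the encoder rather than in the usual shared embedding. A sketch of how such a fallback can be expressed with VarBuilder; the tensor names and the contains_tensor check are my assumptions, not necessarily the exact lines of #2045.

        use candle::Result;
        use candle_nn::{embedding, Embedding, VarBuilder};

        /// Prefer the usual `shared` token embedding, falling back to encoder-local
        /// weights for encoder-only checkpoints such as ProstT5 (assumed tensor names).
        fn load_embed_tokens(vb: VarBuilder, vocab_size: usize, d_model: usize) -> Result<Embedding> {
            if vb.contains_tensor("shared.weight") {
                embedding(vocab_size, d_model, vb.pp("shared"))
            } else {
                embedding(vocab_size, d_model, vb.pp("encoder").pp("embed_tokens"))
            }
        }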
* Expose the t5 config fields + allow t5-large. (#1987) | Laurent Mazare | 2024-04-01 | 1 | -16/+16
* Add min to buckets in relative_position_bucket (#1312) | Andy Braga | 2023-11-10 | 1 | -1/+1
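    T5's relative attention bias maps each relative position to a bucket index, and without a clamp the logarithmic branch can produce an index past the end of the bias table. A scalar sketch of the bidirectional bucketing formula with that clamp; it follows the reference T5 formula rather than copying the candle code verbatim.

        /// Map a relative position to a bucket index, clamping the logarithmic
        /// buckets so the index never exceeds num_buckets - 1.
        fn relative_position_bucket(relative_position: i64, num_buckets: i64, max_distance: i64) -> i64 {
            // Bidirectional case: half the buckets for positive offsets, half for the rest.
            let num_buckets = num_buckets / 2;
            let mut bucket = if relative_position > 0 { num_buckets } else { 0 };
            let n = relative_position.abs();
            let max_exact = num_buckets / 2;
            if n < max_exact {
                // Small distances each get their own exact bucket.
                bucket += n;
            } else {
                // Larger distances share logarithmically spaced buckets.
                let log_ratio = (n as f64 / max_exact as f64).ln()
                    / (max_distance as f64 / max_exact as f64).ln();
                let large = max_exact + (log_ratio * (num_buckets - max_exact) as f64) as i64;
                // The min() is the fix: keep the index inside the bias table.
                bucket += large.min(num_buckets - 1);
            }
            bucket
        }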
* Add support to UL2 model family (#1300) | Juarez Bochi | 2023-11-09 | 1 | -6/+43
    * Add support to UL2 model family
    * Update docs with UL2
    * Create ActivationWithOptionalGating to avoid polluting activations
    * Also refactor quantized t5
    * Remove useless conversion
    * Revert Activation::NewGelu name change
    * Remove useless return
    * Apply rustfmt and clippy recommendations
    * Reuse t5::ActivationWithOptionalGating in quantized version
    * (cosmetic change) use a match rather than ifs + avoid early returns.
    Co-authored-by: Laurent <laurent.mazare@gmail.com>
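    UL2 and Flan-style T5 variants use a gated feed-forward: two input projections, one passed through the activation, multiplied together before the output projection. That is the shape an ActivationWithOptionalGating refactor has to express. A minimal sketch of the gated branch, assuming plain candle_nn::Linear layers rather than the exact types used in t5.rs.

        use candle::{Module, Result, Tensor};
        use candle_nn::{linear_no_bias, Activation, Linear, VarBuilder};

        /// Gated feed-forward block: act(wi_0(x)) * wi_1(x), projected back with wo.
        struct DenseGatedActDense {
            wi_0: Linear,
            wi_1: Linear,
            wo: Linear,
            act: Activation,
        }

        impl DenseGatedActDense {
            fn load(d_model: usize, d_ff: usize, act: Activation, vb: VarBuilder) -> Result<Self> {
                Ok(Self {
                    wi_0: linear_no_bias(d_model, d_ff, vb.pp("wi_0"))?,
                    wi_1: linear_no_bias(d_model, d_ff, vb.pp("wi_1"))?,
                    wo: linear_no_bias(d_ff, d_model, vb.pp("wo"))?,
                    act,
                })
            }

            fn forward(&self, xs: &Tensor) -> Result<Tensor> {
                let gate = self.act.forward(&self.wi_0.forward(xs)?)?;
                let up = self.wi_1.forward(xs)?;
                self.wo.forward(&(gate * up)?)
            }
        }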
* Fix bug introduced in madlad PR (#1298) | Juarez Bochi | 2023-11-08 | 1 | -2/+2
* Add support for MADLAD400 (#1285) | Juarez Bochi | 2023-11-07 | 1 | -2/+15
    * Add support for madlad
    * Add support for quantized MADLAD
* Make more models cloneable. (#1203) | Laurent Mazare | 2023-10-28 | 1 | -11/+11
* Use softmax-last-dim where possible. (#1057) | Laurent Mazare | 2023-10-08 | 1 | -1/+1
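    candle_nn::ops::softmax_last_dim is a specialized implementation for the common case where the softmax axis is the last dimension, so it can be swapped in wherever that holds. A small sketch of the substitution; both functions should produce the same probabilities here.

        use candle::{Result, Tensor, D};

        // Before: generic softmax over an explicit dimension.
        fn probs_generic(scores: &Tensor) -> Result<Tensor> {
            candle_nn::ops::softmax(scores, D::Minus1)
        }

        // After: the specialized last-dimension variant, same result when the
        // softmax axis is already the last one.
        fn probs_last_dim(scores: &Tensor) -> Result<Tensor> {
            candle_nn::ops::softmax_last_dim(scores)
        }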
* Do not use the kv-cache on external key-value states. (#1054) | Laurent Mazare | 2023-10-07 | 1 | -7/+7
* Delete invalid comment (#1038) | Juarez Bochi | 2023-10-05 | 1 | -2/+0
* Tracing for the phi model (#936) | Laurent Mazare | 2023-09-23 | 1 | -56/+15
    * Add some tracing bits to mixformers.
    * Add the missing file.
    * Add the conv2d layer to with-tracing.
    * Improve the tracing usage.
* Add a quantized version of the t5 model. (#921) | Laurent Mazare | 2023-09-21 | 1 | -1/+1
* Add a clear cache function to the t5 model. (#919) | Laurent Mazare | 2023-09-21 | 1 | -0/+30
* Add more t5 tracing. (#915) | Laurent Mazare | 2023-09-20 | 1 | -4/+17
* Add more t5 tracing. (#914) | Laurent Mazare | 2023-09-20 | 1 | -5/+35
    * Add more t5 tracing.
    * Revert the sm change.
* Tracing mode for T5. (#913) | Laurent Mazare | 2023-09-20 | 1 | -16/+74
    * Tracing mode for T5.
    * Tracing for the linear layer.
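    Tracing mode wraps the hot modules in tracing spans so a profile shows time per layer rather than one opaque forward pass. The pattern looks roughly like the sketch below; this wrapper is illustrative and not the actual with_tracing::Linear from the repository.

        use candle::{Module, Result, Tensor};

        /// A Linear wrapper that records a tracing span around every forward call.
        struct TracedLinear {
            inner: candle_nn::Linear,
            span: tracing::Span,
        }

        impl TracedLinear {
            fn new(inner: candle_nn::Linear) -> Self {
                let span = tracing::span!(tracing::Level::TRACE, "linear");
                Self { inner, span }
            }
        }

        impl Module for TracedLinear {
            fn forward(&self, xs: &Tensor) -> Result<Tensor> {
                // The guard keeps the span active for the duration of the projection.
                let _enter = self.span.enter();
                self.inner.forward(xs)
            }
        }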
* Flan T5: Read lm_head when word embeddings are not tied (#903) | Juarez Bochi | 2023-09-19 | 1 | -7/+43
    * Read lm_head when word embeddings are not tied
    * Fix formatting
    * Address comments
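    Original T5 ties the output projection to the shared token embedding, while Flan-T5 checkpoints ship a separate lm_head.weight. A sketch of the conditional load, assuming a tie_word_embeddings config flag and plain candle_nn types rather than the exact code of #903.

        use candle::Result;
        use candle_nn::{linear_no_bias, Embedding, Linear, VarBuilder};

        /// Reuse the shared embedding matrix when the weights are tied,
        /// otherwise read a dedicated lm_head tensor.
        fn load_lm_head(
            vb: VarBuilder,
            shared: &Embedding,
            tie_word_embeddings: bool,
            d_model: usize,
            vocab_size: usize,
        ) -> Result<Linear> {
            if tie_word_embeddings {
                // The embedding matrix is (vocab, d_model), which is exactly the
                // (out, in) layout Linear expects for the output projection.
                Ok(Linear::new(shared.embeddings().clone(), None))
            } else {
                linear_no_bias(d_model, vocab_size, vb.pp("lm_head"))
            }
        }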
* Fix T5 kv cache (#899) | Juarez Bochi | 2023-09-19 | 1 | -1/+6
    * Fix T5 kv cache
    * Add argument for decoder prompt
    * Fix range
* Avoid re-encoding the input in the T5 example. (#875) | Laurent Mazare | 2023-09-17 | 1 | -3/+15
* Add a KV cache to T5. (#873) | Laurent Mazare | 2023-09-17 | 1 | -27/+58
    * Add a KV cache to T5.
    * Suggest using release mode.
    * Use the kv cache in decoding.
    * Add a comment.
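    A KV cache stores the keys and values from earlier decoding steps so each new token only has to project itself before attending over the accumulated state. A sketch of the pattern with names of my own choosing; it also notes the reset entry point in the spirit of the clear-cache commit (#919 above) and why cross-attention skips the cache (#1054 above).

        use candle::{Result, Tensor};

        /// Per-layer key/value cache for incremental decoding (illustrative names).
        #[derive(Default)]
        struct KvCache {
            kv: Option<(Tensor, Tensor)>,
        }

        impl KvCache {
            /// Append this step's keys/values along the sequence axis (dim 2 for
            /// [batch, heads, seq, head_dim] tensors) and return the accumulated k/v.
            /// Cross-attention keys/values come from the encoder and never grow,
            /// so callers bypass this method for external key-value states.
            fn append(&mut self, k: &Tensor, v: &Tensor) -> Result<(Tensor, Tensor)> {
                let (k, v) = match self.kv.as_ref() {
                    Some((prev_k, prev_v)) => (
                        Tensor::cat(&[prev_k, k], 2)?,
                        Tensor::cat(&[prev_v, v], 2)?,
                    ),
                    None => (k.clone(), v.clone()),
                };
                self.kv = Some((k.clone(), v.clone()));
                Ok((k, v))
            }

            /// Reset between prompts.
            fn clear(&mut self) {
                self.kv = None;
            }
        }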
* Implement T5 decoding (#864) | Juarez Bochi | 2023-09-15 | 1 | -28/+153
    * Load t5 decoder
    * Run enc, dec, and lm head, but no cross attn
    * Cross-attention over key_value_states
    * New arg for decoder input ids
    * Add mask, don't forward position biases through decoder
    * Update t5 examples
    * Clippy + rustfmt
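    With a decoder and lm head in place, T5 can be driven seq2seq style: encode the input once, then repeatedly feed the growing decoder prefix back in and take the argmax of the last position. A greedy-decoding sketch; the decode_step closure stands in for whatever the model exposes, and the start/eos handling is an assumption rather than a description of the exact example code.

        use candle::{Device, Result, Tensor, D};

        /// Greedy seq2seq decoding. `decode_step` is assumed to map the full decoder
        /// prefix (shape [1, len]) to the logits of its last position (shape [vocab]).
        fn greedy_decode(
            mut decode_step: impl FnMut(&Tensor) -> Result<Tensor>,
            decoder_start_token_id: u32, // usually the pad token for T5
            eos_token_id: u32,
            max_len: usize,
            device: &Device,
        ) -> Result<Vec<u32>> {
            let mut tokens = vec![decoder_start_token_id];
            for _ in 0..max_len {
                let decoder_ids = Tensor::new(tokens.as_slice(), device)?.unsqueeze(0)?;
                let last_logits = decode_step(&decoder_ids)?;
                let next = last_logits.argmax(D::Minus1)?.to_scalar::<u32>()?;
                tokens.push(next);
                if next == eos_token_id {
                    break;
                }
            }
            Ok(tokens)
        }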
* Add support to flan-t5 (#840) | Juarez Bochi | 2023-09-13 | 1 | -5/+49
* Add some sentence similarity part to the t5 example. (#835) | Laurent Mazare | 2023-09-13 | 1 | -2/+5
    * Add some sentence similarity part to the t5 example.
    * Clippy fix.
* T5 tweaks (#831) | Laurent Mazare | 2023-09-13 | 1 | -18/+33
    * Use default values rather than options.
    * Avoid exposing the device field.
    * More tweaks.
* Clippy fix. (#830) | Laurent Mazare | 2023-09-13 | 1 | -2/+0
* Extract T5 module and add main function to use it (#829) | Juarez Bochi | 2023-09-13 | 1 | -0/+441
    * Extract t5 out of musicgen
    * Add main for t5 module