forks/candle.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	Module Docs (#2624)	zachcp	2024-11-18	1	-3/+4
	* update whisper * update llama2c * update t5 * update phi and t5 * add a blip model * qlamma doc * add two new docs * add docs and emoji * additional models * openclip * pixtral * edits on the model docs * update yu * update a fe wmore models * add persimmon * add model-level doc * names * update module doc * links in heira * remove empty URL * update more hyperlinks * updated hyperlinks * more links * Update mod.rs --------- Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
*	More Model Module Docs (#2623)	zachcp	2024-11-17	1	-0/+43
	* dinov2 * add another example * ad dinov2reg4 * eva2 * efficientvit * moondream * update t5 * update t5 * rwkv * stable diffusion docs * add wasm link * add segment_anything * adjsut for clippy * ignore bertdoc * dinov2 ignore * update block to be text * remove the rust blocks for the moment * bump python to 3.11 * add a setup-python step * add py311 to test as well
*	Documentation Pass for Models (#2617)	zachcp	2024-11-15	1	-2/+16
	* links in chinese_clip * links for clip model * add mod docs for flux and llava * module doc for MMDIT and MIMI * add docs for a few more modesl * mod docs for bert naser and beit * add module docs for convmixer colpali codegeex and chatglm * add another series of moddocs * add fastvit-llama2_c * module docs mamba -> mobileone * module docs from moondream-phi3 * mod docs for quantized and qwen * update to yi * fix long names * Update llama2_c.rs * Update llama2_c_weights.rs * Fix the link for mimi + tweaks --------- Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
*	Lazy upcasting for t5. (#2589)	Laurent Mazare	2024-10-30	1	-3/+48
*	Clippy fixes for 1.81.0. (#2461)	Laurent Mazare	2024-09-05	1	-1/+1
	* Clippy fixes for 1.81.0. * Another fix.
*	Add support for Llama 3.1 (#2359)	Eric Buehler	2024-07-26	1	-1/+1
	* Add Llama 3.1 rope * Clippy * Format * Clippy * Add support for multiple eos tokens: * Untagged either * Remove either dep and fix settings.json * Make the max positional embeddings configurable
*	Avoid crashes when running T5 models with F16 tensors on CPU (#2047)	Victor-Mihaila	2024-04-13	1	-1/+1
	* This change avoids crashes when running T5 models with F16 tensors on CPU. * This enables running ProstT5's (https://huggingface.co/Rostlab/ProstT5) encoder-only mode in Candle. This ProstT5 mode stores it's embed_tokens weights within the encoder, as its decoding stage was replaced with a CNN. You could write more, like: This alone is not sufficient to run ProstT5 within Candle examples. We will develop a ProstT5 runner outside candle for now, but would be willing to upstream it to candle-examples at a later point. * Revert "This enables running ProstT5's (https://huggingface.co/Rostlab/ProstT5) encoder-only mode in Candle. This ProstT5 mode stores it's embed_tokens weights within the encoder, as its decoding stage was replaced with a CNN. You could write more, like: This alone is not sufficient to run ProstT5 within Candle examples. We will develop a ProstT5 runner outside candle for now, but would be willing to upstream it to candle-examples at a later point." This reverts commit d886d3ce5e3f1504934f4f6f7cf86108b7efd191.
*	Change for the encoder-only ProstT5 model (#2045)	Victor-Mihaila	2024-04-13	1	-1/+3
	* This change avoids crashes when running T5 models with F16 tensors on CPU. * This enables running ProstT5's (https://huggingface.co/Rostlab/ProstT5) encoder-only mode in Candle. This ProstT5 mode stores it's embed_tokens weights within the encoder, as its decoding stage was replaced with a CNN. This alone is not sufficient to run ProstT5 within Candle examples. We will develop a ProstT5 runner outside candle for now, but would be willing to upstream it to candle-examples at a later point.
*	Expose the t5 config fields + allow t5-large. (#1987)	Laurent Mazare	2024-04-01	1	-16/+16
*	Add min to buckets in relative_position_bucket (#1312)	Andy Braga	2023-11-10	1	-1/+1
*	Add support to UL2 model family (#1300)	Juarez Bochi	2023-11-09	1	-6/+43
	* Add support to UL2 model family * Update docs with UL2 * Create ActivationWithOptionalGating to avoid polluting activations * Also refactor quantized t5 * Remove useless conversion * Revert Activation::NewGelu name change * Remove useless return * Apply rustfmt and clippy recommendations * Reuse t5::ActivationWithOptionalGating in quantized version * (cosmetic change) use a match rather than ifs + avoid early returns. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>
*	Fix bug introduced in madlad PR (#1298)	Juarez Bochi	2023-11-08	1	-2/+2
*	Add support for MADLAD400 (#1285)	Juarez Bochi	2023-11-07	1	-2/+15
	* Add support for madlad * Add support for quantized MADLAD
*	Make more models cloneable. (#1203)	Laurent Mazare	2023-10-28	1	-11/+11
*	Use softmax-last-dim where possible. (#1057)	Laurent Mazare	2023-10-08	1	-1/+1
*	Do not use the kv-cache on external key-value states. (#1054)	Laurent Mazare	2023-10-07	1	-7/+7
*	Delete invalid comment (#1038)	Juarez Bochi	2023-10-05	1	-2/+0
*	Tracing for the phi model (#936)	Laurent Mazare	2023-09-23	1	-56/+15
	* Add some tracing bits to mixformers. * Add the missing file. * Add the conv2d layer to with-tracing. * Improve the tracing usage.
*	Add a quantized version of the t5 model. (#921)	Laurent Mazare	2023-09-21	1	-1/+1
*	Add a clear cache function to the t5 model. (#919)	Laurent Mazare	2023-09-21	1	-0/+30
*	Add more t5 tracing. (#915)	Laurent Mazare	2023-09-20	1	-4/+17
*	Add more t5 tracing. (#914)	Laurent Mazare	2023-09-20	1	-5/+35
	* Add more t5 tracing. * Rever the sm change.
*	Tracing mode for T5. (#913)	Laurent Mazare	2023-09-20	1	-16/+74
	* Tracing mode for T5. * Tracing for the linear layer.
*	Flan T5: Read lm_head when word embeddings are not tied (#903)	Juarez Bochi	2023-09-19	1	-7/+43
	* Read lm_head when word embeddings are not tied * Fix formatting * Address comments
*	Fix T5 kv cache (#899)	Juarez Bochi	2023-09-19	1	-1/+6
	* Fix T5 kv cache * Add argument for decoder prompt * Fix range
*	Avoid re-encoding the input in the T5 example. (#875)	Laurent Mazare	2023-09-17	1	-3/+15
*	Add a KV cache to T5. (#873)	Laurent Mazare	2023-09-17	1	-27/+58
	* Add a KV cache to T5. * Suggest using release mode. * Use the kv cache in decoding. * Add a comment.
*	Implement T5 decoding (#864)	Juarez Bochi	2023-09-15	1	-28/+153
	* Load t5 decoder * Run enc, dec, and lm head, but no cross attn * Cross-attention over key_value_states * New arg for decoder input ids * Add mask, don't forward position biases through decoder * Update t5 examples * Clippy + rustfmt
*	Add support to flan-t5 (#840)	Juarez Bochi	2023-09-13	1	-5/+49
*	Add some sentence similarity part to the t5 example. (#835)	Laurent Mazare	2023-09-13	1	-2/+5
	* Add some sentence similarity part to the t5 example. * Clippy fix.
*	T5 tweaks (#831)	Laurent Mazare	2023-09-13	1	-18/+33
	* Use default values rather than options. * Avoid exposing the device field. * More tweaks.
*	Clippy fix. (#830)	Laurent Mazare	2023-09-13	1	-2/+0
*	Extract T5 module and add main function to use it (#829)	Juarez Bochi	2023-09-13	1	-0/+441
	* Extract t5 out of musicgen * Add main for t5 module