Commit log
 
* Correct the optional SE layer dimensions.
* The value 32 is head_dim, not num_heads (sketched below).
* Update the test example output.
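A minimal sketch of the dimension bookkeeping behind the head_dim fix; the Cfg type and its field values are illustrative, not the model's actual config:

```rust
// Illustrative config: the head count is derived from hidden_size / head_dim,
// with head_dim fixed at 32; num_heads itself is not 32.
struct Cfg {
    hidden_size: usize,
    head_dim: usize,
}

impl Cfg {
    fn num_heads(&self) -> usize {
        self.hidden_size / self.head_dim
    }
}

fn main() {
    let cfg = Cfg { hidden_size: 2048, head_dim: 32 };
    assert_eq!(cfg.num_heads(), 64);
}
```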
 
* Add a readme for the parler-tts example.
* Remove the Python decode script.
* MP4 tweaks.
* Another readme tweak.
 
* Add gemma-2.
* Support a couple more models.
* Sliding window support (see the sketch below).
* Example + readme updates.
* Update the main readme.
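Sliding-window attention limits each query position to the most recent window keys instead of the full causal prefix. A minimal sketch of the mask construction with illustrative names; the real model applies this per layer on tensors:

```rust
// Build a causal mask where position i may attend only to positions j with
// j <= i and i - j < window; blocked entries get -inf before the softmax.
fn sliding_window_mask(seq_len: usize, window: usize) -> Vec<Vec<f32>> {
    (0..seq_len)
        .map(|i| {
            (0..seq_len)
                .map(|j| if j <= i && i - j < window { 0.0 } else { f32::NEG_INFINITY })
                .collect()
        })
        .collect()
}

fn main() {
    // seq_len 4, window 2: each row allows at most the current and previous token.
    for row in sliding_window_mask(4, 2) {
        println!("{row:?}");
    }
}
```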
 
* Add model support and an example for THUDM/glm-4.
* Fix the CI report.
* fmt
* Fix.
* Update README.org.
* Update README.org.
* fmt
* Update README.org.
* Add codegeex4 to README.md.
* Add glm4 to README.md.
* Typo.
* Change expect calls into the ? operator (see the sketch below).
---------
Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
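The expect-into-? item is the usual Rust error-handling cleanup: replace panicking expect calls with the ? operator so errors propagate to the caller. A generic illustration, not the actual glm-4 code:

```rust
use std::fs;

// Before: let config = fs::read_to_string("config.json").expect("failed to read config");
// After: propagate the error instead of panicking.
fn read_config() -> std::io::Result<String> {
    let config = fs::read_to_string("config.json")?;
    Ok(config)
}

fn main() {
    match read_config() {
        Ok(cfg) => println!("{} bytes of config", cfg.len()),
        Err(e) => eprintln!("error: {e}"),
    }
}
```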
 
* Support different resolutions in load_image() (sketched below).
* Add the MobileNetV4 model.
* Add MobileNetV4 to the README.
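A hypothetical sketch of what a resolution-parameterized load_image() can look like, using the image crate; the signature and preprocessing here are assumptions for illustration, not candle's exact implementation:

```rust
use image::imageops::FilterType;

// Load an image, resize it to res x res, and return RGB pixels scaled to [0, 1].
fn load_image(path: &str, res: u32) -> image::ImageResult<Vec<f32>> {
    let img = image::open(path)?
        .resize_to_fill(res, res, FilterType::Triangle)
        .to_rgb8();
    Ok(img.into_raw().iter().map(|&p| p as f32 / 255.0).collect())
}

fn main() -> image::ImageResult<()> {
    // MobileNetV4 variants take different input resolutions, hence the parameter.
    let pixels = load_image("input.jpg", 384)?;
    println!("{} floats", pixels.len());
    Ok(())
}
```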
 
Signed-off-by: hardlydearly <799511800@qq.com>
 
Co-authored-by: Jane Doe <jane.doe@example.org>
 
* Add a quantized version of recurrent-gemma.
* Share the rglru part.
* Get the quantized gemma model to work.
 
* Add a link to the Coursera ML algorithm implementations.
* Fix the link.
 
* Moondream implementation.
* Add the moondream example.
* Change the config default activation.
* Add assets and integrate the phi mixformer with the example.
* Make use of the kv cache (sketched below) and fix a seq_len bug; clean up the example code.
* Add a README link to the example.
* Remove pos_embed scaling; remove assets; add to the README; expand VisionConfig.
* Delete the image.
* Use apply instead of forward.
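The kv-cache item refers to the standard decoding optimization: keys and values from earlier positions are cached so each new token only computes its own projections, and attention runs against the accumulated cache. A minimal sketch with illustrative types:

```rust
// Per-layer cache of past key/value vectors, grown one step at a time.
#[derive(Default)]
struct KvCache {
    keys: Vec<Vec<f32>>,
    values: Vec<Vec<f32>>,
}

impl KvCache {
    // Append this step's key/value and return the full history to attend over.
    fn append(&mut self, k: Vec<f32>, v: Vec<f32>) -> (&[Vec<f32>], &[Vec<f32>]) {
        self.keys.push(k);
        self.values.push(v);
        (&self.keys, &self.values)
    }
}

fn main() {
    let mut cache = KvCache::default();
    for step in 0..3 {
        let (ks, _vs) = cache.append(vec![step as f32], vec![step as f32]);
        println!("step {step}: attending over {} positions", ks.len());
    }
}
```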
 
* Qwen MoE model.
* Add the MoE model to the example.
* Fix the scaling (see the routing sketch below).
* Readme updates.
* Readme tweaks.
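On the scaling point: MoE routing typically takes a softmax over the router logits, keeps the top-k experts, and renormalizes the kept weights so they sum to 1; a scaling bug usually sits in that renormalization. A generic sketch, not the actual Qwen code:

```rust
// Return the indices and renormalized weights of the top-k experts.
fn route(router_logits: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut idx: Vec<usize> = (0..router_logits.len()).collect();
    idx.sort_by(|&a, &b| router_logits[b].total_cmp(&router_logits[a]));
    let top = &idx[..k];
    // Softmax over just the selected logits (max-subtracted for stability);
    // this equals softmax over all logits followed by top-k renormalization.
    let max = router_logits[top[0]];
    let exps: Vec<f32> = top.iter().map(|&i| (router_logits[i] - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    top.iter().zip(&exps).map(|(&i, e)| (i, e / sum)).collect()
}

fn main() {
    // Four experts, route to the best two.
    println!("{:?}", route(&[0.1, 2.0, -1.0, 1.5], 2));
}
```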
 
Add candle-einops to the readme as an external resource.
 
* Add the EfficientVit (Microsoft Research Asia) model.
* Mention the models in the README.
 
* Add the StarCoder2 model.
* Add the example code and get things to work.
* And also tweak the readme.
 
* Add a flag to force running the quantized model on CPUs (sketched below).
* Add encodec to the readme.
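The CPU-forcing flag presumably follows the clap pattern the candle examples use (clap with its derive feature); the exact flag name here is an assumption:

```rust
use clap::Parser;

#[derive(Parser)]
struct Args {
    /// Run on CPU even when a GPU backend (CUDA/Metal) is compiled in.
    #[arg(long)]
    cpu: bool,
}

fn main() {
    let args = Args::parse();
    // A real example would pick candle's Device based on this flag.
    if args.cpu {
        println!("forcing CPU execution");
    }
}
```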
 
* Add the Gemma models.
* Add the gemma example.
* Adapt the RmsNorm (see the sketch below).
* Get the 2b model to work.
* 7b support.
* Use the config head dim.
* Yet another fix.
* Make the matrices contiguous.
* Also get the 7b model to work.
* And add to the readme.
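On adapting the RmsNorm: Gemma's variant scales the normalized activations by (1 + weight) rather than weight, which is presumably what this adaptation covers. A scalar sketch under that assumption:

```rust
// RMS-normalize xs, then scale by (1 + w) per channel, Gemma style.
fn rms_norm(xs: &[f32], weight: &[f32], eps: f32) -> Vec<f32> {
    let mean_sq = xs.iter().map(|x| x * x).sum::<f32>() / xs.len() as f32;
    let inv = 1.0 / (mean_sq + eps).sqrt();
    xs.iter()
        .zip(weight)
        .map(|(x, w)| x * inv * (1.0 + w))
        .collect()
}

fn main() {
    // With zero weights the scale is exactly 1, i.e. plain RMS normalization.
    println!("{:?}", rms_norm(&[1.0, -2.0, 3.0], &[0.0, 0.0, 0.0], 1e-6));
}
```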
 
* Start adding the RWKV model.
* More of the forward step.
* Handle rescaling (see the sketch below).
* FeedForward.
* More work on RWKV.
* Better state tracking.
* Finish a first pass on forward.
* Fix the shape mismatches.
* Do not rescale in f32.
* Rename to rwkv-v5.
* Add the new models to the readme.
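The rescaling items refer to a common RWKV fp16 trick: certain output weights are divided by a power of two that grows with layer depth so activations stay within fp16 range, and the division is skipped when running in f32. A sketch with illustrative constants:

```rust
// Factor applied to a layer's output weights at load time; identity in f32.
fn rescale_factor(layer: usize, rescale_every: usize, is_f32: bool) -> f32 {
    if is_f32 {
        1.0 // "Do not rescale in f32."
    } else {
        1.0 / 2f32.powi((layer / rescale_every) as i32)
    }
}

fn main() {
    // With rescale_every = 6: layers 0-5 -> 1.0, 6-11 -> 0.5, 12-17 -> 0.25, ...
    for layer in [0, 6, 12] {
        println!("layer {layer}: {}", rescale_factor(layer, 6, false));
    }
}
```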
 
* Sketch the mamba model for inference.
* Complete the forward pass.
* Add the mamba example.
* Optimize the selective-scan part (see the sketch below).
* Fix a couple of shape mismatches and get inference to work.
* Tweak the readmes.
* More readme tweaks.
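The selective scan at the core of mamba is a sequential state recurrence; the optimization work is about computing it efficiently. A minimal scalar sketch of the recurrence itself (real implementations vectorize over channels and state dimensions):

```rust
// h[t] = a[t] * h[t-1] + b[t] * x[t];  y[t] = c[t] * h[t]
fn selective_scan(a: &[f32], b: &[f32], c: &[f32], x: &[f32]) -> Vec<f32> {
    let mut h = 0.0f32;
    a.iter()
        .zip(b)
        .zip(c)
        .zip(x)
        .map(|(((a, b), c), x)| {
            h = a * h + b * x;
            c * h
        })
        .collect()
}

fn main() {
    let y = selective_scan(&[0.9, 0.9], &[1.0, 1.0], &[1.0, 1.0], &[1.0, 0.0]);
    println!("{y:?}"); // [1.0, 0.9]
}
```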
 
* Mixtral quantized instruct.
* Fix a couple of typos.
 
* Add support for SD Turbo.
* Set Leading as the default in the euler_ancestral discrete scheduler (see the timestep sketch below).
* Use the appropriate default values for n_steps and guidance_scale.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
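Leading here refers to diffusers-style timestep spacing: with n inference steps out of N training timesteps, take every (N / n)-th timestep starting at 0 and traverse them in reverse. A simplified sketch that ignores the scheduler's offset handling:

```rust
fn leading_timesteps(n_steps: usize, train_timesteps: usize) -> Vec<usize> {
    let ratio = train_timesteps / n_steps;
    (0..n_steps).map(|i| i * ratio).rev().collect()
}

fn main() {
    // SD Turbo runs very few steps; with 4 of 1000: [750, 500, 250, 0]
    println!("{:?}", leading_timesteps(4, 1000));
}
```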
 
* Demonstrate LoRA transformers in the readme.
* Shorten the readme.
 
* Add support for the UL2 model family.
* Update the docs with UL2.
* Create ActivationWithOptionalGating to avoid polluting the activations (sketched below).
* Also refactor the quantized t5.
* Remove a useless conversion.
* Revert the Activation::NewGelu name change.
* Remove a useless return.
* Apply rustfmt and clippy recommendations.
* Reuse t5::ActivationWithOptionalGating in the quantized version.
* (Cosmetic change) use a match rather than ifs and avoid early returns.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
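The gating abstraction covers the two T5 feed-forward styles: plain activation, and the GLU-style gated form used by UL2/FLAN variants, where the activated gate projection multiplies a second projection. A scalar sketch with illustrative names, not candle's exact API:

```rust
// NewGELU, the tanh approximation of GELU used by these models.
fn new_gelu(x: f32) -> f32 {
    let inner = (2.0 / std::f32::consts::PI).sqrt() * (x + 0.044715 * x * x * x);
    0.5 * x * (1.0 + inner.tanh())
}

// With gating: act(gate) * up; without: just act(x).
fn ff_activation(gate: f32, up: f32, gated: bool) -> f32 {
    if gated {
        new_gelu(gate) * up
    } else {
        new_gelu(gate)
    }
}

fn main() {
    println!("gated: {}", ff_activation(1.0, 2.0, true));
    println!("plain: {}", ff_activation(1.0, 2.0, false));
}
```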
 
I think it makes more sense to have it there, since it's a seq2seq model with cross-attention, not an LM. There are also decoder-only T5 models that work as LMs, but that's not the standard setup.
 
* Put the onnx example behind a feature flag (sketched below).
* Exclude the onnx bits from the workspace.
* README tweaks.
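A sketch of gating code behind a Cargo feature, assuming the feature is named onnx as the commit suggests; users would build with --features onnx:

```rust
#[cfg(feature = "onnx")]
fn run() -> Result<(), Box<dyn std::error::Error>> {
    // ONNX-specific code lives here and only compiles with the feature enabled.
    Ok(())
}

#[cfg(not(feature = "onnx"))]
fn run() -> Result<(), Box<dyn std::error::Error>> {
    Err("this example requires the onnx feature; rebuild with --features onnx".into())
}

fn main() {
    if let Err(e) = run() {
        eprintln!("{e}");
    }
}
```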
 
Co-authored-by: figgefigge <fredric.1337mail.com>