| Commit message (Collapse) | Author | Age | Files | Lines |
| |
* fix FLUX.1 weights
* added flux1-dev.safetensors
| |
* Allow loading images with given std and mean
* OpenCLIP text encoder component
* Two MobileCLIP models
* Clippy fixes.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
| |
* correct optional SE layer dimensions.
* head_dim instead of num_heads is 32.
* update test example output.
| |
* Fix for parler-tts, do not add the last slice of padding tokens.
* Support for the mini model.
| |
* silero-vad v5 example
This change adds an example of how to run silero-vad v5
* PR: rename 'vad' to 'silero-vad'
* Update README.md
---------
Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
| |
* Add a readme for the parler-tts example.
* Remove the python decode script.
* mp4 tweaks.
* Another readme tweak.
| |
* Add the DAC model.
* More quantization support.
* Handle DAC decoding.
* Plug the DAC decoding in parler-tts.
| |
* Start sketching parler-tts support.
* Implement the attention.
* Add the example code.
* Fix the example.
* Add the description + t5 encode it.
* More of the parler forward pass.
* Fix the positional embeddings.
* Support random sampling in generation.
* Handle EOS.
* Add the python decoder.
* Proper causality mask.
| |
* Fix the marian tokenizer importer.
* Ignore the python caches.
| |
* Add gemma-2.
* Support a couple more models.
* Sliding window support.
* Example + readme updates.
* Update the main readme.
| |
Also squeeze the first dimension of the codes tensor in the example file to get the expected three dimensions.
| |
* add models support and example for THUDM/glm-4
* fix the ci report
* fmt
* fix
* Update README.org
* Update README.org
* fmt
* Update README.org
* README.md add codegeex4
* README.md add glm4
* Typo.
* change expect into ?
---------
Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
| |
* Add the flux autoencoder.
* Add the encoder down-blocks.
* Upsampling in the decoder.
* Sketch the flow matching model.
* More flux model.
* Add some of the positional embeddings.
* Add the rope embeddings.
* Add the sampling functions.
* Add the flux example.
* Fix the T5 bits.
* Proper T5 tokenizer.
* Clip encoder path fix.
* Get the clip embeddings.
* No configurable weights in layer norm.
* More weights related fixes.
* Yet another shape fix.
* DType fix.
* Fix a couple more shape issues.
* DType fixes.
* Fix the latent dims.
* Fix more shape issues.
* Autoencoder fixes.
* Get some generations out.
* Bugfix.
* T5 padding.
* Clippy fix.
* Add the decode only mode.
* Fix.
* More fixes.
* Finally get some generations to work.
* Add readme.
| |
* Fix cargo fmt.
* Clippy fix.
* Cosmetic tweaks.
| |
* fix: fix jina bert example logic
* feat: enable jina embeddings de
* feat: allow more flexibility on Jina Bert
| |
* bert attention mask
* Allow for using None as a mask.
* Revert part of the changes so that the proper default mask applies.
* Cosmetic change.
* Another cosmetic tweak.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
| |
* Add Llama 3.1 rope
* Clippy
* Format
* Clippy
* Add support for multiple eos tokens:
* Untagged either
* Remove either dep and fix settings.json
* Make the max positional embeddings configurable
|
| |
* onnx: fix pad, unsqueeze
both implementations have off-by-one errors:
- Pad 'reflect' cycle for eg `dim==3` is `[0,1,2,1]` which has length of
4 (or `dim*2 - 2`) not 5 (current code `dim*2 - 1`)
- Unsqueeze(-1) for tensor with `dim==3` should be 3 (ie `dim+index+1`)
not 2 (ie currently `dim+index`)
in addition, Pad is incorrectly calculating the starting padding.
If we want to pad out 2 elements to the start, and we have this cycle
of indices of length 6, then we should skip 4 elements, but currently
we skip 2. A more visual representation of what's going on is below:
```
pad_start: 2
data: [a,b,c,d]
indices: [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 2, 1, 0, ..] // zigzag between 0..4
actual: skip [ c d| c b a b]
expected: ~ skip ~ [ c b| a b c d]
```
The values between `[` and `|` are padding and the values between
`|` and `]` in the example should match the original data being padded.
* Fix clippy lints.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
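The index arithmetic described in the commit message above can be sketched in plain Rust. This is a minimal illustration under stated assumptions, not the code from the commit: the function names `reflect_pad_indices` and `normalize_unsqueeze_axis` are hypothetical, and the sketch works on indices only rather than on tensors.

```rust
/// Hypothetical helper: indices into the source data for 'reflect' padding
/// along a single axis of length `dim`.
fn reflect_pad_indices(dim: usize, pad_start: usize, pad_end: usize) -> Vec<usize> {
    assert!(dim >= 2, "reflect padding needs at least two elements");
    // One zigzag period, e.g. dim == 4 gives [0, 1, 2, 3, 2, 1],
    // which has length dim*2 - 2 (not dim*2 - 1).
    let cycle: Vec<usize> = (0..dim).chain((1..dim - 1).rev()).collect();
    let period = cycle.len();
    // Skip far enough that, after `pad_start` padding entries, the cycle is
    // back at index 0 (period 6, pad_start 2 => skip 4, not 2).
    let skip = (period - pad_start % period) % period;
    cycle
        .iter()
        .copied()
        .cycle()
        .skip(skip)
        .take(pad_start + dim + pad_end)
        .collect()
}

/// Hypothetical helper: resolve a possibly negative Unsqueeze axis for an
/// input of rank `rank`. Negative axes count from the end of the OUTPUT
/// shape, which has rank + 1 dimensions.
fn normalize_unsqueeze_axis(rank: usize, axis: i64) -> Option<usize> {
    let out_rank = rank as i64 + 1;
    let resolved = if axis < 0 { axis + out_rank } else { axis };
    (0..out_rank).contains(&resolved).then_some(resolved as usize)
}

fn main() {
    // data = [a, b, c, d], pad_start = 2: padding is [c, b], then the data.
    println!("{:?}", reflect_pad_indices(4, 2, 0)); // [2, 1, 0, 1, 2, 3]
    // Unsqueeze(-1) on a rank-3 tensor inserts the new axis at position 3.
    println!("{:?}", normalize_unsqueeze_axis(3, -1)); // Some(3)
}
```

With `data = [a, b, c, d]` and `pad_start = 2`, the indices `[2, 1, 0, 1, 2, 3]` select `[c b | a b c d]`, which is the "expected" row in the diagram above.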
| |
* feat(candle-transformers/models/codegeex4-9b): add codegeex4-9b transformers
* change mod.rs
* feat(candle-examples/codegeex4-9b)
* Update codegeex4_9b.rs
* Update main.rs
* Update codegeex4_9b.rs
* Update main.rs
* fmt
* fix
* fmt
* Clippy fix.
* Remove some print statements.
* Avoid using unwrap.
* 1. add README
2. change the print fmt
* Another clippy fix.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
| |
* add quantized version of qwen2 and corresponding example for qwen2-instruct
* fix quantized qwen2 clippy error
| |
* Support different resolutions in load_image()
* Added MobilenetV4 model.
* Add MobileNetv4 to README
|
| |
* Add EVA-02 model ( https://arxiv.org/abs/2303.11331 )
* Clippy fix.
* And apply fmt.
---------
Co-authored-by: v-espitalier <>
Co-authored-by: Laurent <laurent.mazare@gmail.com>
| |
Co-authored-by: v-espitalier <>
| |
Co-authored-by: v-espitalier <>
| |
* Add: DINOv2Reg4 with PlantCLEF2024 weights and example ( See https://arxiv.org/abs/2309.16588 and https://zenodo.org/records/10848263 )
* Remove extra files + update README to download them + remove extra lines
* minor fix (README remove extra spaces)
* minor fix (README: Fix image url)
* Modif: Add back interpolate_pos_encoding() + fix when no interpolation + remove extra comments + Update README ( source image changed and so the predictions )
* Fix: Improve code readability with '$ cargo clippy' and '$ cargo fmt'
* Another clippy fix.
---------
Co-authored-by: x-VEspit <vincent.espitalier@cirad.fr>
Co-authored-by: laurent <laurent.mazare@gmail.com>
| |
* define structs
* construct ResidualConvUnit
* forward() for ResidualConvUnit
* implement FeatureFusionBlock
* implement Scratch
* implement DPTHead
* add identity module
* implement forward for DPTHead
* add get_intermediate_layers to DinoVisionTransformer
* implement DepthAnythingV2
* some minor tweaks
* fix compile errors
* fix var builder prefixes
* setup initial example
* use fixed patch size of 37 (518 / 14)
* debugged until output
* print min and max values
* add some dynamism to the output location
* scale input image
* extract prep function
* extract output path function
* normalize image with magic mean and std
* add spectral coloring
* squeeze in the right place
* make interpolation optional
* use bail instead of panic
* omit unnecessary Shape call
* remove empty curly braces
* use bail instead of assert
* use vb and pp
* remove closures
* extract config object
* Apply rustfmt.
* Fix some clippy lints.
* More lints.
* Use the array methods.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
| |
* Support for the new Qwen2 models.
* Add more models.
|
| |
* first commit
* llava
* clippy and fmt
* some fixes
* minor fixes
* remove useless file
* refactor: Remove llava/constants.rs and update llava/mod.rs
* modify variable name
* modify code after clippy
* Minor tweaks.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
| |
* Use flash-attn in gemma.
* Fix flash-attn for head dim 256.
|
| |
* Add a slice_set op.
* Add some testing.
* Add the dedicated kv-cache module.
* Derive debug and clone.
* Expose more kv-cache functions.
* Return the current data when appending.
* Use the new cache in the quantized phi3 model.
|
| |
* Support embedding model gte-Qwen1.5-7B-instruct
This is a text embedding model based on Qwen2. They share the same
model architecture except for the last MLP module. This commit makes
minimal modifications to the existing Qwen2 implementation to support both
models.
An example is provided and has been verified against the official
PyTorch implementation.
* Avoid doing the 'last-token filtering' based on the absence of attention mask.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
| |
(#2187)
Threshold is 0.0 by default, negative values make more points included,
expanding the mask. Positive values make it more picky, making the mask
smaller.
Negative numbers start with a minus sign, which normally makes clap
consider it a flag.
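The effect of the threshold described above can be sketched as follows. This is a rough illustration only: the function name `mask_from_logits`, the strict `>` comparison, and the sample values are assumptions made for the sketch, not taken from the commit.

```rust
/// Hypothetical helper: a point is included in the mask when its logit is
/// strictly above `threshold` (the strict comparison is an assumption).
fn mask_from_logits(logits: &[f32], threshold: f32) -> Vec<bool> {
    logits.iter().map(|&logit| logit > threshold).collect()
}

/// Number of points included in a mask.
fn included(mask: &[bool]) -> usize {
    mask.iter().filter(|&&m| m).count()
}

fn main() {
    // Made-up logits, for illustration only.
    let logits = [-1.2_f32, -0.3, 0.1, 0.8];

    // A negative threshold arrives on the command line as a string that
    // starts with '-', which is why an argument parser may mistake it for a flag.
    let negative: f32 = "-0.5".parse().expect("valid float");

    println!("default  0.0: {}", included(&mask_from_logits(&logits, 0.0)));
    println!("negative {negative}: {}", included(&mask_from_logits(&logits, negative)));
    println!("positive 0.5: {}", included(&mask_from_logits(&logits, 0.5)));
}
```

In this sketch, lowering the threshold to `-0.5` includes one more point than the default, and raising it to `0.5` includes one fewer.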
|