Commit message | Author | Date | Files | Lines (-/+)
* Sync upstream MLX sdpa vector kernels with mask (#2718) [HEAD, main] | Eric Buehler | 2025-01-16 | 3 | -49/+486
    * Sync upstream mlx sdpa vector kernels with mask
    * Dispatch to the 2pass kernel
    * Format
* Bump the ug dependency. (#2720) | Laurent Mazare | 2025-01-16 | 2 | -4/+4
    * Bump the ug dependency.
    * Fix some test.
    * Fix the ug test.
* Fix the helium weights download. (#2717) | Laurent Mazare | 2025-01-13 | 1 | -1/+1
* Helium repo update. (#2716) | Laurent Mazare | 2025-01-13 | 2 | -2/+8
* Add the helium model. (#2715) | Laurent Mazare | 2025-01-13 | 4 | -0/+699
* Fixes for running Phi-4 quantized. (#2714) | Jani Monoses | 2025-01-13 | 2 | -2/+6
* ModernBERT model (#2713) | Jani Monoses | 2025-01-13 | 6 | -1/+612
    * layer_norm_no_bias
    * Modernbert model.
    * Format + cleanup error.
    Co-authored-by: laurent <laurent.mazare@gmail.com>
* Clippy fixes for 1.84. (#2710) | Laurent Mazare | 2025-01-10 | 2 | -6/+3
* Update cudarc. (#2708) | Laurent Mazare | 2025-01-08 | 1 | -1/+1
* Bump the crate version to 0.8.2. (#2703) | Laurent Mazare | 2025-01-07 | 5 | -16/+16
* add link to README (#2701) | Andrei Fajardo | 2025-01-04 | 1 | -0/+1
* Fix mistral attention on Metal (#2699) | Luka Zakrajšek | 2025-01-04 | 1 | -1/+2
    Co-authored-by: Luka Zakrajsek <luka.zakrajsek@soniox.com>
* UniPC for diffusion sampling (#2684) | Nick Senger | 2025-01-01 | 6 | -5/+1011
    * feat: Add unipc multistep scheduler
    * chore: Clippy and formatting
    * chore: Update comments
    * chore: Avoid unsafety in float ordering
    * refactor: Update Scheduler::step mutability requirements
    * fix: Corrector img2img
    * chore: Update unipc ref link to latest diffusers release
    * chore: Deduplicate float ordering
    * fix: Panic when running with dev profile
* Update the hf-hub dependency to 0.4.0. (#2691) | Laurent Mazare | 2024-12-31 | 2 | -5/+5
    * Update the hf-hub dependency to 0.4.0.
    * Fix the book.
    * Use 0.4.1.
* Actually remove the default hf-hub cache path for glm. (#2696) | Laurent Mazare | 2024-12-31 | 1 | -1/+1
* Use the default hf-hub cache for glm. (#2695) | Laurent Mazare | 2024-12-31 | 1 | -7/+10
* Flash-Attn upgrade / SoftCap Candle-FlashAttn [3/n] (#2690) | Michael Feil | 2024-12-31 | 3 | -4/+7
    * update flash-attn v1
    * restore: hdim224
    * add 224 flash_fwd_template
    * remove whitespace
    * softcap is working, including test and api.
    * make softcap test case better
    * unpadded lse added
* Flash-Attn upgrade / SoftCap Candle-FlashAttn [2/n] (#2689) | Michael Feil | 2024-12-31 | 4 | -3/+182
    * update flash-attn v1
    * restore: hdim224
    * add 224 flash_fwd_template
    * remove whitespace
    * softcap is working, including test and api.
    * make softcap test case better
    Co-authored-by: laurent <laurent.mazare@gmail.com>
* Flash-Attn upgrade / SoftCap Candle-FlashAttn [1/n] (#2688) | Michael Feil | 2024-12-31 | 41 | -82/+139
    * update flash-attn v1
    * restore: hdim224
    * add 224 flash_fwd_template
    * remove whitespace
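The SoftCap work in the three Flash-Attn entries above presumably refers to the usual attention-logit soft-capping (scores squashed as cap * tanh(scores / cap) before the softmax, as popularized by Gemma 2). A minimal sketch of that transform on a candle tensor, not the candle-flash-attn API itself:

```rust
use candle_core::{Device, Result, Tensor};

/// Soft-cap attention scores: cap * tanh(scores / cap).
/// The cap value used below is an illustrative constant, not a candle default.
fn softcap(scores: &Tensor, cap: f64) -> Result<Tensor> {
    (scores / cap)?.tanh()? * cap
}

fn main() -> Result<()> {
    let scores = Tensor::new(&[[10.0f32, -75.0, 3.0]], &Device::Cpu)?;
    // Large-magnitude logits are squashed into (-cap, cap) before the softmax.
    let capped = softcap(&scores, 50.0)?;
    println!("{capped}");
    Ok(())
}
```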
* Streamline the glm4 example. (#2694) | Laurent Mazare | 2024-12-31 | 3 | -147/+99
* Fix a cuda warning. (#2693) | Laurent Mazare | 2024-12-31 | 1 | -39/+44
* Update README.org (#2670) | jetsung | 2024-12-30 | 1 | -1/+1
    Fixes the command-line error in the CPU section of the documentation.
* Added XLMRobertaModel for Reranking (#2686) | Akshay Ballal | 2024-12-30 | 4 | -0/+853
    * add xlm-roberta-base
    * Add a task enum for fill-mask and reranker in the xlm-roberta example; update README and fix attention mask dimensions
        - Introduced a new `Task` enum to replace string task identifiers in the xlm-roberta example.
        - Updated the logic in `main.rs` to handle tasks using the new enum.
        - Enhanced README with example output for the fill-mask task.
        - Fixed dimension retrieval in `prepare_4d_attention_mask` for better clarity and safety.
    * Clippy fix.
    Co-authored-by: laurent <laurent.mazare@gmail.com>
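As a rough illustration of the `Task` enum mentioned above, replacing stringly-typed task identifiers with an enum keeps dispatch exhaustive and typo-proof. The variant names and parsing below are assumptions for the sketch, not the example's actual code (which may derive clap's ValueEnum instead):

```rust
/// Illustrative only: the real example's enum may differ in variant names.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Task {
    FillMask,
    Reranker,
}

impl std::str::FromStr for Task {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "fill-mask" => Ok(Task::FillMask),
            "reranker" => Ok(Task::Reranker),
            other => Err(format!("unknown task: {other}")),
        }
    }
}

fn main() {
    // Dispatch on the enum instead of comparing raw strings.
    let task: Task = "fill-mask".parse().unwrap();
    match task {
        Task::FillMask => println!("running fill-mask"),
        Task::Reranker => println!("running reranker"),
    }
}
```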
* Fix bug in whisper transformer (#2681) | mert-kurttutan | 2024-12-24 | 1 | -0/+1
    * Fix bug in whisper transformer - due to num_threads going to zero in single threaded case
    * Apply rustfmt.
    Co-authored-by: Laurent <laurent.mazare@gmail.com>
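The diff is a single added line; presumably it keeps the derived thread count from collapsing to zero when only one hardware thread is available. A hypothetical sketch of that kind of guard (the function name and the halving heuristic are illustrative, not the actual whisper code):

```rust
fn compute_num_threads(available_parallelism: usize) -> usize {
    // Halving the available parallelism is a common heuristic; the max(1)
    // guard keeps the count from collapsing to zero on a single-threaded host.
    (available_parallelism / 2).max(1)
}

fn main() {
    assert_eq!(compute_num_threads(1), 1); // without the guard this would be 0
    assert_eq!(compute_num_threads(8), 4);
    println!("ok");
}
```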
* Fix Batcher iterator break when return_last_incomplete_batch and items.is_empty (#2654) (#2655) | hhllhhyyds | 2024-12-24 | 1 | -4/+4
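The fix concerns the iterator's termination condition: with `return_last_incomplete_batch` set, the trailing partial batch should only be yielded when it is non-empty, and iteration should stop otherwise. A generic sketch of that logic under those assumptions, not candle-datasets' actual `Batcher` implementation:

```rust
/// Generic batching iterator used only to illustrate the termination logic.
struct Batcher<I: Iterator> {
    inner: I,
    batch_size: usize,
    return_last_incomplete_batch: bool,
}

impl<I: Iterator> Iterator for Batcher<I> {
    type Item = Vec<I::Item>;

    fn next(&mut self) -> Option<Self::Item> {
        let mut items = Vec::with_capacity(self.batch_size);
        for _ in 0..self.batch_size {
            match self.inner.next() {
                Some(item) => items.push(item),
                // Inner iterator exhausted: only return a partial batch if one
                // was requested and it actually contains items.
                None => {
                    if self.return_last_incomplete_batch && !items.is_empty() {
                        return Some(items);
                    }
                    return None;
                }
            }
        }
        Some(items)
    }
}

fn main() {
    let b = Batcher { inner: 0..5, batch_size: 2, return_last_incomplete_batch: true };
    let batches: Vec<Vec<i32>> = b.collect();
    assert_eq!(batches, vec![vec![0, 1], vec![2, 3], vec![4]]);
    println!("{batches:?}");
}
```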
* Fix position encodings for Pixtral (#2678) | Amélie Royer | 2024-12-23 | 1 | -13/+55
    * init commit: add position id in meshgrid
    * pass in subsampled positions
    * clippy fix
    * clippy fix
* Add a Context trait similar to anyhow::Context. (#2676) | Laurent Mazare | 2024-12-22 | 13 | -41/+97
    * Add a Context trait similar to anyhow::Context.
    * Switch two unwrap to context.
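anyhow's `Context` attaches a human-readable message while converting an `Option` or `Result` into an error. A minimal sketch of what a similar trait can look like, using a stand-in error type rather than candle-core's actual `Error`:

```rust
use std::fmt::Display;

// Stand-in error type for the sketch; candle has its own Error enum.
#[derive(Debug)]
struct Error(String);

trait Context<T> {
    /// Attach a context message when the value is missing or erroneous,
    /// mirroring anyhow::Context.
    fn context<C: Display>(self, ctx: C) -> Result<T, Error>;
}

impl<T> Context<T> for Option<T> {
    fn context<C: Display>(self, ctx: C) -> Result<T, Error> {
        self.ok_or_else(|| Error(format!("{ctx}: value was None")))
    }
}

impl<T, E: Display> Context<T> for Result<T, E> {
    fn context<C: Display>(self, ctx: C) -> Result<T, Error> {
        self.map_err(|e| Error(format!("{ctx}: {e}")))
    }
}

fn main() {
    // Instead of `.unwrap()`, failures carry a description of what was expected.
    let first = [1, 2, 3].first().copied().context("expected a non-empty slice");
    println!("{first:?}");
}
```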
* make DepthAnythingV2 more reusable (#2675) | Edgar Riba | 2024-12-21 | 2 | -23/+27
    * make DepthAnythingV2 more reusable
    * Fix clippy lints.
    Co-authored-by: laurent <laurent.mazare@gmail.com>
* Bump the crate version to 0.8.1. (#2662) | Laurent Mazare | 2024-12-07 | 5 | -16/+16
* Change/bert encoder public (#2658) | Justin Sing | 2024-12-04 | 1 | -21/+30
    * change: BertEncoder struct to public
    * change: make certain fields in Config struct public
    * change: all fields in bert config struct to be public
    * change: add clone to bert encoder and others
    * Clippy fix.
    Co-authored-by: Laurent <laurent.mazare@gmail.com>
* Add Nvembed v2 model (#2649) | cdoko | 2024-12-03 | 6 | -0/+803
    * Update mod.rs
    * Create mod.rs
    * Create decoder.rs
    * Create model.rs
    * Create main.rs
    * Create README.md
    * Update README.md
    * Update main.rs
    * Update and rename decoder.rs to embedding.rs
    * Update mod.rs
    * Update model.rs
* add scatter add (#2656) | zachcp | 2024-12-01 | 2 | -0/+2
* add u32 - U32 gather (#2653) | zachcp | 2024-11-30 | 2 | -79/+81
* Clippy fixes for the cuda feature. (#2650) | Laurent Mazare | 2024-11-29 | 2 | -11/+11
* Adds support for stella_en_v5 embedding model -400M variant (#2608) | iskng | 2024-11-29 | 3 | -112/+555
    * Adds support for stella_en_v5 embedding model -400M variant
    * Unified stella
    * WIP: Unified Stella
    * Combined stella for both 1.5B and 400M variants
    * Cargo fmt for the CI
    * removed redundant stella-400m model and example after merge into stella-en-v5
    * cargo fmt --all
    Co-authored-by: Anubhab Bandyopadhyay <4890833+AnubhabB@users.noreply.github.com>
    Co-authored-by: laurent <laurent.mazare@gmail.com>
* Lint fixes introduced with Rust 1.83 (#2646) | Anubhab Bandyopadhyay | 2024-11-28 | 19 | -55/+57
    * Fixes for lint errors introduced with Rust 1.83
    * rustfmt
    * Fix more lints.
    Co-authored-by: Laurent <laurent.mazare@gmail.com>
* Fix for whisper-microphone example failure if audio isn't chunk aligned (#2645) | Adam Nelson | 2024-11-27 | 1 | -3/+17
    At least on my macOS Sequoia system (MBP 14" 2021, M1 Pro), when I run the
    `whisper-microphone` example after it has gathered 10 seconds of audio, it
    fails before the transcription:

        Error: Insufficient buffer size 384 for input channel 0, expected 1024

    At least for the audio device I'm using (Airpods Pro Max), there is no
    guarantee that each audio buffer is a multiple of 1024 samples. Thus at the
    end of the 10 seconds, `buffered_pcm` can have some samples at the end that
    do not form a complete 1024 sample chunk. This fixes that by tracking when
    there is a partial chunk at the end of the buffer, and leaving it in
    `buffered_pcm` to be processed on the next loop iteration.

    Note that, in the interest of keeping this PR as small as possible, I
    didn't make any other changes to this example.
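A rough sketch of the carry-over logic described above: drain only the complete 1024-sample chunks out of the buffered PCM and leave any partial tail for the next loop iteration (illustrative; the real example feeds the drained samples into the Whisper pipeline):

```rust
const CHUNK_LEN: usize = 1024;

/// Split off only the complete 1024-sample chunks, keeping any partial
/// tail in `buffered_pcm` for the next loop iteration.
fn take_full_chunks(buffered_pcm: &mut Vec<f32>) -> Vec<f32> {
    let full_len = (buffered_pcm.len() / CHUNK_LEN) * CHUNK_LEN;
    buffered_pcm.drain(..full_len).collect()
}

fn main() {
    // 2.5 chunks worth of samples: two full chunks are consumed,
    // half a chunk stays buffered for the next iteration.
    let mut buffered_pcm = vec![0.0f32; 2 * CHUNK_LEN + 512];
    let ready = take_full_chunks(&mut buffered_pcm);
    assert_eq!(ready.len(), 2 * CHUNK_LEN);
    assert_eq!(buffered_pcm.len(), 512);
    println!("processed {} samples, {} left buffered", ready.len(), buffered_pcm.len());
}
```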
* Onnx Support for Sign operation #2641 (#2642) | Ionut Mihalcea | 2024-11-26 | 2 | -0/+47
    * Support for Sign operation #2641
    * Apply rustfmt.
    Co-authored-by: Laurent <laurent.mazare@gmail.com>
* Provide a method to allow PTH files with state maps to be loaded. (#2639) | zachcp | 2024-11-26 | 1 | -1/+11
    * Provide a method to allow PTH files with state maps to be loaded.
    * add a line to the doc
    * String -> &str
* fix typo (#2606) | Andrei Fajardo | 2024-11-23 | 1 | -1/+1
* Tweak the CI to avoid running out of disk space. (#2630) | Laurent Mazare | 2024-11-19 | 1 | -0/+3
    * Tweak the CI to avoid running out of disk space.
    * Linux only.
* 20241118 docs (#2629) | zachcp | 2024-11-19 | 27 | -12/+72
    * module docs
    * varbuilder gguf docs
    * add a link to gguf files
    * small additional mod doc titles
    * safetensor docs
    * more core docs
    * more module docs in candle_core
    * 2 more link fixes
* Import the ggml_cuda_dp4a function. (#2628) | Laurent Mazare | 2024-11-19 | 1 | -33/+44
* Fix for clippy. (#2626) | Laurent Mazare | 2024-11-18 | 1 | -1/+1
* Module Docs (#2624) | zachcp | 2024-11-18 | 39 | -115/+170
    * update whisper
    * update llama2c
    * update t5
    * update phi and t5
    * add a blip model
    * qlamma doc
    * add two new docs
    * add docs and emoji
    * additional models
    * openclip
    * pixtral
    * edits on the model docs
    * update yu
    * update a few more models
    * add persimmon
    * add model-level doc
    * names
    * update module doc
    * links in hiera
    * remove empty URL
    * update more hyperlinks
    * updated hyperlinks
    * more links
    * Update mod.rs
    Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
* More Model Module Docs (#2623) | zachcp | 2024-11-17 | 12 | -72/+291
    * dinov2
    * add another example
    * add dinov2reg4
    * eva2
    * efficientvit
    * moondream
    * update t5
    * update t5
    * rwkv
    * stable diffusion docs
    * add wasm link
    * add segment_anything
    * adjust for clippy
    * ignore bertdoc
    * dinov2 ignore
    * update block to be text
    * remove the rust blocks for the moment
    * bump python to 3.11
    * add a setup-python step
    * add py311 to test as well
* Module Docs (#2620) | zachcp | 2024-11-16 | 5 | -10/+126
    * update bert docs
    * update based
    * update bigcode
    * add pixtral
    * add flux as well
* Remove some unused macros. (#2618) | Laurent Mazare | 2024-11-15 | 9 | -14/+13
    * Remove some unused macros.
    * More unused fixes.
* Documentation Pass for Models (#2617) | zachcp | 2024-11-15 | 94 | -51/+1001
    * links in chinese_clip
    * links for clip model
    * add mod docs for flux and llava
    * module doc for MMDIT and MIMI
    * add docs for a few more models
    * mod docs for bert naser and beit
    * add module docs for convmixer colpali codegeex and chatglm
    * add another series of moddocs
    * add fastvit-llama2_c
    * module docs mamba -> mobileone
    * module docs from moondream-phi3
    * mod docs for quantized and qwen
    * update to yi
    * fix long names
    * Update llama2_c.rs
    * Update llama2_c_weights.rs
    * Fix the link for mimi + tweaks
    Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
* Add max-all/min-all. (#2616) | Laurent Mazare | 2024-11-14 | 1 | -0/+36
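Assuming these land as whole-tensor reductions on `Tensor` (the method names below are inferred from the commit title), usage would look roughly like:

```rust
use candle_core::{Device, Result, Tensor};

fn main() -> Result<()> {
    let t = Tensor::new(&[[3f32, -1., 7.], [0., 5., -4.]], &Device::Cpu)?;
    // Reduce over every element of the tensor, yielding scalar tensors.
    let max = t.max_all()?; // 7
    let min = t.min_all()?; // -4
    println!("max = {}, min = {}", max.to_scalar::<f32>()?, min.to_scalar::<f32>()?);
    Ok(())
}
```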