Commit message | Author | Age | Files | Lines
* Sync upstream MLX sdpa vector kernels with mask (#2718) [HEAD, main] | Eric Buehler | 2025-01-16 | 3 | -49/+486
* Bump the ug dependency. (#2720) | Laurent Mazare | 2025-01-16 | 2 | -4/+4
* Fix the helium weights download. (#2717) | Laurent Mazare | 2025-01-13 | 1 | -1/+1
* Helium repo update. (#2716) | Laurent Mazare | 2025-01-13 | 2 | -2/+8
* Add the helium model. (#2715) | Laurent Mazare | 2025-01-13 | 4 | -0/+699
* Fixes for running Phi-4 quantized. (#2714) | Jani Monoses | 2025-01-13 | 2 | -2/+6
* ModernBERT model (#2713) | Jani Monoses | 2025-01-13 | 6 | -1/+612
* Clippy fixes for 1.84. (#2710) | Laurent Mazare | 2025-01-10 | 2 | -6/+3
* Update cudarc. (#2708) | Laurent Mazare | 2025-01-08 | 1 | -1/+1
* Bump the caret version to 0.8.2. (#2703) | Laurent Mazare | 2025-01-07 | 5 | -16/+16
* add link to README (#2701) | Andrei Fajardo | 2025-01-04 | 1 | -0/+1
* Fix mistral attention on Metal (#2699) | Luka Zakrajšek | 2025-01-04 | 1 | -1/+2
* UniPC for diffusion sampling (#2684) | Nick Senger | 2025-01-01 | 6 | -5/+1011
* Update the hf-hub dependency to 0.4.0. (#2691) | Laurent Mazare | 2024-12-31 | 2 | -5/+5
* Actually remove the default hf-hub cache path for glm. (#2696) | Laurent Mazare | 2024-12-31 | 1 | -1/+1
* Use the default hf-hub cache for glm. (#2695) | Laurent Mazare | 2024-12-31 | 1 | -7/+10
* Flash-Attn upgrade / SoftCap Candle-FlashAttn [3/n] (#2690) | Michael Feil | 2024-12-31 | 3 | -4/+7
* Flash-Attn upgrade / SoftCap Candle-FlashAttn [2/n] (#2689) | Michael Feil | 2024-12-31 | 4 | -3/+182
* Flash-Attn upgrade / SoftCap Candle-FlashAttn [1/n] (#2688) | Michael Feil | 2024-12-31 | 41 | -82/+139
* Streamline the glm4 example. (#2694) | Laurent Mazare | 2024-12-31 | 3 | -147/+99
* Fix a cuda warning. (#2693) | Laurent Mazare | 2024-12-31 | 1 | -39/+44
* Update README.org (#2670) | jetsung | 2024-12-30 | 1 | -1/+1
* Added XLMRobertaModel for Reranking (#2686) | Akshay Ballal | 2024-12-30 | 4 | -0/+853
* Fix bug in whisper transformer (#2681) | mert-kurttutan | 2024-12-24 | 1 | -0/+1
* Fix Batcher iterator break when return_last_incomplete_batch and items.is_emp... | hhllhhyyds | 2024-12-24 | 1 | -4/+4
* Fix position encodings for Pixtral (#2678) | Amélie Royer | 2024-12-23 | 1 | -13/+55
* Add a Context trait similar to anyhow::Context. (#2676) | Laurent Mazare | 2024-12-22 | 13 | -41/+97
* make DepthAnythingV2 more reusable (#2675) | Edgar Riba | 2024-12-21 | 2 | -23/+27
* Bump the crate version to 0.8.1. (#2662) | Laurent Mazare | 2024-12-07 | 5 | -16/+16
* Change/bert encoder public (#2658) | Justin Sing | 2024-12-04 | 1 | -21/+30
* Add Nvembed v2 model (#2649) | cdoko | 2024-12-03 | 6 | -0/+803
* add scatter add (#2656) | zachcp | 2024-12-01 | 2 | -0/+2
* add u32 - U32 gather (#2653) | zachcp | 2024-11-30 | 2 | -79/+81
* Clippy fixes for the cuda feature. (#2650) | Laurent Mazare | 2024-11-29 | 2 | -11/+11
* Adds support for stella_en_v5 embedding model -400M variant (#2608) | iskng | 2024-11-29 | 3 | -112/+555
* Lint fixes introduced with Rust 1.83 (#2646) | Anubhab Bandyopadhyay | 2024-11-28 | 19 | -55/+57
* Fix for whisper-microphone example failure if audio isn't chunk aligned (#2645) | Adam Nelson | 2024-11-27 | 1 | -3/+17
* Onnx Support for Sign operation #2641 (#2642) | Ionut Mihalcea | 2024-11-26 | 2 | -0/+47
* Provide a method to allow PTH files with state maps to be loaded. (#2639) | zachcp | 2024-11-26 | 1 | -1/+11
* fix typo (#2606) | Andrei Fajardo | 2024-11-23 | 1 | -1/+1
* Tweak the CI to avoid running out of disk space. (#2630) | Laurent Mazare | 2024-11-19 | 1 | -0/+3
* 20241118 docs (#2629) | zachcp | 2024-11-19 | 27 | -12/+72
* Import the ggml_cuda_dp4a function. (#2628) | Laurent Mazare | 2024-11-19 | 1 | -33/+44
* Fix for clippy. (#2626) | Laurent Mazare | 2024-11-18 | 1 | -1/+1
* Module Docs (#2624) | zachcp | 2024-11-18 | 39 | -115/+170
* More Model Module Docs (#2623) | zachcp | 2024-11-17 | 12 | -72/+291
* Module Docs (#2620) | zachcp | 2024-11-16 | 5 | -10/+126
* Remove some unused macros. (#2618) | Laurent Mazare | 2024-11-15 | 9 | -14/+13
* Documentation Pass for Models (#2617) | zachcp | 2024-11-15 | 94 | -51/+1001
* Add max-all/min-all. (#2616) | Laurent Mazare | 2024-11-14 | 1 | -0/+36