Commit message | Author | Date | Files | Lines
...
* Add some missing index-select metal kernels. (#2613) | Laurent Mazare | 2024-11-12 | 3 | -2/+16
* Bump the crate version to 0.8.0. (#2612) | Laurent Mazare | 2024-11-12 | 5 | -16/+16
* Update docs (#2553) | zachcp | 2024-11-11 | 10 | -0/+48
* Add some fast Metal MLX SDPA kernels (#2584) | Eric Buehler | 2024-11-05 | 7 | -15/+2007
* Improved launch config for layer-norm/rms-norm. (#2591) | Laurent Mazare | 2024-11-04 | 3 | -12/+72
* Add the SmolLM2 models. (#2595) | Laurent Mazare | 2024-11-03 | 3 | -18/+73
* Support Skip Layer Guidance (SLG) for Stable Diffusion 3.5 Medium (#2590) | Czxck001 | 2024-11-01 | 3 | -10/+79
* Lazy upcasting for t5. (#2589) | Laurent Mazare | 2024-10-30 | 3 | -34/+59
* Support sd3.5 medium and MMDiT-X (#2587) | Czxck001 | 2024-10-30 | 4 | -35/+269
* Reduce memory usage for sd 3.5. (#2582) | Laurent Mazare | 2024-10-28 | 1 | -0/+2
* Release the mmdit model earlier to reduce memory usage. (#2581) | Laurent Mazare | 2024-10-28 | 1 | -16/+17
* UG metal integration. (#2580) | Laurent Mazare | 2024-10-27 | 8 | -18/+92
* Support for UG kernels. (#2579) | Laurent Mazare | 2024-10-27 | 8 | -2/+139
* Stable diffusion 3.5 support. (#2578) | Laurent Mazare | 2024-10-27 | 5 | -85/+209
* Update README.md (#2577) | sashaphmn | 2024-10-26 | 1 | -1/+2
* Fix the repo name for llama 3.1. (#2576) | Laurent Mazare | 2024-10-26 | 3 | -7/+7
* use softmax_last_dim (metal and cuda kernel) in llama attention layer (#2572) | Zack Angelo | 2024-10-23 | 1 | -1/+2
* ONNX: GatherElements, Xor (#2568) | Anubhab Bandyopadhyay | 2024-10-17 | 2 | -0/+582
* Testcases (#2567) | Anubhab Bandyopadhyay | 2024-10-17 | 2 | -3/+278
* onnx: ReduceMin/Max Ops (#2563) | Anubhab Bandyopadhyay | 2024-10-15 | 2 | -1/+1211
* Enable stable-diffusion 3 on metal. (#2560) | Laurent Mazare | 2024-10-14 | 4 | -12/+11
* Adds support for Stella_en_v5 embedding model - 1.5B variant (#2551) | Anubhab Bandyopadhyay | 2024-10-13 | 4 | -0/+804
* fix: Allow marian configs to deserialize from json. (#2556) | Mikarific | 2024-10-13 | 1 | -1/+2
* Fix the guide to gain access to Stable Diffusion 3 Medium (#2559) | Czxck001 | 2024-10-13 | 1 | -2/+9
* Add Stable Diffusion 3 Example (#2558) | Czxck001 | 2024-10-13 | 16 | -34/+751
* feat: intergrate chinese clip and add example (#2555) | SethWen | 2024-10-10 | 5 | -0/+1358
* Add BertForMaskedLM to support SPLADE Models (#2550) | Akshay Ballal | 2024-10-07 | 3 | -0/+335
* improve (#2548) | Jorge António | 2024-10-07 | 1 | -0/+1
* Switch to using the MLX matmul by default. (#2547) | Laurent Mazare | 2024-10-06 | 1 | -3/+3
* pyo3 update. (#2545) | Laurent Mazare | 2024-10-06 | 5 | -27/+22
* Tensor tools print all (#2543) | Laurent Mazare | 2024-10-05 | 1 | -0/+29
* Add required feature for whisper example in Readme (#2539) | dengelt | 2024-10-04 | 1 | -1/+1
* Make the RNN configs accessible from the models. (#2541) | Laurent Mazare | 2024-10-04 | 3 | -74/+103
* Fix for cudnn bf16 conv2d. (#2535) | Laurent Mazare | 2024-10-02 | 2 | -10/+14
* Support whisper large-v3 turbo in the whisper-microphone example. (#2533) | Laurent Mazare | 2024-10-02 | 1 | -0/+3
* Add support for cuda streams. (#2532) | Laurent Mazare | 2024-10-02 | 3 | -0/+24
* Add whisper large-v3 turbo to the example. (#2531) | Laurent Mazare | 2024-10-02 | 1 | -0/+3
* Add a seed to the flux example. (#2529) | Laurent Mazare | 2024-10-02 | 1 | -3/+10
* Tweak some metal tests. (#2528) | Laurent Mazare | 2024-10-02 | 2 | -62/+23
* Efficient implementation of `Tensor::ones()` for `metal` (#2512) | Anubhab Bandyopadhyay | 2024-10-01 | 5 | -4/+194
* Cuda quantized mmv bugfix. (#2526) | Laurent Mazare | 2024-10-01 | 1 | -1/+25
* Add ColPali (#2524) | Akshay Ballal | 2024-10-01 | 7 | -1/+394
* Refactor the whisper microphone example. (#2523) | Laurent Mazare | 2024-10-01 | 2 | -82/+74
* Add/lstm direction (#2455) | Justin Sing | 2024-09-30 | 1 | -8/+25
* Yet another cuda qmm padding fix. (#2509) | Laurent Mazare | 2024-09-30 | 1 | -25/+55
* Pixtral polishing. (#2522) | Laurent Mazare | 2024-09-30 | 2 | -12/+29
* Add Pixtral. (#2521) | Laurent Mazare | 2024-09-30 | 9 | -19/+822
* Add PaliGemma. (#2519) | Laurent Mazare | 2024-09-29 | 5 | -0/+434
* Paligemma siglip vision config (#2518) | Laurent Mazare | 2024-09-29 | 1 | -0/+54
* Bump the crate version to 0.7.2. (#2517) | Laurent Mazare | 2024-09-29 | 5 | -16/+16