path: root/candle-metal-kernels
Commit message  Author  Age  Files  Lines
...
* More flexible matmul contiguity checks. (#1949)  Laurent Mazare  2024-03-27  1  -4/+8
* Extend supported dtypes for metal (im2col & upsample_2d) (#1938)  Thomas Santerre  2024-03-26  1  -0/+8
* Contiguous variant of the rope kernel. (#1929)  Laurent Mazare  2024-03-25  2  -5/+73
* Fast kernels for rotary embeddings. (#1928)  Laurent Mazare  2024-03-24  2  -0/+64
* Add support for strided index-select on Metal (#1909)  Thomas Santerre  2024-03-22  3  -15/+119
* Add support for conv_transpose2d on Metal backend (#1903)  Thomas Santerre  2024-03-21  2  -0/+144
* RmsNorm kernel for metal. (#1895)  Laurent Mazare  2024-03-21  2  -0/+114
* Add support for conv_transpose1d for metal backend (#1874)  Thomas Santerre  2024-03-19  3  -0/+347
* Add avg_pool2d metal implementation for the metal backend (#1869)  Thomas Santerre  2024-03-18  3  -13/+194
* Add support for max_pool2d for Metal backend (#1863)  Thomas Santerre  2024-03-18  3  -1/+353
* add test for index add and add missing match statements (#1862)  Thomas Santerre  2024-03-17  2  -15/+139
* add support for casting between all datatypes (#1860)  Thomas Santerre  2024-03-17  2  -96/+211
* Optimize the cat operation on contiguous tensors (#1855)  Laurent Mazare  2024-03-17  3  -1/+78
* Add support for index u8/i64 and input f16/bf16 scatter-add on metal (#1849)  Thomas Santerre  2024-03-17  2  -2/+115
* Bump the crate versions to 0.4.2. (#1821)  Laurent Mazare  2024-03-08  1  -1/+1
* Metal random-generation bug fixes (#1811)  Niklas Hallqvist  2024-03-08  2  -12/+24
* Bump the version number to 0.4.1. (#1768)  Laurent Mazare  2024-02-27  1  -1/+1
* feat: add silu activation function (#1706)  OlivierDehaene  2024-02-14  3  -1/+25
* Bump the crate version to 0.4.0. (#1658)  Laurent Mazare  2024-02-04  1  -1/+1
* Merge pull request #1606 from FL33TW00D/feature/larger-batches  Christopher Fleetwood  2024-01-29  2  -7/+6
|\
| * chore: final  FL33TW00D  2024-01-22  2  -15/+10
| * chore: actual fix  FL33TW00D  2024-01-19  2  -2/+3
| * chore: switch to buffer  FL33TW00D  2024-01-19  2  -10/+14
| * fix: larger batches  FL33TW00D  2024-01-18  2  -7/+6
* | Merge pull request #1533 from huggingface/ivarflakstad/metal-prng  ivarflakstad  2024-01-22  3  -4/+402
|\ \
| |/
|/|
| * Revert public EncoderParam  Ivar Flakstad  2024-01-17  1  -1/+1
| * Merge branch 'main' into ivarflakstad/metal-prng  Ivar Flakstad  2024-01-17  4  -84/+5300
| |\
| * | Update metal random kernel and set_seed method  Ivar Flakstad  2024-01-17  1  -8/+10
| * | Seed should be updated by random kernel result.  Ivar Flakstad  2024-01-15  3  -20/+48
| * | Merge branch 'main' into ivarflakstad/metal-prng  Ivar Flakstad  2024-01-14  2  -30/+50
| |\ \
| * | | fmt  Ivar Flakstad  2024-01-12  1  -9/+29
| * | | Merge branch 'main' into ivarflakstad/metal-prng  Ivar Flakstad  2024-01-12  9  -24/+206
| |\ \ \
| * \ \ \ Merge branch 'main' into ivarflakstad/metal-prng  Ivar Flakstad  2024-01-07  6  -8/+77
| |\ \ \ \
| * | | | | Gaussian normal distribution of PRNG via Box-Muller transform  Ivar Flakstad  2024-01-07  3  -86/+178
| * | | | | Implement hybrid Tausworthe + LCG psuedo random number generator in metal  Ivar Flakstad  2024-01-05  3  -4/+264
* | | | | | Merge pull request #1602 from mimiquate/fix-metal-kernel-type  ivarflakstad  2024-01-18  1  -1/+1
|\ \ \ \ \ \
| |_|_|_|_|/
|/| | | | |
| * | | | | Fixes metal kernel u8 type  Gonzalo  2024-01-17  1  -1/+1
| | |_|_|/
| |/| | |
* / | | | Quantized GGUF style (#1523)  Nicolas Patry  2024-01-17  4  -75/+5295
|/ / / /
* | | | Metal: Activate bfloat affine and add benchmark (#1543)  ivarflakstad  2024-01-12  1  -7/+7
* | | | Metal: f16 and bf16 where_cond + benchmark (#1545)  ivarflakstad  2024-01-12  1  -23/+43
| |_|/
|/| |
* | | remove metal version check  Baye Dieng  2024-01-11  1  -2/+0
* | | close ifdef  Baye Dieng  2024-01-11  1  -1/+1
* | | feat(bf16): add cast support + tests for cast + bin ops (#1524)  Kyle McCarthy  2024-01-11  4  -15/+191
* | | Use __HAVE_BFLOAT__ to check for bfloat support instead of metal version chec...  ivarflakstad  2024-01-10  6  -6/+6
* | | Add relu kernel for metal (#1488)  Juarez Bochi  2024-01-10  2  -2/+10
| |/
|/|
* | Adding bfloat16 support for the cast kernels. (#1520)  Nicolas Patry  2024-01-04  1  -0/+2
* | Metal: support unary abs (#1503)  Gonzalo  2023-12-30  2  -1/+5
* | Metal: more u8/u32 (#1502)  Gonzalo  2023-12-29  4  -4/+17
* | Metal: i64 basic support (#1495)  Gonzalo  2023-12-29  6  -1/+48
* | fix bad pattern matching and function name  Baye Dieng  2023-12-29  2  -4/+4