path: root/candle-metal-kernels
Commit message    Author    Age    Files    Lines
* Metal: Activate bfloat affine and add benchmark (#1543)    ivarflakstad    2024-01-12    1    -7/+7
* Metal: f16 and bf16 where_cond + benchmark (#1545)    ivarflakstad    2024-01-12    1    -23/+43
* remove metal version check    Baye Dieng    2024-01-11    1    -2/+0
* close ifdef    Baye Dieng    2024-01-11    1    -1/+1
* feat(bf16): add cast support + tests for cast + bin ops (#1524)    Kyle McCarthy    2024-01-11    4    -15/+191
* Use __HAVE_BFLOAT__ to check for bfloat support instead of metal version chec...    ivarflakstad    2024-01-10    6    -6/+6
* Add relu kernel for metal (#1488)    Juarez Bochi    2024-01-10    2    -2/+10
* Adding bfloat16 support for the cast kernels. (#1520)    Nicolas Patry    2024-01-04    1    -0/+2
* Metal: support unary abs (#1503)    Gonzalo    2023-12-30    2    -1/+5
* Metal: more u8/u32 (#1502)    Gonzalo    2023-12-29    4    -4/+17
* Metal: i64 basic support (#1495)    Gonzalo    2023-12-29    6    -1/+48
* fix bad pattern matching and function name    Baye Dieng    2023-12-29    2    -4/+4
* add urecip op to metal backend    Baye Dieng    2023-12-28    2    -3/+6
* Bump the crate version to 0.3.3. (#1490)    Laurent Mazare    2023-12-28    1    -1/+1
* Adding upsample_nearest_2d.    Nicolas Patry    2023-12-25    2    -0/+104
* Fixing matmul for convolutions.    Nicolas Patry    2023-12-25    1    -2/+2
* Adding the convolutions (1d + 2d) to candle on metal.    Nicolas Patry    2023-12-21    4    -74/+260
* Merge pull request #1318 from huggingface/metal4    Nicolas Patry    2023-12-20    15    -390/+1810
|\
| * Optimizing decode matmul (Phi at 28tok/s on M3).    Nicolas Patry    2023-12-20    1    -5/+15
| * Clippy pass.    Nicolas Patry    2023-12-18    1    -1/+0
| * Missing cast.    Nicolas Patry    2023-12-18    1    -0/+1
| * Index add.    Nicolas Patry    2023-12-18    2    -56/+109
| * Scatter add.    Nicolas Patry    2023-12-18    2    -7/+97
| * Adding gather op.    Nicolas Patry    2023-12-17    2    -15/+125
| * Adding CMP    Nicolas Patry    2023-12-17    2    -13/+24
| * Finish reduce kernels.    Nicolas Patry    2023-12-17    4    -15/+227
| * Addressing a lot of comments.    Nicolas Patry    2023-12-15    2    -11/+16
| * Remove test file.    Nicolas Patry    2023-12-15    1    -209/+0
| * Renamed all kernel names.    Nicolas Patry    2023-12-15    5    -36/+36
| * Better error message on older macos    Nicolas Patry    2023-12-15    1    -3/+5
| * Adding a bunch of docs !    Nicolas Patry    2023-12-15    1    -0/+17
| * More cleanup.    Nicolas Patry    2023-12-15    1    -15/+16
| * Fixing softmax.    Nicolas Patry    2023-12-15    1    -4/+7
| * Fix softmax for long sequences (missing barrier).    Nicolas Patry    2023-12-14    2    -17/+49
| * Fix use resource.    Nicolas Patry    2023-12-14    1    -0/+40
| * Working with merging encoders and using fences.    Nicolas Patry    2023-12-14    2    -2/+247
| * Fixing tests + matmul from MFA    Nicolas Patry    2023-12-13    2    -12/+108
| * Removed MPSMatrix entirely (buggy).    Nicolas Patry    2023-12-13    2    -20/+286
| * Lots of updates including some stack of command buffers.    nicolas    2023-12-12    4    -10/+199
| * Fix gelu for large x    Juarez Bochi    2023-12-06    2    -5/+29
| * Put back affine strided tests    Nicolas Patry    2023-11-30    1    -12/+15
| * Starting to fix some tests.    Nicolas Patry    2023-11-30    12    -250/+470
* | Bump the crate version to 0.3.2. (#1452)    Laurent Mazare    2023-12-17    1    -1/+1
|/
* Moving tests around.    Nicolas Patry    2023-11-20    2    -623/+617
* Fixing cos_f16 test.    Nicolas Patry    2023-11-20    1    -2/+2
* Fix comments.    Nicolas Patry    2023-11-20    2    -33/+13
* Update candle-metal-kernels/Cargo.toml    Nicolas Patry    2023-11-20    1    -1/+1
* Cleanup fixed a few ops removed debugging scaffolding.    Nicolas Patry    2023-11-20    2    -1/+3
* Debugging rope.    Nicolas Patry    2023-11-20    1    -2/+1
* Fixed matmul (display still broken without casting back to CPU first? )    Nicolas Patry    2023-11-20    1    -1/+2