path: root/candle-core
Commit message  Author  Age  Files  Lines
* Add the pow operator. (#1583)  Laurent Mazare  2024-01-13  2  -3/+25
* Fix format. (#1576)  Nicolas Patry  2024-01-12  1  -1/+5
* Metal: Activate bfloat affine and add benchmark (#1543)  ivarflakstad  2024-01-12  4  -1/+47
* Metal: f16 and bf16 where_cond + benchmark (#1545)  ivarflakstad  2024-01-12  4  -1/+67
* Bugfix for dequantizing q5k layers. (#1569)  Laurent Mazare  2024-01-11  2  -5/+5
* feat(bf16): add cast support + tests for cast + bin ops (#1524)  Kyle McCarthy  2024-01-11  2  -3/+52
* Seperate benchmarks by enabled features (#1538)  ivarflakstad  2024-01-11  4  -13/+82
* Add a dequantize command to tensor-tools. (#1565)  Laurent Mazare  2024-01-11  1  -1/+24
* Add relu kernel for metal (#1488)  Juarez Bochi  2024-01-10  1  -0/+4
* Handle start-offset when loading a tensor from a pickle file. (#1546)  Laurent Mazare  2024-01-08  1  -3/+11
* Simpler repro for the neon optimization issue + bugfix (#1544)  Laurent Mazare  2024-01-07  2  -168/+97
* Simplifying our internal cargo dependencies. (#1529)  Nicolas Patry  2024-01-07  1  -2/+2
* Adding bfloat16 support for the cast kernels. (#1520)  Nicolas Patry  2024-01-04  1  -0/+4
* Metal: support unary abs (#1503)  Gonzalo  2023-12-30  1  -0/+4
* Metal: more u8/u32 (#1502)  Gonzalo  2023-12-29  1  -0/+51
* Metal: i64 basic support (#1495)  Gonzalo  2023-12-29  1  -0/+35
* Merge pull request #1496 from bayedieng/unary  Nicolas Patry  2023-12-29  1  -0/+2
|\
| * fix bad pattern matching and function name  Baye Dieng  2023-12-29  1  -2/+2
| * add urecip op to metal backend  Baye Dieng  2023-12-28  1  -0/+2
* | Merge pull request #1491 from mimiquate/metal-errors  Nicolas Patry  2023-12-29  1  -28/+42
|\ \
| |/
|/|
| * fixes error message  Gonzalo  2023-12-28  1  -1/+1
| * cargo fmt  Gonzalo  2023-12-28  1  -7/+21
| * Improves metal's not implemented error messages  Gonzalo  2023-12-28  1  -27/+27
* | Fix lints for clippy 1.75. (#1494)  Laurent Mazare  2023-12-28  1  -17/+17
|/
* Bump the crate version to 0.3.3. (#1490)  Laurent Mazare  2023-12-28  1  -2/+2
* Adding upsample_nearest_2d.  Nicolas Patry  2023-12-25  1  -2/+33
* Merge pull request #1461 from huggingface/metal-conv  Nicolas Patry  2023-12-25  1  -10/+137
|\
| * Fixing matmul for convolutions.  Nicolas Patry  2023-12-25  1  -1/+2
| * Adding the convolutions (1d + 2d) to candle on metal.  Nicolas Patry  2023-12-21  1  -10/+136
* | Fix the quantized mistral example. (#1478)  Laurent Mazare  2023-12-25  1  -1/+1
* | Validate the kernel size in pooling ops. (#1473)  Laurent Mazare  2023-12-23  1  -12/+16
* | Sketch the minimal mamba example. (#1465)  Laurent Mazare  2023-12-22  1  -1/+0
|/
* Merge pull request #1318 from huggingface/metal4  Nicolas Patry  2023-12-20  5  -385/+917
|\
| * Optimizing decode matmul (Phi at 28tok/s on M3).  Nicolas Patry  2023-12-20  2  -0/+50
| * Clippy pass.  Nicolas Patry  2023-12-18  1  -10/+8
| * Remove print.  Nicolas Patry  2023-12-18  1  -1/+0
| * Missing cast.  Nicolas Patry  2023-12-18  1  -0/+2
| * Index add.  Nicolas Patry  2023-12-18  1  -7/+42
| * Scatter add.  Nicolas Patry  2023-12-18  1  -10/+50
| * Adding gather op.  Nicolas Patry  2023-12-17  1  -2/+32
| * Adding CMP  Nicolas Patry  2023-12-17  1  -72/+116
| * Implement randn (CPU-> device)  Nicolas Patry  2023-12-17  1  -4/+3
| * Finish reduce kernels.  Nicolas Patry  2023-12-17  2  -24/+31
| * Addressing a lot of comments.  Nicolas Patry  2023-12-15  1  -8/+15
| * Remove `unwrap()`.  Nicolas Patry  2023-12-15  1  -46/+75
| * Renamed all kernel names.  Nicolas Patry  2023-12-15  1  -17/+17
| * More cleanup.  Nicolas Patry  2023-12-15  2  -6/+1
| * Adding a bunch of docs !  Nicolas Patry  2023-12-15  1  -53/+105
| * cleanup.  Nicolas Patry  2023-12-15  1  -27/+4
| * Fixing softmax.  Nicolas Patry  2023-12-15  1  -4/+6