path: root/candle-core
Commit message  Author  Date  Files  Lines (-/+)
* Fix the block size for some cuda kernels. (#1767)  Laurent Mazare  2024-02-27  2  -45/+15
* Cuda kernel for dequantizing q8k. (#1760)  Laurent Mazare  2024-02-26  2  -22/+20
* Cuda acceleration for quantized model. (#1754)  Laurent Mazare  2024-02-25  8  -69/+458
* Support for attention bias in gemma + refactor things a bit. (#1744)  Laurent Mazare  2024-02-22  1  -46/+28
* Add grads for interpolate1d (#1742)  Kirpal Grewal  2024-02-22  4  -6/+51
* Add a couple backtraces on cpu errors. (#1738)  Laurent Mazare  2024-02-20  1  -3/+3
* Bugfix for conv-transpose1d (#1734)  Laurent Mazare  2024-02-19  2  -0/+15
* Support for groups in conv-transpose1d. (#1731)  Laurent Mazare  2024-02-18  3  -19/+43
* Fix float unpickling. (#1730)  Laurent Mazare  2024-02-18  1  -2/+5
* Module implementation for options. (#1728)  Laurent Mazare  2024-02-18  1  -0/+9
* feat: add silu activation function (#1706)  OlivierDehaene  2024-02-14  8  -0/+169
* Qmetal tweaks (#1704)  Laurent Mazare  2024-02-13  3  -100/+141
* Fixing quantized llama demo on metal. (#1703)  Nicolas Patry  2024-02-13  3  -0/+19
* Detach the tensors on batch-norm eval. (#1702)  Laurent Mazare  2024-02-13  3  -4/+8
* ConvTranspose1d cuda support. (#1697)  Laurent Mazare  2024-02-12  2  -16/+66
* Support defaultdict in PyTorch checkpoints. (#1696)  Laurent Mazare  2024-02-12  1  -2/+4
* Pickle support: dig within the _rebuild_parameter calls. (#1681)  Laurent Mazare  2024-02-08  1  -0/+7
* Add support for loading Fortran contiguous tensors (#1672)  Dilshod Tadjibaev  2024-02-07  4  -3/+61
* Enhance pickle to retrieve state_dict with a given key (#1671)  Dilshod Tadjibaev  2024-02-06  5  -8/+60
* Fix rustfmt. (#1669)  Laurent Mazare  2024-02-06  1  -1/+1
* Fix clippy lints. (#1667)  Laurent Mazare  2024-02-06  1  -4/+5
* Fix: pth files don't load on Windows (#1661)  Roma Klapaukh  2024-02-06  4  -3/+15
* add roll function to tensor (#1666)  Jiayu Liu  2024-02-06  1  -0/+28
* Merge branch 'main' into ivarflakstad/metal-prng  Ivar Flakstad  2024-01-17  8  -352/+1024
|\
| * Quantized GGUF style (#1523)  Nicolas Patry  2024-01-17  7  -351/+1023
| * Expose the ndarray trait. (#1586)  Laurent Mazare  2024-01-14  1  -1/+1
* | Update metal random kernel and set_seed method  Ivar Flakstad  2024-01-17  1  -20/+13
* | Seed should be updated by random kernel result.  Ivar Flakstad  2024-01-15  1  -7/+28
* | Merge branch 'main' into ivarflakstad/metal-prng  Ivar Flakstad  2024-01-14  7  -4/+143
|\|
| * Add the pow operator. (#1583)  Laurent Mazare  2024-01-13  2  -3/+25
| * Fix format. (#1576)  Nicolas Patry  2024-01-12  1  -1/+5
| * Metal: Activate bfloat affine and add benchmark (#1543)  ivarflakstad  2024-01-12  4  -1/+47
| * Metal: f16 and bf16 where_cond + benchmark (#1545)  ivarflakstad  2024-01-12  4  -1/+67
* | Merge branch 'main' into ivarflakstad/metal-prng  Ivar Flakstad  2024-01-12  10  -219/+252
|\|
| * Bugfix for dequantizing q5k layers. (#1569)  Laurent Mazare  2024-01-11  2  -5/+5
| * feat(bf16): add cast support + tests for cast + bin ops (#1524)  Kyle McCarthy  2024-01-11  2  -3/+52
| * Seperate benchmarks by enabled features (#1538)  ivarflakstad  2024-01-11  4  -13/+82
| * Add a dequantize command to tensor-tools. (#1565)  Laurent Mazare  2024-01-11  1  -1/+24
| * Add relu kernel for metal (#1488)  Juarez Bochi  2024-01-10  1  -0/+4
| * Handle start-offset when loading a tensor from a pickle file. (#1546)  Laurent Mazare  2024-01-08  1  -3/+11
| * Simpler repro for the neon optimization issue + bugfix (#1544)  Laurent Mazare  2024-01-07  2  -168/+97
| * Simplifying our internal cargo dependencies. (#1529)  Nicolas Patry  2024-01-07  1  -2/+2
* | Updated feature separated benchmarks  Ivar Flakstad  2024-01-09  4  -26/+14
* | Merge branch 'ivarflakstad/seperate-benchmarks-by-feature' into ivarflakstad/...  Ivar Flakstad  2024-01-09  4  -11/+66
|\ \
| * | Improve benchmarks layout  Ivar Flakstad  2024-01-09  4  -6/+9
| * | Avoid some unnecessary returns.  Laurent  2024-01-08  1  -4/+4
| * | Remove allow pragma  Ivar Flakstad  2024-01-08  2  -6/+2
| * | Use cfg to seperate benchmark results based on features  Ivar Flakstad  2024-01-07  2  -8/+64
| |/
* | Merge branch 'main' into ivarflakstad/metal-prng  Ivar Flakstad  2024-01-07  1  -34/+144
|\|
| * Adding bfloat16 support for the cast kernels. (#1520)  Nicolas Patry  2024-01-04  1  -0/+4