path: root/candle-core
Commit message | Author | Date | Files | Lines (-/+)
* Really unique identifier for metal device ids. (#1932) | Laurent Mazare | 2024-03-25 | 2 | -9/+37
* Preliminary support for inplace ops. (#1921) | Laurent Mazare | 2024-03-23 | 5 | -2/+215
* Backwards for ConvTranspose2D (#1910) | Kirpal Grewal | 2024-03-23 | 2 | -10/+189
* Add support for strided index-select on Metal (#1909) | Thomas Santerre | 2024-03-22 | 1 | -8/+10
* Add the alloc_uninit function. (#1901) | Laurent Mazare | 2024-03-22 | 9 | -16/+154
* Add support for conv_transpose2d on Metal backend (#1903) | Thomas Santerre | 2024-03-21 | 5 | -76/+177
* Async tensor copying. (#1900) | Laurent Mazare | 2024-03-21 | 7 | -4/+59
* Prepare for the custom-op extension. (#1892) | Laurent Mazare | 2024-03-21 | 5 | -247/+256
* Cuda backend optimization (#1886) | Laurent Mazare | 2024-03-20 | 1 | -12/+47
* Minor cleanup. (#1885) | Laurent Mazare | 2024-03-20 | 1 | -4/+0
* Avoid copying the data on squeeze and unsqueeze. (#1884) | Laurent Mazare | 2024-03-20 | 2 | -3/+42
* Add support for conv_transpose1d for metal backend (#1874) | Thomas Santerre | 2024-03-19 | 2 | -10/+47
* Add avg_pool2d metal implementation for the metal backend (#1869) | Thomas Santerre | 2024-03-18 | 2 | -7/+42
* Add support for max_pool2d for Metal backend (#1863) | Thomas Santerre | 2024-03-18 | 2 | -6/+41
* add test for index add and add missing match statements (#1862) | Thomas Santerre | 2024-03-17 | 1 | -1/+21
* add support for casting between all datatypes (#1860) | Thomas Santerre | 2024-03-17 | 1 | -7/+20
* Optimize the cat operation on contiguous tensors (#1855) | Laurent Mazare | 2024-03-17 | 14 | -206/+618
* Add support for index u8/i64 and input f16/bf16 scatter-add on metal (#1849) | Thomas Santerre | 2024-03-17 | 1 | -0/+8
* Implement the error trait for DTypeParseError. (#1852) | Laurent Mazare | 2024-03-15 | 1 | -2/+10
* Properly handle the batch dimension in cuda quantized matmul. (#1832) | Laurent Mazare | 2024-03-10 | 1 | -1/+1
* Fix dequantization. (#1823) | Laurent Mazare | 2024-03-08 | 1 | -1/+1
* Fast CPU kernel for transposed 1d convolutions. (#1822) | Laurent Mazare | 2024-03-08 | 2 | -21/+99
* Metal random-generation bug fixes (#1811) | Niklas Hallqvist | 2024-03-08 | 2 | -1/+26
* Expose more printer options. (#1817) | Laurent Mazare | 2024-03-08 | 1 | -5/+30
* Expose a couple layout methods. (#1816) | Laurent Mazare | 2024-03-08 | 1 | -3/+3
* Improve metal buffer usage (#1807) | ivarflakstad | 2024-03-07 | 2 | -86/+137
* Add a cuda kernel for dequantizing q8_0. (#1804) | Laurent Mazare | 2024-03-05 | 1 | -4/+0
* Tweaks to run metavoice on metal (#1792) | Laurent Mazare | 2024-03-03 | 2 | -0/+6
* Handle Q5_0 and Q5_1 quants in cuda. | laurent | 2024-02-29 | 2 | -24/+38
* Fix the block size for some cuda kernels. (#1767) | Laurent Mazare | 2024-02-27 | 2 | -45/+15
* Cuda kernel for dequantizing q8k. (#1760) | Laurent Mazare | 2024-02-26 | 2 | -22/+20
* Cuda acceleration for quantized model. (#1754) | Laurent Mazare | 2024-02-25 | 8 | -69/+458
* Support for attention bias in gemma + refactor things a bit. (#1744) | Laurent Mazare | 2024-02-22 | 1 | -46/+28
* Add grads for interpolate1d (#1742) | Kirpal Grewal | 2024-02-22 | 4 | -6/+51
* Add a couple backtraces on cpu errors. (#1738) | Laurent Mazare | 2024-02-20 | 1 | -3/+3
* Bugfix for conv-transpose1d (#1734) | Laurent Mazare | 2024-02-19 | 2 | -0/+15
* Support for groups in conv-transpose1d. (#1731) | Laurent Mazare | 2024-02-18 | 3 | -19/+43
* Fix float unpickling. (#1730) | Laurent Mazare | 2024-02-18 | 1 | -2/+5
* Module implementation for options. (#1728) | Laurent Mazare | 2024-02-18 | 1 | -0/+9
* feat: add silu activation function (#1706) | OlivierDehaene | 2024-02-14 | 8 | -0/+169
* Qmetal tweaks (#1704) | Laurent Mazare | 2024-02-13 | 3 | -100/+141
* Fixing quantized llama demo on metal. (#1703) | Nicolas Patry | 2024-02-13 | 3 | -0/+19
* Detach the tensors on batch-norm eval. (#1702) | Laurent Mazare | 2024-02-13 | 3 | -4/+8
* ConvTranspose1d cuda support. (#1697) | Laurent Mazare | 2024-02-12 | 2 | -16/+66
* Support defaultdict in PyTorch checkpoints. (#1696) | Laurent Mazare | 2024-02-12 | 1 | -2/+4
* Pickle support: dig within the _rebuild_parameter calls. (#1681) | Laurent Mazare | 2024-02-08 | 1 | -0/+7
* Add support for loading Fortran contiguous tensors (#1672) | Dilshod Tadjibaev | 2024-02-07 | 4 | -3/+61
* Enhance pickle to retrieve state_dict with a given key (#1671) | Dilshod Tadjibaev | 2024-02-06 | 5 | -8/+60
* Fix rustfmt. (#1669) | Laurent Mazare | 2024-02-06 | 1 | -1/+1
* Fix clippy lints. (#1667) | Laurent Mazare | 2024-02-06 | 1 | -4/+5