path: root/candle-nn/src
Commit message | Author | Date | Files | Lines
* Support for groups in conv-transpose1d. (#1731) | Laurent Mazare | 2024-02-18 | 1 | -3/+13
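The groups change above (#1731) extends transposed 1d convolutions. A minimal sketch of a grouped transposed convolution; the `ConvTranspose1dConfig` field names and the `conv_transpose1d` constructor are assumed from candle-nn's public API, not shown in this log:

```rust
use candle_core::{DType, Device, Module, Result, Tensor};
use candle_nn::{conv_transpose1d, ConvTranspose1dConfig, VarBuilder, VarMap};

fn main() -> Result<()> {
    let varmap = VarMap::new();
    let vb = VarBuilder::from_varmap(&varmap, DType::F32, &Device::Cpu);
    // groups = 2 splits the 4 input / 4 output channels into two groups.
    let cfg = ConvTranspose1dConfig { groups: 2, ..Default::default() };
    let conv = conv_transpose1d(4, 4, 3, cfg, vb)?;
    let xs = Tensor::rand(0f32, 1f32, (1, 4, 10), &Device::Cpu)?;
    let ys = conv.forward(&xs)?; // (1, 4, 12) with stride 1, no padding
    println!("{:?}", ys.dims());
    Ok(())
}
```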
* Expose the weights and biases in transposed convolutions. (#1727) | Laurent Mazare | 2024-02-18 | 1 | -0/+16
* Expose more conv1d functions/structs. (#1726) | Laurent Mazare | 2024-02-17 | 2 | -2/+19
* feat: add silu activation function (#1706) | OlivierDehaene | 2024-02-14 | 2 | -4/+3
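#1706 adds silu to candle-nn's `Activation` enum. A minimal usage sketch, assuming the variant name and the enum's `Module` impl:

```rust
use candle_core::{Device, Result, Tensor};
use candle_nn::{Activation, Module};

fn main() -> Result<()> {
    let xs = Tensor::new(&[-1.0f32, 0.0, 1.0, 2.0], &Device::Cpu)?;
    // silu(x) = x * sigmoid(x), also known as swish.
    let ys = Activation::Silu.forward(&xs)?;
    println!("{ys}");
    Ok(())
}
```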
* Detach the tensors on batch-norm eval. (#1702) | Laurent Mazare | 2024-02-13 | 1 | -2/+12
* Fix clippy lints for 1.76. (#1682) | Laurent Mazare | 2024-02-08 | 1 | -1/+1
* Enhance pickle to retrieve state_dict with a given key (#1671) | Dilshod Tadjibaev | 2024-02-06 | 1 | -1/+1
* Add `VarBuilder::from_backend` (#1670) | Daniël de Kok | 2024-02-06 | 1 | -8/+17
* Update the Phi model to use the updated architecture. (#1580) | Laurent Mazare | 2024-01-13 | 1 | -0/+1
* Simplify the one-hot implementation, support arbitrary rank. (#1514) | Laurent Mazare | 2024-01-01 | 1 | -181/+38
* Add one-hot/cold encoding (#1489) | Ryan Tate | 2024-01-01 | 2 | -0/+294
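#1489 introduces one-hot encoding and #1514 simplifies it to support arbitrary-rank index tensors. A minimal sketch, assuming the `(indices, depth, on_value, off_value)` signature of `candle_nn::encoding::one_hot`:

```rust
use candle_core::{Device, Result, Tensor};
use candle_nn::encoding::one_hot;

fn main() -> Result<()> {
    let idx = Tensor::new(&[0i64, 2, 1], &Device::Cpu)?;
    // depth 3; on/off values are generic, so "one-cold" is (idx, 3, 0f32, 1f32).
    let oh = one_hot(idx, 3, 1f32, 0f32)?;
    println!("{oh}"); // a (3, 3) one-hot matrix
    Ok(())
}
```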
* Do not implement Module for BatchNorm. (#1513) | Laurent Mazare | 2024-01-01 | 1 | -13/+13
* Small tweaks to batch-norm. (#1505) | Laurent Mazare | 2023-12-30 | 1 | -19/+16
* [Breaking] Add training to batchnorm with exponential moving average (#1504) | nkoppel | 2023-12-30 | 1 | -50/+158
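#1504 gives `BatchNorm` a training mode that maintains an exponential moving average of batch statistics, and #1513 drops the `Module` impl so callers must pass a `train` flag through `ModuleT`. A minimal sketch, assuming the `batch_norm` constructor and `forward_t`:

```rust
use candle_core::{DType, Device, ModuleT, Result, Tensor};
use candle_nn::{batch_norm, BatchNormConfig, VarBuilder, VarMap};

fn main() -> Result<()> {
    let varmap = VarMap::new();
    let vb = VarBuilder::from_varmap(&varmap, DType::F32, &Device::Cpu);
    let bn = batch_norm(4, BatchNormConfig::default(), vb)?;
    let xs = Tensor::rand(0f32, 1f32, (8, 4), &Device::Cpu)?;
    let _train = bn.forward_t(&xs, true)?; // updates the running stats
    let _eval = bn.forward_t(&xs, false)?; // uses the stored running stats
    Ok(())
}
```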
* Merge pull request #1318 from huggingface/metal4 | Nicolas Patry | 2023-12-20 | 1 | -0/+41
|\
| * Clippy pass. | Nicolas Patry | 2023-12-18 | 1 | -3/+3
| * Addressing a lot of comments. | Nicolas Patry | 2023-12-15 | 1 | -1/+2
| * Remove `unwrap()`. | Nicolas Patry | 2023-12-15 | 1 | -2/+2
| * Renamed all kernel names. | Nicolas Patry | 2023-12-15 | 1 | -3/+3
| * Fixing softmax. | Nicolas Patry | 2023-12-15 | 1 | -1/+1
| * Working with merging encoders and using fences. | Nicolas Patry | 2023-12-14 | 1 | -2/+0
| * Lots of updates including some stack of command buffers. | nicolas | 2023-12-12 | 1 | -1/+3
| * Starting to fix some tests. | Nicolas Patry | 2023-11-30 | 1 | -0/+40
* | Fix a couple typos (#1451) | Laurent Mazare | 2023-12-17 | 1 | -1/+1
* | Expose AdamW parameters (#1449) | Dave Lage | 2023-12-16 | 1 | -0/+8
* | Speedup ShardedSafeTensors to load Tensors with default hints (#1384) | YiiSh | 2023-12-14 | 1 | -1/+7
* | Another prelu bugfix. (#1407) | Laurent Mazare | 2023-12-06 | 1 | -1/+1
* | Use the proper broadcasting for prelu. (#1406) | Laurent Mazare | 2023-12-05 | 1 | -5/+16
* | Add the prelu layer. (#1402) | Laurent Mazare | 2023-12-03 | 3 | -4/+51
|/
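Among the commits above, #1449 exposes AdamW's hyper-parameters. A minimal sketch, assuming `ParamsAdamW`'s field names and the `Optimizer` trait from candle-nn:

```rust
use candle_core::{DType, Device, Result, Var};
use candle_nn::{AdamW, Optimizer, ParamsAdamW};

fn main() -> Result<()> {
    let w = Var::zeros((2, 2), DType::F32, &Device::Cpu)?;
    let params = ParamsAdamW {
        lr: 1e-4,
        weight_decay: 0.01,
        ..Default::default()
    };
    let mut opt = AdamW::new(vec![w.clone()], params)?;
    // A toy step: minimize sum(w^2); backward_step computes grads and updates w.
    let loss = w.as_tensor().sqr()?.sum_all()?;
    opt.backward_step(&loss)?;
    Ok(())
}
```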
* Add support to UL2 model family (#1300) | Juarez Bochi | 2023-11-09 | 1 | -1/+0
* Add weight and bias functions to LayerNorm (#1306) | jwnz | 2023-11-09 | 1 | -0/+8
* Transposed conv1d in candle-nn. (#1252) | Laurent Mazare | 2023-11-03 | 1 | -0/+94
* Add the swiglu activation from the chatglm PR. (#1246) | Laurent Mazare | 2023-11-02 | 2 | -0/+7
* Add hard-sigmoid and hard-swish activations (#1244) | jamjamjon | 2023-11-02 | 2 | -0/+9
* Add support for the marian base model. (#1221) | Laurent Mazare | 2023-10-30 | 1 | -0/+2
* Allow for different behavior between training and eval (#1213) | Laurent Mazare | 2023-10-29 | 3 | -2/+43
* Add the relu2 and relu6 activations. (#1201) | Laurent Mazare | 2023-10-27 | 1 | -0/+4
* Add fuse-conv-bn method for Conv2d (#1196) | jamjamjon | 2023-10-27 | 2 | -0/+25
* Expose the fields from batch-norm. (#1176) | Laurent Mazare | 2023-10-25 | 1 | -2/+12
* Add Binary Cross Entropy With Logit Loss to nn crate (#1157) | Ogundepo Odunayo | 2023-10-23 | 1 | -0/+22
* Make func cloneable. (#1137) | Laurent Mazare | 2023-10-20 | 2 | -6/+8
* Add the sequential layer. (#1136) | Laurent Mazare | 2023-10-20 | 2 | -0/+64
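#1136 adds a sequential container and #1137 makes the `Func` closure wrapper cloneable. A minimal sketch, assuming `seq()`, `add`, and `add_fn` from candle-nn:

```rust
use candle_core::{Device, Result, Tensor};
use candle_nn::{seq, Activation, Module};

fn main() -> Result<()> {
    // Chain a named layer and an ad-hoc closure into one Module.
    let model = seq()
        .add(Activation::Relu)
        .add_fn(|xs| xs * 2.0);
    let xs = Tensor::new(&[-1f32, 1.0], &Device::Cpu)?;
    let ys = model.forward(&xs)?; // relu then scale: [0, 2]
    println!("{ys}");
    Ok(())
}
```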
* Experiment with resnet (#1128) | Laurent Mazare | 2023-10-19 | 1 | -0/+9
* feat: add pth varbuilder (#1108) | OlivierDehaene | 2023-10-16 | 1 | -0/+41
* Only optimize float tensors. (#1069) | Laurent Mazare | 2023-10-10 | 1 | -0/+5
* More general seq forward functions for RNNs. (#1050) | Laurent Mazare | 2023-10-07 | 1 | -27/+25
* Use AsRef<str> for set_one. (#1033) | Laurent Mazare | 2023-10-05 | 1 | -1/+1
* Bump the version to 0.3.0. (#1014) | Laurent Mazare | 2023-10-01 | 1 | -20/+0
* Use a silu activation in mistral. (#991) | Laurent Mazare | 2023-09-29 | 1 | -0/+4
* Use the gelu-erf activation. (#969) | Laurent Mazare | 2023-09-26 | 1 | -3/+1