| | Commit message | Author | Date | Files | Lines |
|---|---|---|---|---|---|
* | Add the AdamW optimizer. (#307) | Laurent Mazare | 2023-08-02 | 5 | -18/+215 |
* | Update the repo location. (#305) | Laurent Mazare | 2023-08-02 | 1 | -8/+7 |
* | Add some missing readme files. (#304) | Laurent Mazare | 2023-08-02 | 1 | -0/+1 |
* | Add version numbers for all the candle crates (#303) | Laurent Mazare | 2023-08-02 | 1 | -1/+1 |
* | Rename the candle crate to candle-core (#301) | Laurent Mazare | 2023-08-02 | 1 | -1/+1 |
* | Use index-select for the embeddings as it supports backprop. (#298) | Laurent Mazare | 2023-08-01 | 1 | -1/+1 |
* | Llama more training (#297) | Laurent Mazare | 2023-08-01 | 7 | -21/+197 |
* | Add some batcher variants that handle errors. (#294) | Laurent Mazare | 2023-08-01 | 1 | -0/+75 |
* | Add the batcher. (#293) | Laurent Mazare | 2023-08-01 | 2 | -0/+97 |
* | Add the cross-entropy loss. (#287) | Laurent Mazare | 2023-07-31 | 2 | -1/+21 |
* | Make the nll op closer to the pytorch version + add a test. (#286) | Laurent Mazare | 2023-07-31 | 2 | -2/+53 |
* | Improve the mnist training example. (#276) | Laurent Mazare | 2023-07-29 | 2 | -4/+37 |
* | More mnist training. (#275) | Laurent Mazare | 2023-07-29 | 1 | -0/+1 |
* | Softmax numerical stability. (#267) | Laurent Mazare | 2023-07-28 | 2 | -0/+86 |
* | Added comment about offsets. | Nicolas Patry | 2023-07-27 | 1 | -0/+3 |
* | Fixing slice errors + comments. | Nicolas Patry | 2023-07-27 | 1 | -3/+22 |
* | Removing inner dependency on safetensors. | Nicolas Patry | 2023-07-27 | 1 | -4/+6 |
* | TP sharding v2 | Nicolas Patry | 2023-07-27 | 2 | -5/+54 |
* | Move some shared functions to the nn module. (#221) | Laurent Mazare | 2023-07-22 | 3 | -0/+20 |
* | Rename the .r functions to .dims so as to be a bit more explicit. (#220) | Laurent Mazare | 2023-07-22 | 2 | -2/+2 |
* | [Proposal] Remove SafeTensor wrapper (allows finer control for users). | Nicolas Patry | 2023-07-19 | 1 | -2/+6 |
* | Vision dataset (#179) | Laurent Mazare | 2023-07-16 | 4 | -0/+140 |
* | Centralize the dependency versions and inherit them. (#177) | Laurent Mazare | 2023-07-16 | 1 | -4/+4 |
* | Removing cuda default. | Nicolas Patry | 2023-07-14 | 1 | -1/+1 |
* | Add backtrace information to errors where relevant. (#166) | Laurent Mazare | 2023-07-14 | 1 | -7/+17 |
* | Simplify the parameters used by sum and sum_keepdim. (#165) | Laurent Mazare | 2023-07-14 | 2 | -4/+4 |
* | Use the same default as pytorch for sum. (#164) | Laurent Mazare | 2023-07-13 | 2 | -4/+4 |
* | Add the pytorch version of the linear regression as a comment. (#163) | Laurent Mazare | 2023-07-13 | 1 | -0/+24 |
* | Add the gradient for reduce-sum. (#162) | Laurent Mazare | 2023-07-13 | 2 | -3/+27 |
* | Add the SGD optimizer (#160) | Laurent Mazare | 2023-07-13 | 3 | -0/+68 |
* | Add some documentation and test to the linear layer. (#151) | Laurent Mazare | 2023-07-12 | 4 | -0/+51 |
* | Cleanup the main crate error and add a couple dedicated ones (#142) | Laurent Mazare | 2023-07-12 | 1 | -2/+3 |
* | Allow for lazy loading of npz files, use it in llama to reduce memory usage i... | Laurent Mazare | 2023-07-11 | 1 | -2/+27 |
* | Resurrect the llama npy support. (#140) | Laurent Mazare | 2023-07-11 | 1 | -28/+55 |
* | Sketch the tensor initialization module. (#134) | Laurent Mazare | 2023-07-11 | 2 | -6/+116 |
* | VarBuilder path creation (#131) | Laurent Mazare | 2023-07-10 | 1 | -19/+84 |
* | Move the var-builder in a central place. (#130) | Laurent Mazare | 2023-07-10 | 2 | -0/+61 |
* | Add some layer-norm tests. (#121) | Laurent Mazare | 2023-07-10 | 1 | -0/+43 |
* | Move the conv1d layer to candle_nn. (#117) | Laurent Mazare | 2023-07-10 | 2 | -0/+51 |
* | [nn] Move the Embedding and Activation parts. (#116) | Laurent Mazare | 2023-07-10 | 3 | -0/+53 |
* | Sketch the candle-nn crate. (#115) | Laurent Mazare | 2023-07-10 | 4 | -0/+88 |
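Several commits above touch numerics in candle's training utilities, notably the softmax numerical-stability change (#267). As an illustrative sketch only (Python here, not candle's actual Rust implementation), this is the standard max-subtraction trick such a fix typically applies:

```python
import math

def softmax(xs):
    """Numerically stable softmax.

    Subtracting the maximum before exponentiation avoids overflow
    for large inputs and does not change the result, since
    exp(x - m) / sum(exp(x - m)) == exp(x) / sum(exp(x)).
    """
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# With logits this large, a naive exp() overflows to inf;
# the shifted version stays finite and sums to 1.
print(softmax([1000.0, 1001.0, 1002.0]))
```

The same shift is what deep-learning libraries generally fold into their softmax and cross-entropy kernels, which is presumably the motivation behind #267.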