path: root/candle-nn
| Commit message | Author | Date | Files | Lines |
|---|---|---|---|---|
| Add the AdamW optimizer. (#307) | Laurent Mazare | 2023-08-02 | 5 | -18/+215 |
| Update the repo location. (#305) | Laurent Mazare | 2023-08-02 | 1 | -8/+7 |
| Add some missing readme files. (#304) | Laurent Mazare | 2023-08-02 | 1 | -0/+1 |
| Add version numbers for all the candle crates (#303) | Laurent Mazare | 2023-08-02 | 1 | -1/+1 |
| Rename the candle crate to candle-core (#301) | Laurent Mazare | 2023-08-02 | 1 | -1/+1 |
| Use index-select for the embeddings as it supports backprop. (#298) | Laurent Mazare | 2023-08-01 | 1 | -1/+1 |
| Llama more training (#297) | Laurent Mazare | 2023-08-01 | 7 | -21/+197 |
| Add some batcher variants that handle errors. (#294) | Laurent Mazare | 2023-08-01 | 1 | -0/+75 |
| Add the batcher. (#293) | Laurent Mazare | 2023-08-01 | 2 | -0/+97 |
| Add the cross-entropy loss. (#287) | Laurent Mazare | 2023-07-31 | 2 | -1/+21 |
| Make the nll op closer to the pytorch version + add a test. (#286) | Laurent Mazare | 2023-07-31 | 2 | -2/+53 |
| Improve the mnist training example. (#276) | Laurent Mazare | 2023-07-29 | 2 | -4/+37 |
| More mnist training. (#275) | Laurent Mazare | 2023-07-29 | 1 | -0/+1 |
| Softmax numerical stability. (#267) | Laurent Mazare | 2023-07-28 | 2 | -0/+86 |
| Added comment about offsets. | Nicolas Patry | 2023-07-27 | 1 | -0/+3 |
| Fixing slice errors + comments. | Nicolas Patry | 2023-07-27 | 1 | -3/+22 |
| Removing inner dependency on safetensors. | Nicolas Patry | 2023-07-27 | 1 | -4/+6 |
| TP sharding v2 | Nicolas Patry | 2023-07-27 | 2 | -5/+54 |
| Move some shared functions to the nn module. (#221) | Laurent Mazare | 2023-07-22 | 3 | -0/+20 |
| Rename the .r functions to .dims so as to be a bit more explicit. (#220) | Laurent Mazare | 2023-07-22 | 2 | -2/+2 |
| [Proposal] Remove SafeTensor wrapper (allows finer control for users). | Nicolas Patry | 2023-07-19 | 1 | -2/+6 |
| Vision dataset (#179) | Laurent Mazare | 2023-07-16 | 4 | -0/+140 |
| Centralize the dependency versions and inherit them. (#177) | Laurent Mazare | 2023-07-16 | 1 | -4/+4 |
| Removing cuda default. | Nicolas Patry | 2023-07-14 | 1 | -1/+1 |
| Add backtrace information to errors where relevant. (#166) | Laurent Mazare | 2023-07-14 | 1 | -7/+17 |
| Simplify the parameters used by sum and sum_keepdim. (#165) | Laurent Mazare | 2023-07-14 | 2 | -4/+4 |
| Use the same default as pytorch for sum. (#164) | Laurent Mazare | 2023-07-13 | 2 | -4/+4 |
| Add the pytorch version of the linear regression as a comment. (#163) | Laurent Mazare | 2023-07-13 | 1 | -0/+24 |
| Add the gradient for reduce-sum. (#162) | Laurent Mazare | 2023-07-13 | 2 | -3/+27 |
| Add the SGD optimizer (#160) | Laurent Mazare | 2023-07-13 | 3 | -0/+68 |
| Add some documentation and test to the linear layer. (#151) | Laurent Mazare | 2023-07-12 | 4 | -0/+51 |
| Cleanup the main crate error and add a couple dedicated ones (#142) | Laurent Mazare | 2023-07-12 | 1 | -2/+3 |
| Allow for lazy loading of npz files, use it in llama to reduce memory usage i... | Laurent Mazare | 2023-07-11 | 1 | -2/+27 |
| Resurrect the llama npy support. (#140) | Laurent Mazare | 2023-07-11 | 1 | -28/+55 |
| Sketch the tensor initialization module. (#134) | Laurent Mazare | 2023-07-11 | 2 | -6/+116 |
| VarBuilder path creation (#131) | Laurent Mazare | 2023-07-10 | 1 | -19/+84 |
| Move the var-builder in a central place. (#130) | Laurent Mazare | 2023-07-10 | 2 | -0/+61 |
| Add some layer-norm tests. (#121) | Laurent Mazare | 2023-07-10 | 1 | -0/+43 |
| Move the conv1d layer to candle_nn. (#117) | Laurent Mazare | 2023-07-10 | 2 | -0/+51 |
| [nn] Move the Embedding and Activation parts. (#116) | Laurent Mazare | 2023-07-10 | 3 | -0/+53 |
| Sketch the candle-nn crate. (#115) | Laurent Mazare | 2023-07-10 | 4 | -0/+88 |
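The "Softmax numerical stability" commit (#267) refers to the standard max-subtraction trick: shifting the logits by their maximum before exponentiating so that large inputs cannot overflow. This is a minimal plain-Rust sketch of that trick for a 1-D slice, not candle's actual tensor implementation:

```rust
// Numerically stable softmax over a 1-D slice: subtracting the max before
// calling exp() keeps every exponent <= 0, so nothing overflows to infinity.
fn softmax(logits: &[f64]) -> Vec<f64> {
    let max = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

fn main() {
    // A naive exp(1000.0) would overflow to infinity; the shifted version
    // computes exp(0.0) instead and yields a valid distribution.
    let probs = softmax(&[1000.0, 1000.0]);
    println!("{probs:?}"); // each entry is 0.5
}
```

The shift leaves the result unchanged because softmax is invariant under adding a constant to all logits: the factor `exp(-max)` cancels between numerator and denominator.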
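Commit #307 adds the AdamW optimizer, i.e. Adam with decoupled weight decay (the decay is applied directly to the weight rather than folded into the gradient). As a rough illustration only, here is a plain-Rust sketch of one AdamW update for a single scalar parameter; the struct, hyperparameter defaults, and `step` signature are assumptions for this sketch, not candle-nn's API:

```rust
// Hypothetical scalar AdamW: first/second moment estimates with bias
// correction, plus weight decay applied directly to the parameter.
struct AdamW {
    lr: f64, beta1: f64, beta2: f64, eps: f64, weight_decay: f64,
    m: f64, v: f64, t: u32,
}

impl AdamW {
    fn new(lr: f64) -> Self {
        Self { lr, beta1: 0.9, beta2: 0.999, eps: 1e-8,
               weight_decay: 0.01, m: 0.0, v: 0.0, t: 0 }
    }

    fn step(&mut self, param: &mut f64, grad: f64) {
        self.t += 1;
        // Exponential moving averages of the gradient and squared gradient.
        self.m = self.beta1 * self.m + (1.0 - self.beta1) * grad;
        self.v = self.beta2 * self.v + (1.0 - self.beta2) * grad * grad;
        // Bias correction compensates for the zero-initialized moments.
        let m_hat = self.m / (1.0 - self.beta1.powi(self.t as i32));
        let v_hat = self.v / (1.0 - self.beta2.powi(self.t as i32));
        // Decoupled decay: weight_decay * param is added outside the
        // adaptive term, unlike L2 regularization folded into grad.
        *param -= self.lr * (m_hat / (v_hat.sqrt() + self.eps)
                             + self.weight_decay * *param);
    }
}

fn main() {
    // Minimize f(x) = (x - 3)^2; the gradient is 2 * (x - 3).
    let mut x = 0.0;
    let mut opt = AdamW::new(0.05);
    for _ in 0..1000 {
        let grad = 2.0 * (x - 3.0);
        opt.step(&mut x, grad);
    }
    println!("x = {x}"); // settles near 3.0 (weight decay pulls it slightly below)
}
```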