Commit messages
* Remove the parameters for the Wuerstchen layer-norm.
* Fixes.
* More fixes (including conv-transpose2d).
* More fixes.
* Again more fixes.
* Add the embed mapper convolutions.
* Add the replication pad layer.
* Use the replication-pad op.
* Tweak a todo.
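
The replication-pad op added above can be sketched in plain Rust. This is a minimal 1-D illustration with `Vec<f32>` standing in for a tensor row; the function name and signature are hypothetical, not the crate's actual API:

```rust
// Sketch of a 1-D replication pad: values beyond the edges
// repeat the nearest edge element.
fn replication_pad1d(xs: &[f32], left: usize, right: usize) -> Vec<f32> {
    let mut out = Vec::with_capacity(xs.len() + left + right);
    let first = *xs.first().expect("input must be non-empty");
    let last = *xs.last().expect("input must be non-empty");
    out.extend(std::iter::repeat(first).take(left)); // repeat the left edge
    out.extend_from_slice(xs);                        // copy the original values
    out.extend(std::iter::repeat(last).take(right));  // repeat the right edge
    out
}
```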
* DiffNeXt/unet
* Start adding the vae.
* VAE residual block.
* VAE forward pass.
* Add pixel shuffling.
* Actually use pixel shuffling.
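
The pixel-shuffling op mentioned above rearranges a `(c*r*r, h, w)` tensor into `(c, h*r, w*r)`, trading channels for spatial resolution. A sketch over a flat CHW buffer (hypothetical signature, not the crate's tensor API):

```rust
// Sketch of pixel shuffling with upscale factor `r` on a flat CHW buffer.
fn pixel_shuffle(xs: &[f32], c_out: usize, h: usize, w: usize, r: usize) -> Vec<f32> {
    let (oh, ow) = (h * r, w * r);
    let mut out = vec![0f32; c_out * oh * ow];
    for c in 0..c_out {
        for y in 0..oh {
            for x in 0..ow {
                // The source channel packs the (i, j) sub-pixel offsets.
                let (i, j) = (y % r, x % r);
                let src_c = c * r * r + i * r + j;
                let src = src_c * h * w + (y / r) * w + (x / r);
                out[c * oh * ow + y * ow + x] = xs[src];
            }
        }
    }
    out
}
```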
* Extract t5 out of musicgen
* Add main for t5 module
* Add weight, bias methods to Conv(1|2)
* Add hidden_size method to Embedding
* Expose hidden_size
* TinyViT.
* More TinyViT.
* Add more to the tinyvit backbone.
* Proper padding.
* Plus ViT.
* Add the tiniest vit spec.
* Start processing images.
* Add LayerNorm2d.
* Properly use LayerNorm2d.
* Tweak eps.
* Use LayerNorm on inputs with a rank different from 3.
* Window partitioning.
* Fix a couple todos.
* More todos.
* Hard-code the einsums.
* More padding support.
* Some sizes tweaks.
* Use the hub to get the weights.
* Use a batch matmul.
* Tweaks.
* More fixes.
* Get some predictions to be generated.
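
The LayerNorm2d added above normalizes across the channel dimension at every spatial position of an NCHW tensor, then applies a per-channel scale and shift. A sketch on a flat CHW buffer (names and layout are illustrative, not the crate's implementation):

```rust
// Sketch of LayerNorm2d: per pixel, normalize across channels,
// then scale/shift with per-channel weight and bias. `eps` keeps
// the division numerically stable.
fn layer_norm_2d(xs: &[f32], c: usize, h: usize, w: usize,
                 weight: &[f32], bias: &[f32], eps: f32) -> Vec<f32> {
    let hw = h * w;
    let mut out = vec![0f32; xs.len()];
    for p in 0..hw {
        // Mean and variance across the channel dimension at this pixel.
        let mean = (0..c).map(|ch| xs[ch * hw + p]).sum::<f32>() / c as f32;
        let var = (0..c).map(|ch| (xs[ch * hw + p] - mean).powi(2)).sum::<f32>() / c as f32;
        let inv_std = 1.0 / (var + eps).sqrt();
        for ch in 0..c {
            out[ch * hw + p] = (xs[ch * hw + p] - mean) * inv_std * weight[ch] + bias[ch];
        }
    }
    out
}
```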
* More segment-anything.
* Split the model in multiple files.
* Start adding the transformer.
* Add the attention block.
* Move the MLP Block.
* Use an Arc in the varbuilder rather than an Rc.
* Require the backends to be Send.
* Request Send and Sync.
* Add a custom softmax implementation.
* Add softmaxlastdim to the benchmarks.
* And add a test.
* Support more dtypes.
* Polish the code.
* Use the slow implementation on cuda.
* Add a todo for the cuda kernel.
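
The softmax-last-dim op above can be sketched as a numerically stable softmax over rows of a flat buffer; subtracting the row maximum before exponentiating is the standard overflow guard. The signature is illustrative, not the crate's API:

```rust
// Sketch of softmax over the last dimension: `xs` holds rows of length `dim`.
fn softmax_last_dim(xs: &[f32], dim: usize) -> Vec<f32> {
    let mut out = Vec::with_capacity(xs.len());
    for row in xs.chunks(dim) {
        // Subtract the row max before exponentiating to avoid overflow.
        let max = row.iter().fold(f32::NEG_INFINITY, |a, &b| a.max(b));
        let exps: Vec<f32> = row.iter().map(|&v| (v - max).exp()).collect();
        let sum: f32 = exps.iter().sum();
        out.extend(exps.iter().map(|&e| e / sum));
    }
    out
}
```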
* Musicgen text embeddings.
* Bugfix for layer norm.
* Proper position bias.
* Expose the weights.
* Add a GRU layer.
* Fix the n gate computation.
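
The n-gate computation mentioned above is the easy-to-miss part of a GRU: the reset gate multiplies the *hidden* contribution inside the tanh. A single-unit sketch with scalar weights (hypothetical names, not the crate's layer):

```rust
fn sigmoid(x: f32) -> f32 { 1.0 / (1.0 + (-x).exp()) }

// Sketch of one GRU step; each weight pair is (input weight, hidden weight).
fn gru_cell(x: f32, h: f32, w_z: (f32, f32), w_r: (f32, f32), w_n: (f32, f32)) -> f32 {
    let z = sigmoid(w_z.0 * x + w_z.1 * h);       // update gate
    let r = sigmoid(w_r.0 * x + w_r.1 * h);       // reset gate
    let n = (w_n.0 * x + r * (w_n.1 * h)).tanh(); // new gate: r gates the hidden term
    (1.0 - z) * n + z * h                          // blend new and previous state
}
```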
* Add an LSTM test.
* Clippy.
* Add a dropout layer.
* Add an actual layer.
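
The dropout layer above can be sketched as inverted dropout: at train time, kept values are scaled by 1/(1-p) so inference needs no rescaling. To keep the example deterministic, the mask is caller-supplied rather than sampled (an assumption, not the crate's design):

```rust
// Sketch of inverted dropout over a flat buffer.
fn dropout(xs: &[f32], drop_p: f32, mask: &[bool], train: bool) -> Vec<f32> {
    if !train || drop_p == 0.0 {
        return xs.to_vec(); // inference: dropout is the identity
    }
    let scale = 1.0 / (1.0 - drop_p);
    xs.iter().zip(mask)
        .map(|(&v, &keep)| if keep { v * scale } else { 0.0 })
        .collect()
}
```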
* Add tanh.
* Use tanh in the lstm block.
* Add a test for tanh forward and backward passes.
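
The tanh forward and backward passes tested above follow from d/dx tanh(x) = 1 - tanh(x)², so the gradient can be computed from the forward output alone. A scalar sketch:

```rust
fn tanh_forward(x: f32) -> f32 { x.tanh() }

// `y` is the forward output tanh(x); the chain rule multiplies
// the incoming gradient by (1 - y^2).
fn tanh_backward(grad_out: f32, y: f32) -> f32 {
    grad_out * (1.0 - y * y)
}
```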
* Add the rnn module.
* More LSTM.
* Implement the RNN forward pass.
* More forward pass for LSTM.
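
The LSTM forward pass above can be sketched as a single-unit step with scalar weights (names hypothetical, not the crate's layer): input, forget, and output gates are sigmoids, and the candidate cell state uses tanh.

```rust
fn sigmoid(x: f32) -> f32 { 1.0 / (1.0 + (-x).exp()) }

// Sketch of one LSTM step; each weight pair is (input weight, hidden weight).
// Returns the new (hidden, cell) states.
fn lstm_cell(x: f32, h: f32, c: f32,
             w_i: (f32, f32), w_f: (f32, f32), w_g: (f32, f32), w_o: (f32, f32))
             -> (f32, f32) {
    let i = sigmoid(w_i.0 * x + w_i.1 * h); // input gate
    let f = sigmoid(w_f.0 * x + w_f.1 * h); // forget gate
    let g = (w_g.0 * x + w_g.1 * h).tanh(); // candidate cell state
    let o = sigmoid(w_o.0 * x + w_o.1 * h); // output gate
    let c_next = f * c + i * g;             // blend old cell with candidate
    let h_next = o * c_next.tanh();         // new hidden state
    (h_next, c_next)
}
```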
* Simplify usage of the pool functions.
* Small tweak.
* Attempt at using apply to simplify the convnet definition.
* Add the dilation parameter.
* Restore the basic optimizer example.
* Dilation support in cudnn.
* Use the dilation parameter in the cpu backend.
* More dilation support.
* No support for dilation in transposed convolutions.
* Add dilation to a test.
* Remove a print.
* Helper function.
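
The dilation parameter above makes kernel tap i read input position p + i * dilation, so the effective kernel span grows to (k - 1) * dilation + 1. A 1-D valid-convolution sketch (signature illustrative, not the crate's conv API):

```rust
// Sketch of a 1-D convolution with dilation (no padding, stride 1).
fn conv1d_dilated(xs: &[f32], kernel: &[f32], dilation: usize) -> Vec<f32> {
    // Effective extent of the dilated kernel over the input.
    let span = (kernel.len() - 1) * dilation + 1;
    if xs.len() < span { return Vec::new(); }
    (0..=xs.len() - span)
        .map(|p| kernel.iter().enumerate()
            .map(|(i, &k)| k * xs[p + i * dilation])
            .sum::<f32>())
        .collect()
}
```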
* Preliminary support for SDXL.
* More SDXL support.
* More SDXL.
* Use the proper clip config.
* Querying for existing tensors.
* More robust test.
* VarBuilder cleanup.
* Implement the basic varbuilders.
* Add the sharded code.
* Proper support for tensor sharding.
* EfficientNet.
* Complete the efficientnet implementation.
* Improve group handling.
* Get the efficientnet to work.
* Add some group parameter to convolutions.
* Avoid some unnecessary groups checks.
* Move the tensor convolution bits.
* Proper handling of groups.
* Bump the crate version.
* And add a changelog.
* Some fixes for yolo-v3.
* Use the running stats for inference in the batch-norm layer.
* Get some proper predictions for yolo.
* Avoid the quadratic insertion.
* Move the var-map struct in a separate file.
* Fix some typos.
* Add BatchNormalization.
* More batch-norm.
* Add some validation of the inputs.
* More validation.
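
At inference time, batch-norm (as used in the yolo fix above) normalizes with the running mean and variance accumulated during training rather than batch statistics. A single-channel sketch (names illustrative):

```rust
// Sketch of batch-norm inference for one channel: normalize with the
// running statistics, then apply the learned scale and shift.
fn batch_norm_inference(xs: &[f32], running_mean: f32, running_var: f32,
                        weight: f32, bias: f32, eps: f32) -> Vec<f32> {
    let inv_std = 1.0 / (running_var + eps).sqrt();
    xs.iter().map(|&v| (v - running_mean) * inv_std * weight + bias).collect()
}
```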
* Start adding the module trait.
* Use the module trait.
* Implement module for qmatmul.
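
A module trait of the kind introduced above can be sketched as a single `forward` method that every layer implements; here `Vec<f32>` stands in for the tensor type, and the trait shape is an assumption rather than the crate's actual definition:

```rust
// Sketch of a minimal module trait: any layer exposes `forward`.
trait Module {
    fn forward(&self, xs: &[f32]) -> Vec<f32>;
}

// A toy layer implementing the trait: elementwise scale.
struct Scale(f32);

impl Module for Scale {
    fn forward(&self, xs: &[f32]) -> Vec<f32> {
        xs.iter().map(|&v| v * self.0).collect()
    }
}
```

Putting the forward pass behind one trait lets layers be composed and swapped uniformly.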
* Add some options to make layer-norm more configurable.
* Add the rms-norm variant.
* Replace the RmsNorm with the shared bits.
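
The rms-norm variant above differs from layer-norm in that it skips mean subtraction and rescales by the root-mean-square of the inputs. A sketch over one row (signature illustrative):

```rust
// Sketch of RMS normalization: divide by the root-mean-square,
// then apply a per-element learned weight.
fn rms_norm(xs: &[f32], weight: &[f32], eps: f32) -> Vec<f32> {
    let mean_sq = xs.iter().map(|&v| v * v).sum::<f32>() / xs.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    xs.iter().zip(weight).map(|(&v, &w)| v * inv_rms * w).collect()
}
```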
* Fixes for the stable diffusion example.
* Bugfix.
* Another fix.
* Fix for group-norm.
* More fixes to get SD to work.
* Skeleton for the avg-pool2d and upsample-nearest2d ops.
* Preliminary conv2d support.
* Implement group-norm.
* Add some testing for group-norm.
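
The group-norm implemented above splits the channels into groups and normalizes each group with its own mean and variance. A sketch on a flat CHW buffer, without the learned affine parameters (layout and names are assumptions):

```rust
// Sketch of group-norm: each group of channels (group_size elements
// of the flat buffer) is normalized independently.
fn group_norm(xs: &[f32], channels: usize, hw: usize, groups: usize, eps: f32) -> Vec<f32> {
    let group_size = channels / groups * hw; // elements per group
    let mut out = vec![0f32; xs.len()];
    for (g, chunk) in xs.chunks(group_size).enumerate() {
        let mean = chunk.iter().sum::<f32>() / group_size as f32;
        let var = chunk.iter().map(|&v| (v - mean).powi(2)).sum::<f32>() / group_size as f32;
        let inv_std = 1.0 / (var + eps).sqrt();
        for (i, &v) in chunk.iter().enumerate() {
            out[g * group_size + i] = (v - mean) * inv_std;
        }
    }
    out
}
```

With `groups = 1` this degenerates to layer-norm over the whole feature map; with `groups = channels` it behaves like instance norm.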
* Start adding a stable-diffusion example.
* Proper computation of the causal mask.
* Add the chunk operation.
* Work in progress: port the attention module.
* Add some dummy modules for conv2d and group-norm, get the attention module to compile.
* Re-enable the 2d convolution.
* Add the embeddings module.
* Add the resnet module.
* Add the unet blocks.
* Add the unet.
* And add the variational auto-encoder.
* Use the pad function from utils.
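
The causal-mask computation mentioned above ensures position i only attends to positions j <= i: entries above the diagonal are set to negative infinity so they vanish after softmax. A flat row-major sketch:

```rust
// Sketch of a causal attention mask of shape (seq_len, seq_len):
// 0 where attention is allowed, -inf where the future is blocked.
fn causal_mask(seq_len: usize) -> Vec<f32> {
    let mut mask = vec![0f32; seq_len * seq_len];
    for i in 0..seq_len {
        for j in (i + 1)..seq_len {
            mask[i * seq_len + j] = f32::NEG_INFINITY; // blocked: future position
        }
    }
    mask
}
```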
* Move the vision datasets to a separate crate.
* Move the batcher bits.
* Update the readme.
* Move the tiny-stories bits.
---------
Co-authored-by: Jane Doe <jane.doe@example.org>