| Commit message | Author | Age | Files | Lines |
| |
* Add gradient test for conv_transpose2d with stride of 2.
* Swap dilation and stride in ConvTranspose2D backpropagation.
Without this, a shape mismatch occurs with a stride of 2 and dilation of 1.
* Add further tests of the ConvTranspose2D gradient.
Values calculated with torch, minor numerical errors adjusted and commented.
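
A rough illustration of why the argument order matters here. This is a sketch based on the standard output-size formulas for convolutions, not the code from this commit; the helper names and the sizes are hypothetical and chosen only for the example.

```python
def conv_transpose_out(size, kernel, stride, padding=0, output_padding=0, dilation=1):
    """Output length of a transposed convolution along one spatial axis."""
    return (size - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1


def conv_out(size, kernel, stride, padding=0, dilation=1):
    """Output length of a regular convolution along one spatial axis."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# Hypothetical sizes, matching the "stride of 2 and dilation of 1" case.
in_size, kernel, stride, dilation = 4, 3, 2, 1

# Forward pass: length of the transposed-convolution output.
out_size = conv_transpose_out(in_size, kernel, stride, dilation=dilation)

# Backward pass: the input gradient is a regular convolution over the output
# gradient, so its length has to come back to in_size.
correct = conv_out(out_size, kernel, stride=stride, dilation=dilation)
swapped = conv_out(out_size, kernel, stride=dilation, dilation=stride)

print(out_size, correct, swapped)  # prints "9 4 5"
```

With stride and dilation passed in the right order the gradient length is 4, the same as the input; with the two swapped it is 5, which is the kind of shape mismatch the message describes.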
| |
* add documentation for backprop
* add backwards for ConvTranspose2D
* add Python code for testing
| |
* add support for conv transpose 2d and add benchmark for float types
* update bench calculation
* enable testing all conv operations on metal
| |
* first attempt
* progress
* integrate into metal backend
* finish and get test passing
* add other dtype support
* update transpose1d dtypes supported
| |
* Add a specialized kernel for copy2d.
* Move the cat operations.
* Avoid transpositions in cat.
* Bugfix.
* Bugfix for the cuda kernel.
* Add a benchmark.
* Add more testing.
* Test fix.
* Faster kernel.
* Add the missing kernel.
* Tweak the test.
* Add a metal kernel.
* Fix for the metal kernel.
* Get the tests to pass on metal.
* Also use this opportunity to fix the metal kernel for ELU.
* Add some bf16 kernels.
* Clippy fixes.
| |
* Fast CPU kernel for transposed 1d convolutions.
* Bugfix.
| |
* Add a currently broken test.
* Bugfix + fix test.
| |
* Groups support in conv-transpose-1d.
* Remove dangling file.
| |
* ConvTranspose1d cuda support.
* Add the conv-transpose1d kernel.
* Remove some unused variables.
| |
* Add the dilation parameter.
* Restore the basic optimizer example.
* Dilation support in cudnn.
* Use the dilation parameter in the cpu backend.
* More dilation support.
* No support for dilation in transposed convolutions.
* Add dilation to a test.
* Remove a print.
* Helper function.
| |
* Cuda kernel for conv-transpose.
* Fix the cuda kernel.
* Fix the tests.
| |
* Start adding backprop for conv2d.
* Backprop for conv2d.
* Bugfix + start adding a conv2d test.
* Conv2d backprop testing.
* More conv fixes.
| |
* Add conv-transpose.
* Return zeros for now.
* Naive CPU implementation.
* Add a conv-transpose test + fix the cpu implementation.
* Add a second test.
| |
* Add to the cuda example a reproduction of the issue.
* Tweak.
* Add a test using non-square matrices.
* Fix the conv2d kernel.
* Display the error.
* And tweak the comment.
| |
cuda. (#578)
* Add a test for conv2d with padding.
* Cosmetic changes.
* Bugfix the rand function on the cuda backend.
| |
* Add some group parameter to convolutions.
* Avoid some unnecessary groups checks.
* Move the tensor convolution bits.
* Proper handling of groups.
* Bump the crate version.
* And add a changelog.
| |
* Add a naive conv2d cuda kernel.
* Proper conv2d support on the rust side.
* Conv1d testing on gpu.
* Also use the test on gpus.
* Fix the clean-ptx target.
|
* Add some conv2d tests.
* Add a simpler conv2d test.
* More conv2d testing + bugfix.
* Add a todo.
|