path: root/candle-core
| Commit message | Author | Date | Files | Lines |
|---|---|---|---|---|
| Get the cuda tests to pass. | laurent | 2023-06-28 | 1 | -20/+29 |
| Adapt the cuda bits. | laurent | 2023-06-28 | 5 | -87/+109 |
| Fix some cpu issue. | laurent | 2023-06-28 | 1 | -27/+30 |
| Remove some todos. | laurent | 2023-06-28 | 1 | -2/+0 |
| Get the cpu tests to run. | laurent | 2023-06-28 | 5 | -19/+6 |
| Get the cpu backend to compile. | laurent | 2023-06-28 | 5 | -59/+44 |
| Propagate the changes on the cpu backend. | laurent | 2023-06-28 | 2 | -81/+89 |
| Propagate the layout refactoring. | laurent | 2023-06-28 | 5 | -129/+130 |
| Simplify the narrow implementation. | laurent | 2023-06-28 | 3 | -34/+36 |
| Start refactoring the stride. | laurent | 2023-06-28 | 5 | -108/+124 |
| Add the grad for narrow. | laurent | 2023-06-28 | 1 | -2/+25 |
| Add more gradients. | laurent | 2023-06-28 | 2 | -6/+17 |
| Add the relu op. | laurent | 2023-06-28 | 3 | -5/+34 |
| Factor out the gemm bits. | laurent | 2023-06-28 | 1 | -180/+74 |
| Add more cuda testing again. | laurent | 2023-06-28 | 2 | -42/+59 |
| Also run the backprop tests on cuda. | laurent | 2023-06-28 | 3 | -8/+10 |
| Add some display tests + bugfixes. | laurent | 2023-06-27 | 3 | -14/+102 |
| PyTorch like display implementation. | laurent | 2023-06-27 | 3 | -196/+203 |
| Add squeeze/unsqueeze/stack. | laurent | 2023-06-27 | 1 | -0/+30 |
| Rework the debug trait. | laurent | 2023-06-27 | 3 | -6/+449 |
| Add the get method. | laurent | 2023-06-27 | 1 | -0/+9 |
| Add some helper functions. | laurent | 2023-06-27 | 2 | -6/+26 |
| Add some test utils module. | laurent | 2023-06-27 | 2 | -41/+54 |
| Factor the slicing code in cuda. | laurent | 2023-06-27 | 1 | -19/+23 |
| Run the tensor tests for the cuda backend too. | laurent | 2023-06-27 | 2 | -41/+74 |
| Use num-cpus to enable parallelism. | laurent | 2023-06-27 | 4 | -4/+17 |
| Cache the causal mask in llama. | laurent | 2023-06-27 | 1 | -17/+51 |
| Fix two cuda bugs (matmul and where_cond). | laurent | 2023-06-27 | 2 | -14/+2 |
| Refactor the hierarchy. | Nicolas Patry | 2023-06-27 | 25 | -0/+6053 |