index: forks/candle.git (branch: main)
path: root/candle-core/examples/cuda_basics.rs
| Commit message | Author | Age | Files | Lines |
|---|---|---|---|---|
| Fix the fast bf16 gemm cublas kernels. (#2274) | Laurent Mazare | 2024-06-18 | 1 | -1/+4 |
| Make it possible to use TF32 accumulation in F32 matmuls. (#2178) | Laurent Mazare | 2024-05-11 | 1 | -24/+18 |
| Cuda kernel for dequantizing q8k. (#1760) | Laurent Mazare | 2024-02-26 | 1 | -4/+4 |
| Cuda acceleration for quantized model. (#1754) | Laurent Mazare | 2024-02-25 | 1 | -16/+23 |
| Dilated convolutions (#657) | Laurent Mazare | 2023-08-29 | 1 | -3/+3 |
| Add to the cuda example a reproduction of the issue. (#579) | Laurent Mazare | 2023-08-24 | 1 | -2/+11 |
| Add a test for conv2d with padding + bugfix the random number generation on c... | Laurent Mazare | 2023-08-24 | 1 | -0/+3 |
| Add some group parameter to convolutions. (#566) | Laurent Mazare | 2023-08-23 | 1 | -1/+1 |
| Cudnn support (#445) | Laurent Mazare | 2023-08-14 | 1 | -5/+4 |
| More accelerate optimizations (#427) | Laurent Mazare | 2023-08-13 | 1 | -0/+3 |
| Rename the candle crate to candle-core (#301) | Laurent Mazare | 2023-08-02 | 1 | -1/+1 |
| Simplify the parameters used by sum and sum_keepdim. (#165) | Laurent Mazare | 2023-07-14 | 1 | -2/+2 |
| Use the same default as pytorch for sum. (#164) | Laurent Mazare | 2023-07-13 | 1 | -2/+2 |
| Sketch a fast cuda kernel for reduce-sum. (#109) | Laurent Mazare | 2023-07-08 | 1 | -0/+15 |
| Add some very simple sum benchmark. (#108) | Laurent Mazare | 2023-07-08 | 1 | -34/+0 |
| Add mkl support for matrix multiply. (#86) | Laurent Mazare | 2023-07-06 | 1 | -0/+3 |
| Refactor the hierarchy. | Nicolas Patry | 2023-06-27 | 1 | -0/+31 |