index : forks/candle.git
path: root/candle-kernels
| Commit message | Author | Age | Files | Lines |
|---|---|---|---|---|
| Add a cuda kernel for upsampling. (#441) | Laurent Mazare | 2023-08-14 | 1 | -0/+62 |
| Add a cuda kernel for avg-pool2d. (#440) | Laurent Mazare | 2023-08-14 | 1 | -3/+157 |
| Add a naive conv2d cuda kernel. (#438) | Laurent Mazare | 2023-08-14 | 1 | -8/+93 |
| Compat windows. | Nicolas Patry | 2023-08-10 | 1 | -0/+9 |
| This is duplicated code on Cuda 12.2. | Nicolas Patry | 2023-08-10 | 1 | -18/+0 |
| Add the license files. (#335) | Laurent Mazare | 2023-08-07 | 1 | -1/+1 |
| Add the recip op + use it in stable-diffusion. (#331) | Laurent Mazare | 2023-08-06 | 1 | -0/+4 |
| Update the repo location. (#305) | Laurent Mazare | 2023-08-02 | 1 | -1/+1 |
| Remove the embedding ops in favor of index-select. (#299) | Laurent Mazare | 2023-08-02 | 1 | -40/+0 |
| Cuda support for the mnist training. (#277) | Laurent Mazare | 2023-07-29 | 2 | -7/+118 |
| Support for where-cond on cuda for u8 and u32. (#274) | Laurent Mazare | 2023-07-29 | 1 | -8/+15 |
| Add some flash attn test (#253) | Laurent Mazare | 2023-07-26 | 1 | -2/+2 |
| Add a test for scatter add. (#238) | Laurent Mazare | 2023-07-25 | 1 | -5/+3 |
| Cuda kernels for IndexAdd/ScatterAdd. (#236) | Laurent Mazare | 2023-07-24 | 2 | -1/+102 |
| Indexing cuda (#235) | Laurent Mazare | 2023-07-24 | 1 | -8/+119 |
| Add some cmp tests. (#233) | Laurent Mazare | 2023-07-24 | 2 | -10/+56 |
| Cleanup some todos. (#226) | Laurent Mazare | 2023-07-23 | 1 | -109/+83 |
| Revert "Add the layer norm files. (#222)" (#223) | Laurent Mazare | 2023-07-22 | 9 | -1532/+0 |
| Add the layer norm files. (#222) | Laurent Mazare | 2023-07-22 | 9 | -0/+1532 |
| Cuda kernels for fast min/max reductions (#203) | Laurent Mazare | 2023-07-19 | 2 | -9/+117 |
| Add the elu cuda kernel. (#114) | Laurent Mazare | 2023-07-10 | 1 | -0/+38 |
| Make it easier to use whisper samples from the repo. (#112) | Laurent Mazare | 2023-07-08 | 1 | -12/+12 |
| Cuda kernel for the conv1d op (#111) | Laurent Mazare | 2023-07-08 | 2 | -0/+75 |
| Sketch a fast cuda kernel for reduce-sum. (#109) | Laurent Mazare | 2023-07-08 | 1 | -0/+67 |
| Tweak the include order to include math.h first. (#100) | Laurent Mazare | 2023-07-07 | 1 | -1/+1 |
| Include the math.h file to get access to constants. (#99) | Laurent Mazare | 2023-07-07 | 1 | -0/+2 |
| Fixing the cached build. | Nicolas Patry | 2023-07-05 | 1 | -113/+97 |
| Minor tweaks. | laurent | 2023-07-03 | 1 | -0/+3 |
| Bugfix: remove the u8/bf16 conversion kernel as it is ambiguous. | laurent | 2023-06-30 | 1 | -1/+1 |
| Add the kernels. | laurent | 2023-06-30 | 5 | -0/+19 |
| Avoid some cast kernels. | laurent | 2023-06-29 | 1 | -2/+2 |
| Add the bf16 cuda kernels. | laurent | 2023-06-29 | 9 | -1/+67 |
| Rerun on new files. | Nicolas Patry | 2023-06-29 | 1 | -0/+1 |
| Fixing kernel cache (a bit brutal for now, but if build triggers, | Nicolas Patry | 2023-06-29 | 1 | -0/+8 |
| Add the relu op. | laurent | 2023-06-28 | 1 | -4/+13 |
| Fix two cuda bugs (matmul and where_cond). | laurent | 2023-06-27 | 1 | -1/+1 |
| Refactor the hierarchy. | Nicolas Patry | 2023-06-27 | 15 | -0/+963 |