index : forks/candle.git
main
path: root/candle-core
| Commit message | Author | Age | Files | Lines |
|---|---|---|---|---|
| Automatically upcast for to_u64 (#2244) | Eric Buehler | 2024-06-04 | 1 | -1/+7 |
| add where_cond f32 for metal (#2236) | Lionel Touati | 2024-06-02 | 1 | -0/+1 |
| Add a metal kernel for col2im1d. (#2214) | Laurent Mazare | 2024-05-25 | 1 | -34/+92 |
| Add the layernorm specialized op. (#2212) | Laurent Mazare | 2024-05-24 | 2 | -1/+39 |
| More efficient cuda implementation for ConvTranspose1d. (#2211) | Laurent Mazare | 2024-05-24 | 2 | -4/+75 |
| Add a slice_set op. (#2193) | Laurent Mazare | 2024-05-18 | 2 | -0/+87 |
| Add SliceSafetensors. (#2179) | Laurent Mazare | 2024-05-11 | 2 | -0/+71 |
| Make it possible to use TF32 accumulation in F32 matmuls. (#2178) | Laurent Mazare | 2024-05-11 | 3 | -30/+89 |
| Use write rather than try-write on the metal rw-locks. (#2162) | Laurent Mazare | 2024-05-05 | 2 | -7/+13 |
| Separate quantized phi-3 implementation. (#2157) | Laurent Mazare | 2024-05-04 | 2 | -4/+1 |
| Bump the version number to 0.5.1. (#2155) | Laurent Mazare | 2024-05-03 | 3 | -39/+2 |
| F16/BF16 bugfix (bis). (#2143) | Laurent Mazare | 2024-04-29 | 1 | -14/+36 |
| Bugfix the recent f16/bf16 changes. (#2142) | Laurent Mazare | 2024-04-29 | 1 | -8/+8 |
| Bug Fix: When converting a tensor to a variable, clone if the tensor is alrea... | Jeffrey Dallatezza | 2024-04-29 | 1 | -2/+7 |
| Fix sigmoid gradient calculation and move sigmoid into a specialized op (#2114) | MilkFather | 2024-04-29 | 1 | -2/+2 |
| Add a toggle for F16/BF16 accumulation in gemm. (#2141) | Laurent Mazare | 2024-04-29 | 3 | -15/+150 |
| Add a forward_via_f16 method to the qmatmul op. (#2138) | Laurent Mazare | 2024-04-28 | 1 | -0/+19 |
| Add the cuda dequantize f16 kernels. (#2137) | Laurent Mazare | 2024-04-28 | 4 | -18/+242 |
| Add a sort function. (#2134) | Laurent Mazare | 2024-04-28 | 2 | -0/+35 |
| Add argsort. (#2132) | Laurent Mazare | 2024-04-27 | 4 | -1/+241 |
| Add StorageRef. (#2113) | Laurent Mazare | 2024-04-23 | 10 | -5/+108 |
| Update zip requirement from 0.6.6 to 1.1.1 (#2103) | dependabot[bot] | 2024-04-22 | 1 | -1/+1 |
| Metal Unary: Add benchmarks and process kernels in a tile based fashion (#2056) | Thomas Santerre | 2024-04-21 | 4 | -147/+283 |
| Small cleanups to the llama multi-process example. (#2098) | Laurent Mazare | 2024-04-20 | 1 | -1/+5 |
| Handle multiple dimensions in metal QMM + two fixes. (#2097) | Laurent Mazare | 2024-04-20 | 1 | -15/+20 |
| Fix the silu gradient issue on 0. (#2083) | Laurent Mazare | 2024-04-18 | 1 | -1/+1 |
| Add more QMMV cuda kernels. (#2077) | Laurent Mazare | 2024-04-18 | 2 | -15/+25 |
| Add the mmv kernels for small batch sizes. (#2075) | Laurent Mazare | 2024-04-16 | 2 | -19/+81 |
| Fix for the batch dim in the quantized matmul example. (#2073) | Laurent Mazare | 2024-04-15 | 3 | -38/+38 |
| Add a function to clear the KV cache in falcon. (#2066) | Laurent Mazare | 2024-04-15 | 1 | -0/+1 |
| Handle zero dims in some simple operations. (#2064) | Laurent Mazare | 2024-04-15 | 2 | -0/+43 |
| Faster kernels for quantized matmul on cuda (#2060) | Laurent Mazare | 2024-04-15 | 1 | -6/+137 |
| Expose the synchronize function on the generic device. (#2062) | Laurent Mazare | 2024-04-14 | 1 | -0/+8 |
| Add missing bfloat unary strided kernels and fix typo (#2058) | ivarflakstad | 2024-04-14 | 1 | -0/+20 |
| Add a synchronize method to devices. (#2055) | Laurent Mazare | 2024-04-14 | 6 | -0/+24 |
| Add benchmarks for qmatmul operations (#2048) | Thomas Santerre | 2024-04-13 | 3 | -0/+74 |
| Support gather on bf16 for metal. (#2035) | Laurent Mazare | 2024-04-10 | 1 | -0/+1 |
| Use BufferOffset in metal backend ops. (#2029) | Laurent Mazare | 2024-04-08 | 1 | -50/+39 |
| Rework the buffer offset logic for metal kernels (#2028) | Laurent Mazare | 2024-04-07 | 1 | -39/+43 |
| Handle the batch dimension in quantized MMV on metal. (#2022) | Laurent Mazare | 2024-04-06 | 1 | -1/+4 |
| first commit (#2018) | Jorge António | 2024-04-05 | 1 | -1/+1 |
| Add support for "sign" on tensors (#2012) | Thomas Santerre | 2024-04-04 | 5 | -10/+57 |
| Fix the matmul layout for accelerate & mkl. (#2011) | Laurent Mazare | 2024-04-04 | 3 | -26/+8 |
| update dtypes checks for several metal operations (#2010) | Thomas Santerre | 2024-04-04 | 1 | -27/+45 |
| Optimize the gelu f16 opt. (#2008) | Laurent Mazare | 2024-04-04 | 2 | -8/+19 |
| Split the cuda error file. (#2003) | Laurent Mazare | 2024-04-04 | 2 | -65/+67 |
| Relax the contiguous check for cuda kernels. (#2000) | Laurent Mazare | 2024-04-03 | 1 | -1/+6 |
| Improve the handling of matmul with squeezed layouts. (#1998) | Laurent Mazare | 2024-04-02 | 4 | -138/+150 |
| modify access for conv and op to be pub to allow external packages to have cu... | Thomas Santerre | 2024-04-01 | 1 | -2/+2 |
| Quantized cuda tweaks. (#1981) | Laurent Mazare | 2024-04-01 | 1 | -89/+62 |