Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Bump the crate version to 0.8.2. (#2703) | Laurent Mazare | 2025-01-07 | 1 | -2/+2 |
| | |||||
* | Flash-Attn upgrade / SoftCap Candle-FlashAttn [3/n] (#2690) | Michael Feil | 2024-12-31 | 3 | -4/+7 |
| | | | | | | | | | | | | | | | * update flash-attn v1 * restore: hdim224 * add 224 flash_fwd_template * remove whitespace * softcap is working, including test and api. * make softcap test case better * unpadded lse added | ||||
* | Flash-Attn upgrade / SoftCap Candle-FlashAttn [2/n] (#2689) | Michael Feil | 2024-12-31 | 4 | -3/+182 |
| | | | | | | | | | | | | | | | | | * update flash-attn v1 * restore: hdim224 * add 224 flash_fwd_template * remove whitespace * softcap is working, including test and api. * make softcap test case better --------- Co-authored-by: laurent <laurent.mazare@gmail.com> | ||||
* | Flash-Attn upgrade / SoftCap Candle-FlashAttn [1/n] (#2688) | Michael Feil | 2024-12-31 | 41 | -82/+139 |
| | | | | | | | | | * update flash-attn v1 * restore: hdim224 * add 224 flash_fwd_template * remove whitespace | ||||
* | Bump the crate version to 0.8.1. (#2662) | Laurent Mazare | 2024-12-07 | 1 | -2/+2 |
| | |||||
* | Bump the crate version to 0.8.0. (#2612) | Laurent Mazare | 2024-11-12 | 1 | -2/+2 |
| | |||||
* | Bump the crate version to 0.7.2. (#2517) | Laurent Mazare | 2024-09-29 | 1 | -2/+2 |
| | |||||
* | Move the candle version to 0.7.1. (#2495) | Laurent Mazare | 2024-09-22 | 1 | -2/+2 |
| | |||||
* | Bump the crate version. (#2491) | Laurent Mazare | 2024-09-21 | 1 | -2/+2 |
| | |||||
* | Bump the version to 0.6.1. (#2438) | Laurent Mazare | 2024-08-22 | 1 | -2/+2 |
| | |||||
* | Update the flash attn kernels. (#2333) | Laurent Mazare | 2024-07-15 | 51 | -899/+2274 |
| | |||||
* | Bump the crate version. (#2248) | Laurent Mazare | 2024-06-05 | 1 | -2/+2 |
| | |||||
* | Use flash-attn in gemma. (#2195) | Laurent Mazare | 2024-05-18 | 2 | -1/+7 |
| | | | | | * Use flash-attn in gemma. * Fix flash-attn for head dim 256. | ||||
* | Bump the version number to 0.5.1. (#2155) | Laurent Mazare | 2024-05-03 | 1 | -2/+2 |
| | | | | | | | * Bump the version number to 0.5.1. * Fix clippy lints for 1.78. * More clippy fixes. | ||||
* | Bumping the version number to 0.5.0. (#2009) | Laurent Mazare | 2024-04-04 | 1 | -2/+2 |
| | |||||
* | Bump the crate versions to 0.4.2. (#1821) | Laurent Mazare | 2024-03-08 | 1 | -2/+2 |
| | |||||
* | Bump the version number to 0.4.1. (#1768) | Laurent Mazare | 2024-02-27 | 1 | -2/+2 |
| | | | | | * Fix the block size for some cuda kernels. * Bump the version number to 0.4.1. | ||||
* | Bump the crate version to 0.4.0. (#1658) | Laurent Mazare | 2024-02-04 | 1 | -2/+2 |
| | |||||
* | Explicit version for packages that are not in the workspace. (#1642) | Laurent Mazare | 2024-01-31 | 1 | -1/+1 |
| | |||||
* | Moving to a proper build crate `bindgen_cuda`. (#1531) | Nicolas Patry | 2024-01-07 | 2 | -241/+36 |
| | | | | | * Moving to a proper build crate `bindgen_cuda`. * Fmt. | ||||
* | Unpin more of the workplace relative dependencies. (#1535) | Laurent Mazare | 2024-01-07 | 1 | -2/+2 |
| | |||||
* | chore: update flash attention kernels (#1518) | OlivierDehaene | 2024-01-05 | 28 | -465/+1086 |
| | | | | | | | | | | | * chore: update flash attention kernels * fmt * remove unused kernels * force f32 * correct stride | ||||
* | Bump the crate version to 0.3.3. (#1490) | Laurent Mazare | 2023-12-28 | 1 | -3/+3 |
| | |||||
* | Bump the crate version to 0.3.2. (#1452) | Laurent Mazare | 2023-12-17 | 1 | -3/+3 |
| | |||||
* | Update for 0.3.1. (#1324) | Laurent Mazare | 2023-11-11 | 1 | -3/+3 |
| | |||||
* | Fix for flash-attn. (#1310) | Laurent Mazare | 2023-11-10 | 1 | -2/+2 |
| | | | Co-authored-by: laurent <laurent@par2dc5-ai-prd-cl01dgx02.cm.cluster> | ||||
* | feat: parse Cuda compute cap from env (#1066) | OlivierDehaene | 2023-10-16 | 1 | -36/+52 |
| | | | | | | | | | * feat: add support for multiple compute caps * Revert to one compute cap * fmt * fix | ||||
* | Bump the version to 0.3.0. (#1014) | Laurent Mazare | 2023-10-01 | 1 | -3/+3 |
| | | | | | * Bump the version to 0.3.0. * Changelog update. | ||||
* | Bump the crate versions to v0.2.3. (#886) | Laurent Mazare | 2023-09-18 | 1 | -3/+3 |
| | | | | | * Bump the crate version. * Also update the python bindings. | ||||
* | Bump the crate version + update the changelog. (#822) | Laurent Mazare | 2023-09-12 | 1 | -3/+3 |
| | |||||
* | Shape with holes (#770) | Laurent Mazare | 2023-09-08 | 1 | -3/+6 |
| | | | | | * Shape with holes. * rustfmt. | ||||
* | Add small customization to the build (#768) | Zsombor | 2023-09-08 | 1 | -4/+20 |
| | | | | | | | | | * Add ability to override the compiler used by NVCC from an environment variable * Allow relative paths in CANDLE_FLASH_ATTN_BUILD_DIR * Add the compilation failure to the readme, with a possible solution * Adjust the error message, and remove the special handling of the relative paths | ||||
* | Properly set the is_bf16 flag. (#738) | Laurent Mazare | 2023-09-04 | 1 | -6/+10 |
| | |||||
* | BF16 support for flash-attn. (#737) | Laurent Mazare | 2023-09-04 | 1 | -41/+81 |
| | |||||
* | Add back the bf16 flash-attn kernels. (#730) | Laurent Mazare | 2023-09-04 | 4 | -22/+25 |
| | |||||
* | Add some documentation. (#673) | Laurent Mazare | 2023-08-30 | 1 | -3/+3 |
| | | | | | * Add some documentation. * Bump the crate version. | ||||
* | Bump the crate version + update CHANGELOG. (#628) | Laurent Mazare | 2023-08-27 | 1 | -3/+3 |
| | |||||
* | Add some group parameter to convolutions. (#566) | Laurent Mazare | 2023-08-23 | 1 | -3/+3 |
| | | | | | | | | | | | | | * Add some group parameter to convolutions. * Avoid some unnecessary groups checks. * Move the tensor convolution bits. * Proper handling of groups. * Bump the crate version. * And add a changelog. | ||||
* | Bump the crates version to 0.1.2. (#522) | Laurent Mazare | 2023-08-20 | 1 | -3/+3 |
| | |||||
* | Relax the requirements on CustomOp. (#486) | Laurent Mazare | 2023-08-17 | 1 | -2/+2 |
| | | | | | * Relax the requirements on CustomOp. * Simplify the custom-ops when no backward is required. | ||||
* | add c++17 flags (#452) | Chengxu Yang | 2023-08-15 | 1 | -0/+1 |
| | |||||
* | Rename vec-dot to vec-ops. (#449) | Laurent Mazare | 2023-08-15 | 1 | -3/+3 |
| | | | | | | | * Rename vec-dot to vec-ops. * Also bump the crate version. * Add a currently empty readme. | ||||
* | Add the license files. (#335) | Laurent Mazare | 2023-08-07 | 1 | -1/+1 |
| | |||||
* | Update the repo location. (#305) | Laurent Mazare | 2023-08-02 | 1 | -1/+1 |
| | |||||
* | Add some missing readme files. (#304) | Laurent Mazare | 2023-08-02 | 1 | -0/+1 |
| | |||||
* | Add version numbers for all the candle crates (#303) | Laurent Mazare | 2023-08-02 | 1 | -2/+2 |
| | | | | | * Switch to candle-gemm for the time being. * Add the missing versions. | ||||
* | Rename the candle crate to candle-core (#301) | Laurent Mazare | 2023-08-02 | 1 | -1/+1 |
| | | | | | * Rename to candle-core. * More candle-core renaming. | ||||
* | Fix the flash-attention function names. (#282) | Laurent Mazare | 2023-07-31 | 1 | -2/+2 |
| | |||||
* | Flash attention without padding (varlen). (#281) | Laurent Mazare | 2023-07-31 | 4 | -4/+283 |
| | | | | | | | | | | | | | * Expose the seqlen variable for flash-attn without padding. * Fix the batched call. * Adapt for the varlen variant. * No need to set the batch strides when in varlen mode. * Add a test (disabled at the moment). * Get the test to work properly. | ||||
* | Softmax numerical stability. (#267) | Laurent Mazare | 2023-07-28 | 2 | -1/+2 |
| | | | | | * Softmax numerical stability. * Fix the flash-attn test. |