| Commit message | Author | Age | Files | Lines |
* update flash-attn v1
* restore: hdim224
* add 224 flash_fwd_template
* remove whitespace
|
* chore: update flash attention kernels
* fmt
* remove unused kernels
* force f32
* correct stride
|
* Again set a few extra params.
* Use the appropriate kernel sizes.
* Add all the kernel sizes.
* Parallel compiling.
* Reduce the amount of parallelism.
* Add the missing kernel.
* Fix a typo.
* Remove bf16 support for now.