| author | Jorge António <matroid@outlook.com> | 2024-10-07 16:30:56 +0100 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2024-10-07 17:30:56 +0200 |
| commit | edf7668291a30d6c73dd0fb884a74d1d78e5786d (patch) | |
| tree | f3ac50d3adfb1006ca89a72a05fb6baa006ef1c7 | |
| parent | e4a96f9e7c2b88dec33b6076cc9756ac76d44df1 (diff) | |
improve (#2548)
| -rw-r--r-- | README.md | 1 |
|---|---|---|

1 file changed, 1 insertion, 0 deletions
```diff
@@ -187,6 +187,7 @@ And then head over to
 - [`candle-sampling`](https://github.com/EricLBuehler/candle-sampling): Sampling techniques for Candle.
 - [`gpt-from-scratch-rs`](https://github.com/jeroenvlek/gpt-from-scratch-rs): A port of Andrej Karpathy's _Let's build GPT_ tutorial on YouTube showcasing the Candle API on a toy problem.
 - [`candle-einops`](https://github.com/tomsanbear/candle-einops): A pure rust implementation of the python [einops](https://github.com/arogozhnikov/einops) library.
+- [`atoma-infer`](https://github.com/atoma-network/atoma-infer): A Rust library for fast inference at scale, leveraging FlashAttention2 for efficient attention computation and PagedAttention for efficient KV-cache memory management, with multi-GPU support. It is OpenAI API compatible.
 
 If you have an addition to this list, please submit a pull request.
```