author: zachcp <zachcp@users.noreply.github.com> 2024-11-18 08:19:23 -0500
committer: GitHub <noreply@github.com> 2024-11-18 14:19:23 +0100
commit: 386fd8abb4be23c125e8100fed932f17d356a160
tree: d4964322db768d31e2e4c1949848315fcfd7cfa2 /candle-transformers/src/models/mimi/mod.rs
parent: 12d7e7b1450f0c3f87c3cce3a2a1dd1674cb8fd7
Module Docs (#2624)
* update whisper
* update llama2c
* update t5
* update phi and t5
* add a blip model
* qlamma doc
* add two new docs
* add docs and emoji
* additional models
* openclip
* pixtral
* edits on the model docs
* update yu
* update a fe wmore models
* add persimmon
* add model-level doc
* names
* update module doc
* links in heira
* remove empty URL
* update more hyperlinks
* updated hyperlinks
* more links
* Update mod.rs

---------

Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
Diffstat (limited to 'candle-transformers/src/models/mimi/mod.rs')
-rw-r--r-- candle-transformers/src/models/mimi/mod.rs 24
1 files changed, 21 insertions, 3 deletions
diff --git a/candle-transformers/src/models/mimi/mod.rs b/candle-transformers/src/models/mimi/mod.rs
index f19f9ae5..8945abfb 100644
--- a/candle-transformers/src/models/mimi/mod.rs
+++ b/candle-transformers/src/models/mimi/mod.rs
@@ -1,9 +1,27 @@
//! mimi model
//!
-//! Mimi is a state-of-the-art audio neural codec.
+//! [Mimi](https://huggingface.co/kyutai/mimi) is a state-of-the-art audio
+//! compression model using an encoder/decoder architecture with residual vector
+//! quantization. The candle implementation supports streaming, meaning that it's
+//! possible to encode or decode a stream of audio tokens on the fly to provide
+//! low-latency interaction with an audio model.
//!
-//! - [HuggingFace Model Card](https://huggingface.co/kyutai/mimi)
-//! - [GitHub](https://github.com/kyutai-labs/moshi)
+//! - 🤗 [HuggingFace Model Card](https://huggingface.co/kyutai/mimi)
+//! - 💻 [GitHub](https://github.com/kyutai-labs/moshi)
+//!
+//!
+//! # Example
+//! ```bash
+//! # Generating some audio tokens from an audio file.
+//! wget https://github.com/metavoiceio/metavoice-src/raw/main/assets/bria.mp3
+//! cargo run --example mimi \
+//!   --features mimi --release -- \
+//!   audio-to-code bria.mp3 bria.safetensors
+//!
+//! # And decoding the audio tokens back into a sound file.
+//! cargo run --example mimi \
+//!   --features mimi --release -- \
+//!   code-to-audio bria.safetensors bria.wav
//!
// Copyright (c) Kyutai, all rights reserved.
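
Note on the doc comment above: it mentions residual vector quantization without showing what that means. The sketch below is a toy illustration of the general technique in plain Rust, not the candle or Mimi implementation. The names `Codebook`, `rvq_encode`, and `rvq_decode` are hypothetical, and the codebooks are assumed to be given rather than learned. Each stage picks the codebook entry nearest to the residual left by the previous stages, so a vector is represented by one small index per stage.

```rust
/// One quantizer stage: a list of codebook vectors of equal dimension.
/// (Hypothetical type for illustration; not a candle type.)
type Codebook = Vec<Vec<f32>>;

/// Squared Euclidean distance between two vectors of equal length.
fn sq_dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Index of the codebook entry closest to `v`.
fn nearest(codebook: &Codebook, v: &[f32]) -> usize {
    let mut best = 0;
    let mut best_dist = f32::INFINITY;
    for (i, entry) in codebook.iter().enumerate() {
        let d = sq_dist(entry, v);
        if d < best_dist {
            best = i;
            best_dist = d;
        }
    }
    best
}

/// Encode `v` as one code index per stage. Each stage quantizes the
/// residual that the previous stages left unexplained.
fn rvq_encode(codebooks: &[Codebook], v: &[f32]) -> Vec<usize> {
    let mut residual = v.to_vec();
    let mut codes = Vec::with_capacity(codebooks.len());
    for cb in codebooks {
        let idx = nearest(cb, &residual);
        // Subtract the chosen entry so the next stage sees what is left.
        for (r, c) in residual.iter_mut().zip(&cb[idx]) {
            *r -= c;
        }
        codes.push(idx);
    }
    codes
}

/// Decode by summing the selected entry of every stage.
fn rvq_decode(codebooks: &[Codebook], codes: &[usize], dim: usize) -> Vec<f32> {
    let mut out = vec![0.0; dim];
    for (cb, &idx) in codebooks.iter().zip(codes) {
        for (o, c) in out.iter_mut().zip(&cb[idx]) {
            *o += c;
        }
    }
    out
}
```

With a coarse first codebook and a finer second one, `rvq_encode` returns one index per stage and `rvq_decode` sums the selected entries to approximate the input. A real codec applies this to learned latent frames rather than hand-written vectors.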