summaryrefslogtreecommitdiff
path: root/candle-book
diff options
context:
space:
mode:
authorNicolas Patry <patry.nicolas@protonmail.com>2023-08-14 11:09:48 +0200
committerNicolas Patry <patry.nicolas@protonmail.com>2023-08-28 15:14:43 +0200
commitdd02f589c095790f980cc8fb84f411d67b7e3c21 (patch)
tree2b58c8f41cc18462b43eabf314a6a088f2a58df7 /candle-book
parent76023236677fbab10fd1c99eab95d268416fb941 (diff)
downloadcandle-dd02f589c095790f980cc8fb84f411d67b7e3c21.tar.gz
candle-dd02f589c095790f980cc8fb84f411d67b7e3c21.tar.bz2
candle-dd02f589c095790f980cc8fb84f411d67b7e3c21.zip
Better training+hub
Diffstat (limited to 'candle-book')
-rw-r--r--candle-book/src/training/README.md32
1 files changed, 28 insertions, 4 deletions
diff --git a/candle-book/src/training/README.md b/candle-book/src/training/README.md
index f4f9eb85..ddbbc7af 100644
--- a/candle-book/src/training/README.md
+++ b/candle-book/src/training/README.md
@@ -6,12 +6,36 @@ start with the Hello world dataset of machine learning, MNIST.
Let's start with downloading `MNIST` from [huggingface](https://huggingface.co/datasets/mnist).
-
-```rust
-use candle_datasets::from_hub;
+This requires `candle-datasets` with the `hub` feature.
+```bash
+cargo add candle-datasets --features hub
+cargo add hf-hub
+```
-let dataset = from_hub("mnist")?;
+```rust,ignore
+{{#include ../../../candle-examples/src/lib.rs:book_training_1}}
```
This uses the standardized `parquet` files from the `refs/convert/parquet` branch on every dataset.
+`files` is now a `Vec` of [`parquet::file::serialized_reader::SerializedFileReader`].
+
+We can inspect the content of the files with:
+
+```rust,ignore
+{{#include ../../../candle-examples/src/lib.rs:book_training_2}}
+```
+
+You should see something like:
+
+```bash
+Column id 1, name label, value 6
+Column id 0, name image, value {bytes: [137, ....]
+Column id 1, name label, value 8
+Column id 0, name image, value {bytes: [137, ....]
+```
+
+So each row contains 2 columns (image, label) with image being saved as bytes.
+Let's put them into a useful struct.
+
+