Commit message    Author    Age    Files    Lines
* Update memmap2 requirement from 0.7.1 to 0.9.3 (#1556)    dependabot[bot]    2024-01-10    1    -1/+1
|   Updates the requirements on [memmap2](https://github.com/RazrFalcon/memmap2-rs) to permit the latest version.
|   - [Changelog](https://github.com/RazrFalcon/memmap2-rs/blob/master/CHANGELOG.md)
|   - [Commits](https://github.com/RazrFalcon/memmap2-rs/compare/v0.7.1...v0.7.1)
|   ---
|   updated-dependencies:
|   - dependency-name: memmap2
|     dependency-type: direct:production
|   ...
|   Signed-off-by: dependabot[bot] <support@github.com>
|   Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Update gloo requirement from 0.8 to 0.11 (#1558)    dependabot[bot]    2024-01-10    5    -5/+5
|   Updates the requirements on [gloo](https://github.com/rustwasm/gloo) to permit the latest version.
|   - [Release notes](https://github.com/rustwasm/gloo/releases)
|   - [Changelog](https://github.com/rustwasm/gloo/blob/master/CHANGELOG.md)
|   - [Commits](https://github.com/rustwasm/gloo/commits)
|   ---
|   updated-dependencies:
|   - dependency-name: gloo
|     dependency-type: direct:production
|   ...
|   Signed-off-by: dependabot[bot] <support@github.com>
|   Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Update cudarc requirement from 0.9.14 to 0.10.0 (#1559)    dependabot[bot]    2024-01-10    1    -1/+1
|   Updates the requirements on [cudarc](https://github.com/coreylowman/cudarc) to permit the latest version.
|   - [Release notes](https://github.com/coreylowman/cudarc/releases)
|   - [Commits](https://github.com/coreylowman/cudarc/compare/v0.9.14...v0.9.15)
|   ---
|   updated-dependencies:
|   - dependency-name: cudarc
|     dependency-type: direct:production
|   ...
|   Signed-off-by: dependabot[bot] <support@github.com>
|   Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Update tokenizers requirement from 0.13.4 to 0.15.0 (#1555)    dependabot[bot]    2024-01-10    1    -1/+1
|   Updates the requirements on [tokenizers](https://github.com/huggingface/tokenizers) to permit the latest version.
|   - [Release notes](https://github.com/huggingface/tokenizers/releases)
|   - [Changelog](https://github.com/huggingface/tokenizers/blob/main/RELEASE.md)
|   - [Commits](https://github.com/huggingface/tokenizers/commits)
|   ---
|   updated-dependencies:
|   - dependency-name: tokenizers
|     dependency-type: direct:production
|   ...
|   Signed-off-by: dependabot[bot] <support@github.com>
|   Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* fix: deprecated option field (open-pull-requests-limit-per-dependency) (#1554)    darker    2024-01-10    1    -1/+0
|
* feat: add dependabot to the project (#1553)    darker    2024-01-10    1    -0/+8
|   * feat: add dependabot to the project
|   * feat: add let's accept patches/fix from other libs
|   * Revert "feat: add let's accept patches/fix from other libs"
|     This reverts commit d31a956f8108afb1b6ee6f35611feea399d63bdf.
* Handle start-offset when loading a tensor from a pickle file. (#1546)    Laurent Mazare    2024-01-08    1    -3/+11
|
* Simpler repro for the neon optimization issue + bugfix (#1544)    Laurent Mazare    2024-01-07    2    -168/+97
|   * Simpler repro for the neon optimization issue.
|   * Bugfix for q4k.
|   * Improve the fix, share the dot-prod bit.
|   * Clippy fixes.
|   * Fix for q6k.
|   * Also fix for q2k.
|   * Use the new shared dotprod.
|   * Add more testing.
* Use bindgen-cuda for the custom-kernel example. (#1536)    Laurent Mazare    2024-01-07    4    -236/+20
|   * Use bindgen-cuda for the custom-kernel example.
|   * Only depend on the kernels when cuda is enabled.
|   * Skip rustfmt.
* Moving to a proper build crate `bindgen_cuda`. (#1531)    Nicolas Patry    2024-01-07    4    -483/+41
|   * Moving to a proper build crate `bindgen_cuda`.
|   * Fmt.
* Unpin more of the workspace relative dependencies. (#1535)    Laurent Mazare    2024-01-07    2    -4/+4
|
* Simplifying our internal cargo dependencies. (#1529)    Nicolas Patry    2024-01-07    18    -48/+55
|
* fix index_pos bug when kv cache is disabled. (#1517)    optman    2024-01-06    1    -4/+4
|   * fix index_pos bug when kv cache is disabled
|   * Tweak the fix.
|   ---------
|   Co-authored-by: laurent <laurent.mazare@gmail.com>
* chore: update flash attention kernels (#1518)    OlivierDehaene    2024-01-05    28    -465/+1086
|   * chore: update flash attention kernels
|   * fmt
|   * remove unused kernels
|   * force f32
|   * correct stride
* add link to gpt-from-scratch-rs (#1525)    Jeroen Vlek    2024-01-05    1    -0/+1
|
* Adding bfloat16 support for the cast kernels. (#1520)    Nicolas Patry    2024-01-04    2    -0/+6
|
* Simplify the one-hot implementation, support arbitrary rank. (#1514)    Laurent Mazare    2024-01-01    1    -181/+38
|   * Simplify the one-hot implementation, support arbitrary rank.
|   * More cleanup.
* Add one-hot/cold encoding (#1489)    Ryan Tate    2024-01-01    3    -0/+414
|   * add one-hot encoding
|   * one_hot: improve error handling, use generic to_vecN::<D>
|     Bails if the index value is equal to or greater than the depth value, which would result in an out-of-bounds error.
|     A redundant check is added to ensure the index value does not exceed the length of the one-hot matrix size, which would also result in an out-of-bounds error.
|     Bails if the index value is less than -1. If the index value is -1, then it ignores the setting of the on_value for the index value. Only values that are less than -1 are considered errors.
|   * one-hot: use two generics, one_hot::<I, O>, for input and output data types
|     Separating the input and output data types allows the input tensor indices to be a different data type than the output encoded tensor data type. For example, one_hot::<i64, u8>(...) will take an input tensor of i64 values and encode the output tensor using u8 values.
|     The generic I::DTYPE must match the data type of the input indices, otherwise the method will bail.
|     Additionally, this method adds an `allow_f64` option to enable the input indices data type to be f64 values. f64 values are disabled by default.
|     TODO: indices data type and the generic I data type are currently not compile-time checked.
|   * one_hot: remove input generic, use indices dtype matching
|     This commit removes the to_f64() type cast and explicitly matches the DType from the input tensor. Currently, only U8, U32 and I64 are supported for input tensors.
|     The match arms on the dtype are verbose. It would be nice to use a generic type with the WithDtype trait bound to pass to the to_vecN method and then return an inner value.
|     Open to suggestions for better approaches here to reduce the match arm verbosity.
|   * one_hot: use flat_map iterator over dims instead of nested for loop
|     This commit replaces the nested for loops with a flat map iter over the dimensions of the input tensor.
|     This commit also adds a test for a rank 3 input tensor.
|   * one_hot: use mandatory on/off-values, remove const msgs
|     This commit also updates doc tests, comments and test cases.
|   * Small cleanups.
|   ---------
|   Co-authored-by: laurent <laurent.mazare@gmail.com>
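The index rules described in the one-hot commit message above (error on an index at or beyond the depth, skip on -1, error below -1) can be sketched in plain Rust. This is only an illustration under stated assumptions: `one_hot_rows` and its flat `Vec<u8>` output are hypothetical and are not the signature of the function added in #1489.

```rust
/// Hypothetical standalone helper, NOT candle's actual `one_hot` API.
/// Encodes each index as a row of `depth` values in a flat, row-major buffer.
/// Rules taken from the commit message:
///   * index == -1  -> the row keeps `off_value` everywhere
///   * index <  -1  -> error
///   * index >= depth -> error (it would be out of bounds)
fn one_hot_rows(
    indices: &[i64],
    depth: usize,
    on_value: u8,
    off_value: u8,
) -> Result<Vec<u8>, String> {
    // Start with every position set to the off value.
    let mut out = vec![off_value; indices.len() * depth];
    for (row, &idx) in indices.iter().enumerate() {
        if idx == -1 {
            // -1 means "no class": leave the whole row at off_value.
            continue;
        }
        if idx < -1 {
            return Err(format!("invalid negative index {idx} at position {row}"));
        }
        let col = idx as usize;
        if col >= depth {
            return Err(format!(
                "index {col} at position {row} is out of range for depth {depth}"
            ));
        }
        out[row * depth + col] = on_value;
    }
    Ok(out)
}
```

With a depth of 3, the indices `[0, 2, -1]` produce three rows in which only the first two contain an on value; an index of 3 or -2 returns an error instead of writing out of bounds.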
* Format properly the Stable Diffusion example run with params (#1511)    stano    2024-01-01    1    -1/+1
|   Move the --sd-version flag out of the prompt.
* Do not implement Module for BatchNorm. (#1513)    Laurent Mazare    2024-01-01    9    -33/+31
|
* Add support for tiny-llama-1.1b. (#1512)    Laurent Mazare    2023-12-31    1    -2/+9
|
* Small tweaks to batch-norm. (#1505)    Laurent Mazare    2023-12-30    1    -19/+16
|
* [Breaking] Add training to batchnorm with exponential moving average (#1504)    nkoppel    2023-12-30    2    -50/+169
|   * Add training to batchnorm with exponential moving average
|   * Add more checks to batch norm
|   * Resolve some review comments
|   * Add with_momentum variants of `new` methods
|   * Add check for range of momentum variable; update batch norm test
|   * Run cargo fmt
|   * Add back num_features parameter
|   * Format; tiny simplification
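The batch-norm commit above mentions an exponential moving average and a range check on the momentum. A minimal sketch of that idea follows. It assumes the common convention `running = (1 - momentum) * running + momentum * batch` and a valid range of [0, 1]; neither detail is stated in the log, and `update_running_stat` is a hypothetical name, not part of candle's `BatchNorm`.

```rust
/// Hypothetical helper, NOT candle's `BatchNorm` API.
/// Updates a running statistic (mean or variance) in place with an
/// exponential moving average. The update convention and the accepted
/// momentum range are assumptions made for this sketch.
fn update_running_stat(
    running: &mut [f64],
    batch: &[f64],
    momentum: f64,
) -> Result<(), String> {
    // Range check on momentum; NaN also fails this test.
    if !(0.0..=1.0).contains(&momentum) {
        return Err(format!("momentum must be in [0, 1], got {momentum}"));
    }
    if running.len() != batch.len() {
        return Err(format!(
            "length mismatch: {} running values vs {} batch values",
            running.len(),
            batch.len()
        ));
    }
    // Blend each running value with the statistic of the current batch.
    for (r, b) in running.iter_mut().zip(batch) {
        *r = (1.0 - momentum) * *r + momentum * *b;
    }
    Ok(())
}
```

A momentum of 0 keeps the running statistic unchanged and a momentum of 1 replaces it with the current batch statistic; values in between weight recent batches more heavily than older ones.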
* Add Policy Gradient to Reinforcement Learning examples (#1500)    s-casci    2023-12-30    4    -124/+275
|   * added policy_gradient, modified main, ddpg and README
|   * fixed typo in README
|   * removed unnecessary imports
|   * small refactor
|   * Use clap for picking up the subcommand to run.
|   ---------
|   Co-authored-by: Laurent <laurent.mazare@gmail.com>
* Metal: support unary abs (#1503)    Gonzalo    2023-12-30    3    -1/+9
|   * Metal: support unary abs
|   * cargo fmt
* Metal: more u8/u32 (#1502)    Gonzalo    2023-12-29    5    -4/+68
|   * Adds more metal u8
|   * Metal: more u32
* Metal: i64 basic support (#1495)    Gonzalo    2023-12-29    7    -1/+83
|   * Adds basic metal i64 support
|   * metal copy i64
* Merge pull request #1498 from huggingface/debugging_windows_ci    Nicolas Patry    2023-12-29    2    -4/+9
|\
| |
| |   Fix CI
| * Ignore skipped.    Nicolas Patry    2023-12-29    1    -1/+1
| |
| * Ignore stop on remote forks.    Nicolas Patry    2023-12-29    1    -1/+1
| |
| * Fix.    Nicolas Patry    2023-12-29    1    -1/+4
| |
| * Trying to fix flakiness by making hub_2 and hub_3 serial tests (potential issue on mingw with mmap).    Nicolas Patry    2023-12-29    2    -6/+4
| |
| * Fix the CI.    Nicolas Patry    2023-12-29    2    -0/+4
| |
* | Merge pull request #1496 from bayedieng/unary    Nicolas Patry    2023-12-29    3    -3/+8
|\ \
| | |
| | |   Implement urecip op for metal backend
| * | fix bad pattern matching and function name    Baye Dieng    2023-12-29    3    -6/+6
| | |
| * | remove generated png    Baye Dieng    2023-12-28    1    -0/+0
| | |
| * | add urecip op to metal backend    Baye Dieng    2023-12-28    4    -3/+8
| |/
* | Merge pull request #1491 from mimiquate/metal-errors    Nicolas Patry    2023-12-29    1    -28/+42
|\ \
| |/
|/|   Improves metal's not implemented error messages
| * fixes error message    Gonzalo    2023-12-28    1    -1/+1
| |
| * cargo fmt    Gonzalo    2023-12-28    1    -7/+21
| |
| * Improves metal's not implemented error messages    Gonzalo    2023-12-28    1    -27/+27
| |
* | Fix lints for clippy 1.75. (#1494)    Laurent Mazare    2023-12-28    6    -40/+38
| |
* | add config_amazon_mistral_lite (#1493)    Daniel Clough    2023-12-28    1    -0/+18
|/
|
|   Co-authored-by: Ubuntu <danielclough@users.noreply.github.com>
* Bump the crate version to 0.3.3. (#1490)    Laurent Mazare    2023-12-28    21    -55/+55
|
* Add some mention to SOLAR-10.7B in the readme. (#1487)    Laurent Mazare    2023-12-27    1    -2/+3
|
* Rework the llama example config, add the solar model. (#1485)    Laurent Mazare    2023-12-26    1    -72/+36
|
* Use the new hub helper function. (#1484)    Laurent Mazare    2023-12-26    2    -16/+2
|
* Helper function to load sharded safetensors files (#1481)    Laurent Mazare    2023-12-25    7    -67/+40
|   * Fix the quantized mistral example.
|   * Add a helper function to load sharded safetensors weights.
|   * Use the sharded loader.
* Merge pull request #1479 from huggingface/upsample_metal    Nicolas Patry    2023-12-25    3    -2/+137
|\
| |
| |   Adding upsample_nearest_2d.
| * Adding upsample_nearest_2d.    Nicolas Patry    2023-12-25    3    -2/+137
|/