| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
* Clippy fixes.
* Bump the web_sys required version.
|
| |
|
| |
* Bump the version number to 0.5.1.
* Fix clippy lints for 1.78.
* More clippy fixes.
|
| |
* Metal quantized modifications proposal.
- Add a device param, wherever needed.
- Create new QMetal storage thing that implements QuantizedType.
- Update everywhere needed.
Fix Python.
Fixing examples.
Fix: fmt + clippy + stub.
Moving everything around.
Only missing the actual implems.
Fixing everything + adding dequantized kernels.
More work.
Fixing matmul.
Fmt + Clippy
Some clippy fixes.
Working state.
Q2K Metal -> Bugged (also present in GGML).
Q4K CPU -> Bugged (present previously, new test catches it).
Q5K CPU -> Bugged (present previously).
Q8_1 Both -> Never really implemented it seems
Q8K metal -> Never implemented in metal
Fixing Q2K bug (present in ggml).
* Cleanup.
* Fix the rebase.
* Removing the fences speeds everything up and *is* correct this time...
* Cleanup the fence.
* After rebase.
* Bad code removal.
* Rebase after phi2 merge + fix replit default to CPU.
* Making the CI happy.
* More happy tests.
---------
Co-authored-by: Nicolas Patry <nicolas@Nicolass-MacBook-Pro.local>
|
| |
Updates the requirements on [gloo](https://github.com/rustwasm/gloo) to permit the latest version.
- [Release notes](https://github.com/rustwasm/gloo/releases)
- [Changelog](https://github.com/rustwasm/gloo/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rustwasm/gloo/commits)
---
updated-dependencies:
- dependency-name: gloo
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
* Mixtral quantized instruct.
* Fix a couple typos.
|
|
|
|
|
| |
* Use the whisper-v3 tokenizer now that it has been added.
* Use the appropriate nospeech token.
|
|
|
|
| |
- clippy::needless-borrows-for-generic-args
- clippy::reserve-after-initialization
|
| |
|
|
|
|
|
| |
* Preliminary support for whisper v3.
* Add the missing files.
|
| |
|
| |
|
| |
* [Whisper] Update to use quantized model
* [whisper] add language detection
* [whisper] change assets location
* [whisper] adapt js example with quantized models
* [whisper] better task parsing
* [whisper] minor fixes
|
|
|
|
|
| |
* Bump the version to 0.3.0.
* Changelog update.
|
| |
|
|
|
|
|
| |
* Bump the crate version.
* Also update the python bindings.
|
| |
* fixes
* remove listener
* remove event listener
|
| |
|
| |
|
| |
* add stats
* random seed btn
* minor ui improvements
|
| |
|
| |
|
|
|
|
|
| |
* Add the kv-cache to the whisper wasm version.
* Improve the handling of special tokens.
|
| |
* wip add module and js worker example
* params
* clean up, send error
* final UI with whisper webworker
* add simple instructions
|
|
|
|
|
| |
* Add some documentation.
* Bump the crate version.
|
| |
* Add the dilation parameter.
* Restore the basic optimizer example.
* Dilation support in cudnn.
* Use the dilation parameter in the cpu backend.
* More dilation support.
* No support for dilation in transposed convolutions.
* Add dilation to a test.
* Remove a print.
* Helper function.
|
| |
* Remove some dead-code annotations.
* More dead code removal.
* One more.
* CI fix.
|
| |
|
| |
* Add some group parameter to convolutions.
* Avoid some unnecessary groups checks.
* Move the tensor convolution bits.
* Proper handling of groups.
* Bump the crate version.
* And add a changelog.
|
| |
|
| |
* Start adding the module trait.
* Use the module trait.
* Implement module for qmatmul.
|
| |
* Rename vec-dot to vec-ops.
* Also bump the crate version.
* Add a currently empty readme.
|
|
|
|
|
| |
* Add a cuda kernel for upsampling.
* Update for the latest tokenizers version.
|
| |
|
|
|
|
|
| |
* Switch to candle-gemm for the time being.
* Add the missing versions.
|
|
|
|
|
| |
* Rename to candle-core.
* More candle-core renaming.
|
|
|
|
|
| |
* Softmax numerical stability.
* Fix the flash-attn test.
|
| |
|
| |
|
| |
|
| |
- Add ms/token on llama2.c (15ms/token on my personal machine)
- Hide `Run` buttons while models are not ready
- Add dummy `progress` while weights are downloading (I briefly looked
at adding a real progress bar, but nothing easy enough came up.)
|
|
* Move the whisper example.
* More renaming.
* Add llama2 as a new wasm example.
* Live generation.
* More of the llama wasm example.
* Formatting.
|