Commit message  Author  Age  Files  Lines
* Make the Python Wrapper more Hackable and simplify Quantization (#1010)  Lukas Kreussel  2023-10-06  25  -182/+2426
|   * Some first `Module` implementations
|   * Add `state_dict` and `load_state_dict` functionality
|   * Move modules around and create `candle.nn.Linear`
|   * Add `nn.Embedding` and `nn.LayerNorm`
|   * Add BERT implementation
|   * Batch q-matmul
|   * Automatically dequantize `QTensors` if a `Tensor` is expected
|   * Add Module `.to()`, `.cuda()`, `cpu()` and `.type()` functionality
|   * Unittests for `Module`, `Tensor` and `candle.utils`
|   * Add `pytorch` like slicing to `Tensor`
|   * Cleanup and BERT fixes
|   * `black` formatting + unit-test for `nn.Linear`
|   * Refactor slicing implementation
* Sketch the stable-lm model. (#1045)  Laurent Mazare  2023-10-06  2  -0/+365
|
* Remove some todos. (#1042)  Laurent Mazare  2023-10-05  2  -6/+6
|
* Add the clamping for stable-diffusion. (#1041)  Laurent Mazare  2023-10-05  2  -4/+4
|
* feat: [SAM] able to download the result as png (#1035)  lichin-lin  2023-10-05  1  -0/+60
|   * feat: able to download the result as png
|   * feat: update function and wording
* Add the round-to function. (#1039)  Laurent Mazare  2023-10-05  2  -0/+18
|
* Delete invalid comment (#1038)  Juarez Bochi  2023-10-05  2  -4/+0
|
* fix: fix index_select cuda kernel for src target dim different than ids dim when selecting dim > 0 (#1037)  Gonzalo  2023-10-05  3  -8/+21
|   * fix: fix index_select cuda kernel for src target dim different than ids dim when selecting dim > 0
|   * cargo fmt
* Use AsRef<str> for set_one. (#1033)  Laurent Mazare  2023-10-05  1  -1/+1
|
* Quant t5: Add coedit model to wasm demo and readme (#1031)  Juarez Bochi  2023-10-04  3  -5/+70
|
* Whisper quantized wasm (#1028)  Radamés Ajna  2023-10-04  13  -599/+543
|   * [Whisper] Update to use quantized model
|   * [whisper] add language detection
|   * [whisper] change assets location
|   * [whisper] adapt js example with quantized models
|   * [whisper] better task parsing
|   * [whisper] minor fixes
* Add the rounding operators. (#1030)  Laurent Mazare  2023-10-04  6  -0/+157
|   * Add the rounding operators.
|   * Avoid tracking gradients for the rounding operations.
|   * Add some rounding tests.
* Add quantized t5 args for weight and config (#1029)  Juarez Bochi  2023-10-04  1  -9/+24
|
* Simd128 optimized q8k vecdot. (#1026)  Laurent Mazare  2023-10-03  3  -1/+34
|
* AVX optimized q8k vecdot. (#1024)  Laurent Mazare  2023-10-03  3  -0/+44
|
* Merge pull request #1023 from evgenyigumnov/simlified-book-polish  Nicolas Patry  2023-10-03  1  -2/+2
|\
| |   small misspeling and polish fix
| * small misspeling and polish fix  Evgeny Igumnov  2023-10-03  1  -2/+2
|/
* Fix for the index-select cuda setup. (#1022)  Laurent Mazare  2023-10-03  2  -1/+16
|   * Fix for index-select.
|   * Better fix + add some testing.
* Merge pull request #926 from evgenyigumnov/book-trainin-simplified  Nicolas Patry  2023-10-03  5  -2/+247
|\
| |   Book train simlified example
| * Fix include code.  Nicolas Patry  2023-10-02  1  -4/+2
| |
| * Fixed PR warnings.  Nicolas Patry  2023-10-02  3  -4/+6
| |
| * Merge branch 'main' into book-trainin-simplified  Evgeny Igumnov  2023-09-22  195  -1790/+14760
| |\
| * | https://github.com/huggingface/candle/issues/637  Evgeny Igumnov  2023-09-22  4  -0/+245
| | |
* | | [SAM] Add undo button and background point mode (#1020)  Radamés Ajna  2023-10-02  1  -61/+165
| | |   * [SAM] Add undo button and background point mode
| | |   * [SAM] remove pts on near clicks
| | |   * [SAM] check shiftKey toggle point mode
| | |   * [SAM] clear points when clearing image
* | | neon optimized q8k multiplication. (#1021)  Laurent Mazare  2023-10-02  2  -3/+36
| | |   * neon optimized q8k multiplication.
| | |   * Bugfixes.
| | |   * simdification.
* | | Add the q8k vec-dot multiplication. (#1019)  Laurent Mazare  2023-10-02  2  -2/+46
| | |
* | | Improve the quantized whisper setup. (#1018)  Laurent Mazare  2023-10-02  8  -49/+66
| | |   * Improve the quantized whisper setup.
| | |   * Fix the config file paths.
| | |   * Use the standard matmul where possible.
* | | Add a quantized variant of whisper (#1017)  Laurent Mazare  2023-10-02  5  -62/+519
| | |   * Add the quantized-whisper model.
| | |   * Quantized the whisper model.
| | |   * Adapt the whisper example to handle quantization.
| | |   * Add the quantized flag.
| | |   * Load the proper weights.
* | | Improve the testing of the optimized quantized vec-dot ops (#1016)  Laurent Mazare  2023-10-02  3  -5/+75
| | |   * Expose the unopt functions for testing.
| | |   * Better testing of the optimized quantized computations.
* | | Simd128 version of q6k vec-dot. (#1015)  Laurent Mazare  2023-10-01  2  -1/+127
| | |   * Add a specific function for the simd128 q6k vec-dot.
| | |   * Simdification.
| | |   * More simdification.
* | | [segment-anything] add multi point logic for demo site (#1002)  lichin-lin  2023-10-01  3  -14/+29
| | |   * [segment-anything] add multi point logic for demo site
| | |   * [segment-anything] remove libs and update functions
* | | Bump the version to 0.3.0. (#1014)  Laurent Mazare  2023-10-01  20  -65/+71
| | |   * Bump the version to 0.3.0.
| | |   * Changelog update.
* | | Fix the prompt for mistral when using instruct/interactive mode. (#1013)  Laurent Mazare  2023-10-01  1  -12/+31
| | |
* | | Integrate TheBloke quantized mistral weights. (#1012)  Laurent Mazare  2023-09-30  1  -2/+26
| | |
* | | Simd128 version of the q2k-q8k vecdot product. (#1011)  Laurent Mazare  2023-09-30  4  -47/+77
| | |   * Sketch the simd128 version of q2k vecdot.
| | |   * Use a single accumulator.
| | |   * Simdify the q2k-q8k vecdot product.
| | |   * Cosmetic change.
* | | Quantized version of mistral. (#1009)  Laurent Mazare  2023-09-30  7  -37/+507
| | |   * Quantized version of mistral.
| | |   * Integrate the quantized mistral variant.
| | |   * Use the quantized weight files.
| | |   * Tweak the quantization command.
| | |   * Fix the dtype when computing the rotary embeddings.
| | |   * Update the readme with the quantized version.
| | |   * Fix the decoding of the remaining tokens.
* | | Streaming mode for reporting the generated tokens (#1007)  Laurent Mazare  2023-09-30  4  -11/+96
| | |   * Token streaming.
| | |   * Use the token output stream.
| | |   * Flush the output.
| | |   * Ensure that the last characters get reported.
* | | Use flash-attn for mistral. (#1004)  Laurent Mazare  2023-09-30  2  -9/+41
| | |
* | | Mistral: exit on eos token. (#1001)  Laurent Mazare  2023-09-30  2  -10/+17
| | |   * Mistral: exit on eos token.
| | |   * Print the proper stats.
| | |   * Also add a short flag.
* | | Add negative prompts to segment-anything. (#1000)  Laurent Mazare  2023-09-30  3  -18/+34
| | |
* | | [segment-anything] Print IOU values to help with debugging (#999)  GeauxEric  2023-09-30  1  -1/+1
| | |
* | | Fix the multiple points case for sam. (#998)  Laurent Mazare  2023-09-29  1  -2/+2
| | |
* | | Add an entry about WSL slowness to the faq. (#997)  Laurent Mazare  2023-09-29  1  -0/+5
| | |
* | | fix: add missing gpu fill_* (#996)  Gonzalo  2023-09-29  2  -0/+35
| | |
* | | Update mistral README.md (#995)  Laurent Mazare  2023-09-29  1  -2/+5
| | |
* | | Mistral readme (#994)  Laurent Mazare  2023-09-29  2  -0/+40
| | |   * Mistral: print the generated text.
| | |   * Add mistral to the readmes.
* | | Mistral: print the generated text. (#992)  Laurent Mazare  2023-09-29  1  -4/+4
| | |
* | | fixes slice_scatter dim type (#988)  Gonzalo  2023-09-29  1  -1/+1
| | |
* | | Use a silu activation in mistral. (#991)  Laurent Mazare  2023-09-29  2  -1/+5
| | |
* | | Add the sliding window. (#986)  Laurent Mazare  2023-09-28  1  -2/+9
| | |