path: root/candle-pyo3
Commit message | Author | Age | Files | Lines
* Lint fixes introduced with Rust 1.83 (#2646) | Anubhab Bandyopadhyay | 2024-11-28 | 1 | -1/+1
    * Fixes for lint errors introduced with Rust 1.83
    * rustfmt
    * Fix more lints.
    ---------
    Co-authored-by: Laurent <laurent.mazare@gmail.com>
* Remove some unused macros. (#2618) | Laurent Mazare | 2024-11-15 | 1 | -1/+1
    * Remove some unused macros.
    * More unused fixes.
* pyo3 update. (#2545) | Laurent Mazare | 2024-10-06 | 4 | -25/+20
    * pyo3 update.
    * Stub fix.
* Update for pyo3 0.21. (#1985) | Laurent Mazare | 2024-04-01 | 5 | -47/+70
    * Update for pyo3 0.21.
    * Also adapt the RL example.
    * Fix for the pyo3-onnx bindings...
    * Print details on failures.
    * Revert pyi.
* Expose candle gather op in pyo3. (#1870) | Laurent Mazare | 2024-03-18 | 2 | -0/+12
* Detach the tensors on batch-norm eval. (#1702) | Laurent Mazare | 2024-02-13 | 7 | -15/+90
    * Detach the tensors on batch-norm eval.
    * Fix pyo3 bindings.
    * Black tweak.
    * Formatting.
    * Also update the pyo3-onnx formatting.
    * Apply black.
* Quantized GGUF style (#1523) | Nicolas Patry | 2024-01-17 | 2 | -22/+37
    * Metal quantized modifications proposal.
      - Add a device param, wherever needed.
      - Create new QMetal storage thing that implements QuantizedType.
      - Update everywhere needed.
      Fix Python.
      Fixing examples.
      Fix: fmt + clippy + stub.
      Moving everything around.
      Only missing the actual implems.
      Fixing everything + adding dequantized kernels.
      More work.
      Fixing matmul.
      Fmt + Clippy
      Some clippy fixes.
      Working state.
      Q2K Metal -> Bugged (also present in GGML).
      Q4K CPU -> Bugged (present previously, new test catch it).
      Q5K CPU -> Bugged (present previously).
      Q8_1 Both -> Never really implemented it seems
      Q8K metal -> Never implemented in metal
      Fixing Q2K bug (present in ggml).
    * Cleanup.
    * Fix the rebase.
    * Removing the fences speeds everything up and *is* correct this time...
    * Cleanup the fence.
    * After rebase.
    * Bad code removal.
    * Rebase after phi2 merge + fix replit default to CPU.
    * Making the CI happy.
    * More happy tests.
    ---------
    Co-authored-by: Nicolas Patry <nicolas@Nicolass-MacBook-Pro.local>
* Simplifying our internal cargo dependencies. (#1529) | Nicolas Patry | 2024-01-07 | 1 | -3/+3
* Bump the crate version to 0.3.3. (#1490) | Laurent Mazare | 2023-12-28 | 1 | -3/+3
* Bump the crate version to 0.3.2. (#1452) | Laurent Mazare | 2023-12-17 | 1 | -3/+3
* Fix a couple typos (#1451) | Laurent Mazare | 2023-12-17 | 6 | -18/+18
    * Mixtral quantized instruct.
    * Fix a couple typos.
* Implement the module trait directly for QMatMul. (#1372) | Laurent Mazare | 2023-11-25 | 1 | -1/+1
* Update for 0.3.1. (#1324) | Laurent Mazare | 2023-11-11 | 1 | -3/+3
* Metal part 1 - Scaffolding for metal. (#1308) | Nicolas Patry | 2023-11-10 | 1 | -0/+13
    * Metal part 1 - Scaffolding for metal.
    * Remove tracing.
* PyO3: Add optional `candle.onnx` module (#1282) | Lukas Kreussel | 2023-11-08 | 6 | -3/+334
    * Start onnx integration
    * Merge remote-tracking branch 'upstream/main' into feat/pyo3-onnx
    * Implement ONNXModel
    * `fmt`
    * add `onnx` flag to python ci
    * Pin `protoc` to `25.0`
    * Setup `protoc` in wheel builds
    * Build wheels with `onnx`
    * Install `protoc` in manylinux containers
    * `apt` -> `yum`
    * Download `protoc` via bash script
    * Back to `manylinux: auto`
    * Disable `onnx` builds for linux
* PyO3: Add `equal` and `__richcmp__` to `candle.Tensor` (#1099) | Lukas Kreussel | 2023-10-30 | 6 | -2/+324
    * add `equal` to tensor
    * add `__richcmp__` support for tensors and scalars
    * typo
    * more typos
    * Add `abs` + `candle.testing`
    * remove duplicated `broadcast_shape_binary_op`
    * `candle.i16` => `candle.i64`
    * `tensor.nelements` -> `tensor.nelement`
    * Cleanup `abs`
* PyO3: Better shape handling (#1143) | Lukas Kreussel | 2023-10-29 | 9 | -49/+180
    * Negative and `*args` shape handling
    * Rename to `PyShapeWithHole` + validate that only one hole exists
    * Regenerate stubs
    ---------
    Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
* PyO3: Add CI to build & upload wheels as artifacts. (#1215) | Lukas Kreussel | 2023-10-29 | 1 | -1/+1
    * Add maturin ci
    * fix paths
    * Change sdist path
* convert pytorch's tensor in Python API (#1172) | andrew | 2023-10-25 | 3 | -0/+43
    * convert pytorch's tensor
    * separate tests for convert pytorch tensor
* Add support for accelerate in the pyo3 bindings. (#1167) | Laurent Mazare | 2023-10-24 | 3 | -1/+11
* PyO3: Add `mkl` support (#1159) | Lukas Kreussel | 2023-10-23 | 3 | -12/+41
    * Add `mkl` support
    * Set `mkl` path on linux
* PyO3: Add CI (#1135) | Lukas Kreussel | 2023-10-20 | 1 | -3/+4
    * Add PyO3 ci
    * Update python.yml
    * Format `bert.py`
* PyO3: Add `None` and `Tensor` indexing to `candle.Tensor` (#1098) | Lukas Kreussel | 2023-10-20 | 2 | -32/+132
    * Add proper `None` and `tensor` indexing
    * Allow indexing via lists + allow tensor/list indexing outside of first dimension
* PyO3: Add pytorch like `.to()` operator to `candle.Tensor` (#1100) | Lukas Kreussel | 2023-10-19 | 3 | -0/+176
    * add `.to()` operator
    * Only allow each value to be provided once via `args` or `kwargs`
* Extend `stub.py` to accept external typehinting (#1102) | Lukas Kreussel | 2023-10-17 | 7 | -4/+146
* Always broadcast magic methods (#1101) | Lukas Kreussel | 2023-10-17 | 2 | -4/+77
* Add the pooling operators to the pyo3 layer. (#1086) | Laurent Mazare | 2023-10-13 | 3 | -0/+40
* Use an attention mask in the e5 padding case. (#1085) | Laurent Mazare | 2023-10-13 | 2 | -11/+26
* Typos. (#1084) | Laurent Mazare | 2023-10-13 | 1 | -2/+2
* Make the Python Wrapper more Hackable and simplify Quantization (#1010) | Lukas Kreussel | 2023-10-06 | 24 | -182/+2415
    * Some first `Module` implementations
    * Add `state_dict` and `load_state_dict` functionality
    * Move modules around and create `candle.nn.Linear`
    * Add `nn.Embedding` and `nn.LayerNorm`
    * Add BERT implementation
    * Batch q-matmul
    * Automatically dequantize `QTensors` if a `Tensor` is expected
    * Add Module `.to()`, `.cuda()`, `cpu()` and `.type()` functionality
    * Unittests for `Module`, `Tensor` and `candle.utils`
    * Add `pytorch` like slicing to `Tensor`
    * Cleanup and BERT fixes
    * `black` formatting + unit-test for `nn.Linear`
    * Refactor slicing implementation
* Improve the quantized whisper setup. (#1018) | Laurent Mazare | 2023-10-02 | 1 | -1/+1
    * Improve the quantized whisper setup.
    * Fix the config file paths.
    * Use the standard matmul where possible.
* Bump the version to 0.3.0. (#1014) | Laurent Mazare | 2023-10-01 | 1 | -2/+2
    * Bump the version to 0.3.0.
    * Changelog update.
* Bump the crate versions to v0.2.3. (#886) | Laurent Mazare | 2023-09-18 | 2 | -5/+5
    * Bump the crate version.
    * Also update the python bindings.
* Add return types to `*.pyi` stubs (#880) | Lukas Kreussel | 2023-09-17 | 9 | -160/+574
    * Start generating return types
    * Finish tensor type hinting
    * Add `save_gguf` to `utils`
    * Typehint `quant-llama.py`
* Generate `*.pyi` stubs for PyO3 wrapper (#870) | Lukas Kreussel | 2023-09-16 | 15 | -40/+857
    * Begin to generate typehints.
    * generate correct stubs
    * Correctly include stubs
    * Add comments and typhints to static functions
    * ensure candle-pyo3 directory
    * Make `llama.rope.freq_base` optional
    * `fmt`
* Bump the crate version + update the changelog. (#822) | Laurent Mazare | 2023-09-12 | 1 | -2/+2
* Add a python function to save as safetensors. (#740) | Laurent Mazare | 2023-09-04 | 1 | -1/+13
* Return the metadata in the gguf pyo3 bindings. (#729) | Laurent Mazare | 2023-09-04 | 2 | -8/+72
    * Return the metadata in the gguf pyo3 bindings.
    * Read the metadata in the quantized llama example.
    * Get inference to work on gguf files.
* Handle arbitrary shapes in Tensor::new. (#718) | Laurent Mazare | 2023-09-02 | 1 | -5/+20
* Recommend using maturin. (#717) | Laurent Mazare | 2023-09-02 | 3 | -37/+4
* More quantized llama in python. (#716) | Laurent Mazare | 2023-09-02 | 2 | -11/+64
    * More quantized llama in python.
    * Expose a couple more functions.
    * Apply the last layer.
    * Use the vocab from the ggml files.
* Sketch a quantized llama using the pyo3 api. (#715) | Laurent Mazare | 2023-09-02 | 3 | -6/+277
    * Sketch a quantized llama using the pyo3 api.
    * Add more ops.
    * Expose a few more functions to use in the quantized model.
    * Rope embeddings.
    * Get the forward pass to work.
* Add some quantized functions to pyo3. (#708) | Laurent Mazare | 2023-09-01 | 1 | -1/+44
* Support for quantized tensors in the python api. (#706) | Laurent Mazare | 2023-09-01 | 2 | -4/+161
    * Add more pyo3 support.
    * Add some support for quantized tensors in pyo3.
    * Add an arc layer on qmatmul.
    * Add the quantized matmul.
    * Quantization support.
    * More quantization support.
    * Test the python quantization.
* Cleanup the pyo3 setup. (#705) | Laurent Mazare | 2023-09-01 | 1 | -0/+1
* Add some documentation. (#673) | Laurent Mazare | 2023-08-30 | 1 | -1/+1
    * Add some documentation.
    * Bump the crate version.
* Bump the crate version + update CHANGELOG. (#628) | Laurent Mazare | 2023-08-27 | 1 | -1/+1
* Fixes for clippy 1.72. (#587) | Laurent Mazare | 2023-08-24 | 1 | -0/+1
* Add some group parameter to convolutions. (#566) | Laurent Mazare | 2023-08-23 | 1 | -1/+1
    * Add some group parameter to convolutions.
    * Avoid some unnecessary groups checks.
    * Move the tensor convolution bits.
    * Properh handling of groups.
    * Bump the crate version.
    * And add a changelog.
* Add support for i64 (#563) | Laurent Mazare | 2023-08-23 | 1 | -0/+6
    * Add the i64 dtype.
    * Adapt the cuda kernels.