| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Some first `Module` implementations
* Add `state_dict` and `load_state_dict` functionality
* Move modules around and create `candle.nn.Linear`
* Add `nn.Embedding` and `nn.LayerNorm`
* Add BERT implementation
* Batch q-matmul
* Automatically dequantize `QTensors` if a `Tensor` is expected
* Add Module `.to()`, `.cuda()`, `cpu()` and `.type()` functionality
* Unittests for `Module`, `Tensor` and `candle.utils`
* Add `pytorch` like slicing to `Tensor`
* Cleanup and BERT fixes
* `black` formatting + unit-test for `nn.Linear`
* Refactor slicing implementation
|
|
|
|
|
|
|
|
|
| |
* Start generating return types
* Finish tensor type hinting
* Add `save_gguf` to `utils`
* Typehint `quant-llama.py`
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Begin to generate typehints.
* generate correct stubs
* Correctly include stubs
* Add comments and typhints to static functions
* ensure candle-pyo3 directory
* Make `llama.rope.freq_base` optional
* `fmt`
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add more pyo3 support.
* Add some support for quantized tensors in pyo3.
* Add an arc layer on qmatmul.
* Add the quantized matmul.
* Quantization support.
* More quantization support.
* Test the python quantization.
|
|
|
|
|
| |
* Better handling of dtypes in pyo3.
* More pyo3 dtype.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|