forks/candle.git

	Commit message	Author	Age	Files	Lines
*	Implement the module trait directly for QMatMul. (#1372)	Laurent Mazare	2023-11-25	1	-6/+5
*	Quantized version of mistral. (#1009)	Laurent Mazare	2023-09-30	1	-9/+27
	* Quantized version of mistral.
	* Integrate the quantized mistral variant.
	* Use the quantized weight files.
	* Tweak the quantization command.
	* Fix the dtype when computing the rotary embeddings.
	* Update the readme with the quantized version.
	* Fix the decoding of the remaining tokens.
*	Use yoke to provide a self-referential container for mmaped safetenso… (#939)	Laurent Mazare	2023-09-23	1	-2/+1
	* Use yoke to provide a self-referential container for mmaped safetensor files.
	* Add the new self-owned type for safetensor files without removing the previous version.
	* Add routing.
	* Add an initializer for the case of multiple files.
*	Use the proper block size for quantizing models. (#933)	Laurent Mazare	2023-09-22	1	-2/+17
	* Use the proper block size for quantizing models.
	* Use the proper dimension.
*	T5 quantized example (#922)	Laurent Mazare	2023-09-21	1	-0/+53
	* Load gguf files for the quantized t5.
	* Add the quantized t5 example.
	* Allow for loading local files.
	* Add some support for quantizing safetensor files.
	* Transpose before quantizing.
	* Quantized t5.
	* Retrieve the weights from the hub.
*	Add a custom softmax implementation. (#744)	Laurent Mazare	2023-09-05	1	-166/+0
	* Add a custom softmax implementation.
	* Add softmaxlastdim to the benchmarks.
	* And add a test.
	* Support more dtypes.
	* Polish the code.
	* Use the slow implementation on cuda.
	* Add a todo for the cuda kernel.
*	Dilated convolutions (#657)	Laurent Mazare	2023-08-29	3	-6/+6
	* Add the dilation parameter.
	* Restore the basic optimizer example.
	* Dilation support in cudnn.
	* Use the dilation parameter in the cpu backend.
	* More dilation support.
	* No support for dilation in transposed convolutions.
	* Add dilation to a test.
	* Remove a print.
	* Helper function.
*	Llama quantization. (#625)	Laurent Mazare	2023-08-27	1	-15/+75
*	Add the quantize command. (#624)	Laurent Mazare	2023-08-27	1	-1/+75
	* Add the quantize command.
	* Bugfix for writing gguf files.
	* And add a comment.
*	More pickle support. (#588)	Laurent Mazare	2023-08-24	1	-1/+1
	* More pickle support.
	* Be more verbose.
*	Add to the cuda example a reproduction of the issue. (#579)	Laurent Mazare	2023-08-24	1	-2/+11
	* Add to the cuda example a reproduction of the issue.
	* Tweak.
	* Add a test using non-square matrixes.
	* Fix the conv2d kernel.
	* Display the error.
	* And tweak the comment.
*	Add a test for conv2d with padding + bugfix the random number generation on cuda. (#578)	Laurent Mazare	2023-08-24	1	-0/+3
	* Add a test for conv2d with padding.
	* Cosmetic changes.
	* Bugfix the rand function on the cuda backend.
*	Add some group parameter to convolutions. (#566)	Laurent Mazare	2023-08-23	3	-4/+4
	* Add some group parameter to convolutions.
	* Avoid some unnecessary groups checks.
	* Move the tensor convolution bits.
	* Proper handling of groups.
	* Bump the crate version.
	* And add a changelog.
*	Handle GGUF files in tensor-tools. (#558)	Laurent Mazare	2023-08-23	1	-1/+20
*	Small tweaks to tensor-tools. (#517)	Laurent Mazare	2023-08-19	1	-9/+15
*	Retrieve tensor data from PyTorch files. (#516)	Laurent Mazare	2023-08-19	1	-5/+7
*	Retrieve more information from PyTorch checkpoints. (#515)	Laurent Mazare	2023-08-19	1	-3/+9
	* Retrieve more information from PyTorch checkpoints.
	* Add enough support to load dino-v2 backbone weights.
*	Add ggml support to tensor-tools (#512)	Laurent Mazare	2023-08-19	1	-15/+59
	* Pickle work-in-progress.
	* More unpickling.
	* More pickling.
	* Proper handling of setitems.
	* Clippy.
	* Again more pickling.
	* Restore the example.
	* Add enough pickle support to get the list of tensors.
	* Read the data from zip files.
	* Retrieve the tensor shape.
	* Extract the size and dtype.
	* More storage types.
	* Improve the destructuring.
	* Also support ggml files.
*	Preliminary support for importing PyTorch weights. (#511)	Laurent Mazare	2023-08-19	1	-0/+16
	* Pickle work-in-progress.
	* More unpickling.
	* More pickling.
	* Proper handling of setitems.
	* Clippy.
	* Again more pickling.
	* Restore the example.
	* Add enough pickle support to get the list of tensors.
	* Read the data from zip files.
	* Retrieve the tensor shape.
	* Extract the size and dtype.
	* More storage types.
	* Improve the destructuring.
*	Add the tensor-tools binary. (#510)	Laurent Mazare	2023-08-19	1	-0/+72
*	Tensor -> QTensor conversion (#496)	Laurent Mazare	2023-08-18	1	-1/+1
	* Sketch some qmatmul test.
	* Add the quantization function.
	* More testing.
	* Make the test smaller and faster.
	* Add some shape checking.
*	AVX version of the vecdot for q4_0. (#474)	Laurent Mazare	2023-08-17	1	-0/+24
	* AVX version of the vecdot for q4_0.
	* Tweak the avx bits.
	* Add a qmatmul benchmark.
	* Fix the quantized test.
*	Cudnn support (#445)	Laurent Mazare	2023-08-14	1	-5/+4
	* Add a cudnn feature to be used for conv2d.
	* Allocate the proper workspace.
	* Only create a single cudnn handle per cuda device.
	* Proper cudnn usage.
	* Bugfix.
*	Add a softmax bench. (#433)	Laurent Mazare	2023-08-13	1	-1/+29
	* Add a softmax bench.
	* Add the vectorized sum reduce.
*	Add a matmul benchmark. (#429)	Laurent Mazare	2023-08-13	1	-0/+19
*	More accelerate optimizations (#427)	Laurent Mazare	2023-08-13	2	-0/+6
	* Add more tracing to the whisper example.
	* Support accelerate in more examples.
	* Use accelerate for pointwise functions.
	* Use accelerate for binary operations too.
	* Bugfix for binary operation: use the rhs before the lhs.
*	Small example for benchmarking some cpu ops (#394)	Laurent Mazare	2023-08-10	2	-24/+95
	* Refactor the benchmark example.
	* Rename the example.
	* Add some comments.
*	Add a conv1d benchmark based on the whisper sizes. (#377)	Laurent Mazare	2023-08-09	1	-0/+24
	* Add a conv1d benchmark based on the whisper sizes.
	* Enforce the batch-dim in conv1d.
*	Add some conv1d test + bugfix using padding. (#349)	Laurent Mazare	2023-08-08	1	-20/+6
*	Support the Accelerate BLAS on macOS. (#325)	Laurent Mazare	2023-08-05	1	-0/+3
	* Add the accelerate feature.
	* Ffi tweaks.
*	Rename the candle crate to candle-core (#301)	Laurent Mazare	2023-08-02	3	-3/+3
	* Rename to candle-core.
	* More candle-core renaming.
*	Simplify Tensor::randn. (#255)	Laurent Mazare	2023-07-27	1	-0/+5
	* Simplify Tensor::randn.
	* Also switch Tensor::rand to use a generic dtype.
	* Support sampling for f16.
	* Cleanup.
*	Simplify the parameters used by sum and sum_keepdim. (#165)	Laurent Mazare	2023-07-14	2	-6/+6
*	Use the same default as pytorch for sum. (#164)	Laurent Mazare	2023-07-13	2	-10/+10
*	Sketch a fast cuda kernel for reduce-sum. (#109)	Laurent Mazare	2023-07-08	1	-0/+15
	* Sketch a fast cuda kernel for reduce-sum.
	* Sketch the rust support code for the fast sum kernel.
	* More work on the fast kernel.
	* Add some testing ground.
	* A couple fixes for the fast sum kernel.
*	Add some very simple sum benchmark. (#108)	Laurent Mazare	2023-07-08	2	-34/+51
	* Add some very simple sum benchmark.
	* Rename the file.
*	Add mkl support for matrix multiply. (#86)	Laurent Mazare	2023-07-06	2	-0/+6
	* Fix some rebase issues.
	* Use mkl instead.
	* Use mkl in bert.
	* Add the optional mkl feature.
	* Conditional compilation based on the mkl feature.
	* Add more mkl support.
*	Move llama in a cargo-examples directory.	laurent	2023-07-03	4	-912/+0
*	Adding a bit more docs around safety.	Nicolas Patry	2023-07-03	1	-1/+1
*	Move more safetensors bits to the shared module.	laurent	2023-07-03	1	-16/+8
*	Move some safetensors bits in the candle-core crate.	laurent	2023-07-03	1	-31/+2
*	Add a flag for custom prompt.	laurent	2023-07-01	1	-2/+7
*	Early conversion for the llama weights.	laurent	2023-06-30	2	-45/+19
*	Add a const to easily tweak the dtype used for llama internal computations.	laurent	2023-06-30	1	-4/+8
*	Tweak the kv-cache flag.	laurent	2023-06-29	1	-4/+4
*	Add a flag.	laurent	2023-06-29	1	-6/+11
*	Enable the KV cache after fixing the caching length and the rope bits.	laurent	2023-06-29	1	-14/+21
*	Only narrow when needed + deactivate the kv cache.	laurent	2023-06-29	1	-2/+6
*	Add some KV cache to llama.	laurent	2023-06-29	1	-36/+72
*	Typo.	Nicolas Patry	2023-06-29	1	-1/+1