| Commit message | Author | Age | Files | Lines |
* Use the tokenizer-output-stream in the llama example.
* Also use tokenizer-output-stream for llama2-c.
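The point of a tokenizer output stream is to print generated text incrementally, emitting only the newly decoded suffix after each token rather than re-printing everything. A minimal sketch of that idea, with a toy closure standing in for a real tokenizer's decode (the struct and method names here are illustrative, not the candle API):

```rust
// Minimal sketch of a token-output-stream: keep the tokens seen so far,
// re-decode, and emit only the part of the text that is new since the last
// call. A real implementation also has to be careful about incomplete UTF-8
// sequences at the end of the decoded text.
struct TokenOutputStream {
    tokens: Vec<u32>,
    prev_len: usize, // length (in bytes) of the text already emitted
}

impl TokenOutputStream {
    fn new() -> Self {
        Self { tokens: Vec::new(), prev_len: 0 }
    }

    /// Push a token and return the freshly decoded text, if any.
    fn next_token(&mut self, token: u32, decode: impl Fn(&[u32]) -> String) -> Option<String> {
        self.tokens.push(token);
        let text = decode(&self.tokens);
        if text.len() > self.prev_len {
            let new = text[self.prev_len..].to_string();
            self.prev_len = text.len();
            Some(new)
        } else {
            None
        }
    }
}

fn main() {
    // Toy vocabulary: token id -> string piece.
    let vocab = ["Hello", ",", " world", "!"];
    let decode = |ids: &[u32]| ids.iter().map(|&i| vocab[i as usize]).collect::<String>();
    let mut stream = TokenOutputStream::new();
    let mut out = String::new();
    for &t in &[0u32, 1, 2, 3] {
        if let Some(piece) = stream.next_token(t, &decode) {
            out.push_str(&piece); // in an example binary this would be printed
        }
    }
    assert_eq!(out, "Hello, world!");
    println!("{out}");
}
```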
* Metal quantized modifications proposal.
- Add a device param, wherever needed.
- Create a new QMetal storage type that implements QuantizedType.
- Update everywhere needed.
Fix Python.
Fixing examples.
Fix: fmt + clippy + stub.
Moving everything around.
Only missing the actual implementations.
Fixing everything + adding dequantize kernels.
More work.
Fixing matmul.
Fmt + clippy.
Some clippy fixes.
Working state.
Q2K Metal -> Bugged (also present in GGML).
Q4K CPU -> Bugged (present previously, the new test catches it).
Q5K CPU -> Bugged (present previously).
Q8_1 Both -> Never really implemented, it seems.
Q8K Metal -> Never implemented in Metal.
Fixing the Q2K bug (present in GGML).
* Cleanup.
* Fix the rebase.
* Removing the fences speeds everything up and *is* correct this time...
* Clean up the fence.
* After rebase.
* Bad code removal.
* Rebase after the phi2 merge + fix the replit default to CPU.
* Making the CI happy.
* More happy tests.
---------
Co-authored-by: Nicolas Patry <nicolas@Nicolass-MacBook-Pro.local>
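For context on what the dequantize kernels above compute, here is a sketch of one of the simpler formats in the GGML family, Q8_0-style: 32-element blocks, one f32 scale per block, one i8 per value. This illustrates the quantize/dequantize round trip only; the candle types mentioned in the commit (QMetal, QuantizedType) are not reproduced here.

```rust
// Sketch of Q8_0-style block quantization from the GGML family of formats:
// split the tensor into blocks of 32 values, store one f32 scale per block
// (max-abs / 127) and one i8 per value. Dequantize is value * scale.
const BLOCK: usize = 32;

fn quantize_q8_0(xs: &[f32]) -> Vec<(f32, [i8; BLOCK])> {
    xs.chunks(BLOCK)
        .map(|chunk| {
            let amax = chunk.iter().fold(0f32, |m, x| m.max(x.abs()));
            let scale = amax / 127.0;
            let inv = if scale == 0.0 { 0.0 } else { 1.0 / scale };
            let mut qs = [0i8; BLOCK];
            for (q, x) in qs.iter_mut().zip(chunk) {
                *q = (x * inv).round() as i8;
            }
            (scale, qs)
        })
        .collect()
}

fn dequantize_q8_0(blocks: &[(f32, [i8; BLOCK])]) -> Vec<f32> {
    blocks
        .iter()
        .flat_map(|(scale, qs)| qs.iter().map(move |&q| q as f32 * scale))
        .collect()
}

fn main() {
    let xs: Vec<f32> = (0..64).map(|i| (i as f32 - 32.0) / 8.0).collect();
    let ys = dequantize_q8_0(&quantize_q8_0(&xs));
    // Round-tripping loses at most half a quantization step per value.
    for (x, y) in xs.iter().zip(&ys) {
        assert!((x - y).abs() <= 0.5 * 4.0 / 127.0 + 1e-6);
    }
    println!("round-trip ok over {} values", ys.len());
}
```

The per-block scale is what makes the format robust to outliers in one part of a tensor; it is also why a matmul kernel has to carry the scale alongside the int8 payload.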
* Add a quantized variant of llama2.c
* Clippy fixes.
* Implement top_p / nucleus sampling
* Update changelog
* rustfmt
* Add tests
* Fix clippy warning
* Fix another clippy error
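Top-p (nucleus) sampling keeps the smallest set of tokens whose cumulative probability reaches `top_p`, renormalizes over that set, and samples from it. A self-contained sketch of the algorithm (not the candle `LogitsProcessor` code); the uniform draw `u` is passed in as a parameter so the example stays deterministic:

```rust
// Sketch of top-p (nucleus) sampling. `probs` must sum to ~1; `u` is a
// uniform draw in [0, 1), supplied by the caller instead of an RNG here.
fn sample_top_p(probs: &[f32], top_p: f32, u: f32) -> usize {
    // Sort token indices by probability, highest first.
    let mut idx: Vec<usize> = (0..probs.len()).collect();
    idx.sort_by(|&a, &b| probs[b].partial_cmp(&probs[a]).unwrap());

    // Truncate to the nucleus: the smallest prefix with cumulative mass >= top_p.
    let mut cum = 0.0;
    let mut keep = idx.len();
    for (i, &t) in idx.iter().enumerate() {
        cum += probs[t];
        if cum >= top_p {
            keep = i + 1;
            break;
        }
    }
    let nucleus = &idx[..keep];

    // Renormalize over the nucleus and invert the CDF at `u`.
    let total: f32 = nucleus.iter().map(|&t| probs[t]).sum();
    let mut acc = 0.0;
    for &t in nucleus {
        acc += probs[t] / total;
        if u < acc {
            return t;
        }
    }
    nucleus[nucleus.len() - 1]
}

fn main() {
    let probs = [0.1, 0.5, 0.3, 0.1];
    // With top_p = 0.7 the nucleus is tokens {1, 2} (0.5 + 0.3 >= 0.7).
    assert_eq!(sample_top_p(&probs, 0.7, 0.0), 1);
    // u = 0.7 falls past 0.5 / 0.8 = 0.625, so token 2 is drawn.
    assert_eq!(sample_top_p(&probs, 0.7, 0.7), 2);
    println!("nucleus sampling ok");
}
```

Low-probability tail tokens can never be drawn, which is the point: it cuts off the degenerate samples that plain temperature sampling occasionally produces.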
* Add a repeat penalty to the llama2-c command-line example.
* Another fix attempt.
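The usual form of a repeat penalty (the CTRL-style repetition penalty common in llama2.c-style samplers) divides positive logits and multiplies negative ones by a factor > 1 for every token already in the recent context. A sketch of that rule, not the exact candle code:

```rust
// Sketch of a CTRL-style repeat penalty: tokens already present in the
// context window get their logit pushed toward "less likely". Dividing a
// positive logit and multiplying a negative one both shrink its softmax
// probability when penalty > 1.0.
fn apply_repeat_penalty(logits: &mut [f32], penalty: f32, context: &[u32]) {
    for &tok in context {
        if let Some(l) = logits.get_mut(tok as usize) {
            *l = if *l >= 0.0 { *l / penalty } else { *l * penalty };
        }
    }
}

fn main() {
    let mut logits = vec![2.0, -1.0, 0.5];
    // Tokens 0 and 1 already appeared; token 2 is untouched.
    apply_repeat_penalty(&mut logits, 2.0, &[0, 1]);
    assert_eq!(logits, vec![1.0, -2.0, 0.5]);
    println!("{logits:?}");
}
```

A real implementation would deduplicate the context first, so a token that appears several times in the window is not penalized more than once per step.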
* Start adding the module trait.
* Use the module trait.
* Implement module for qmatmul.
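The value of a `Module` trait is a single `forward` entry point that every layer implements, so layers compose uniformly (which is what makes implementing it for `QMatMul` useful). A toy sketch of the pattern, with `Vec<f32>` standing in for a tensor and no error handling; the layer names below are illustrative:

```rust
// Sketch of a Module trait: one `forward` method that all layers share.
trait Module {
    fn forward(&self, xs: &[f32]) -> Vec<f32>;
}

/// Elementwise scale, a stand-in for a layer with parameters.
struct Scale(f32);

impl Module for Scale {
    fn forward(&self, xs: &[f32]) -> Vec<f32> {
        xs.iter().map(|x| x * self.0).collect()
    }
}

/// ReLU, a stand-in for a parameter-free activation.
struct Relu;

impl Module for Relu {
    fn forward(&self, xs: &[f32]) -> Vec<f32> {
        xs.iter().map(|x| x.max(0.0)).collect()
    }
}

/// Because every layer is a `Module`, a sequential container is trivial.
struct Sequential(Vec<Box<dyn Module>>);

impl Module for Sequential {
    fn forward(&self, xs: &[f32]) -> Vec<f32> {
        self.0.iter().fold(xs.to_vec(), |acc, m| m.forward(&acc))
    }
}

fn main() {
    let model = Sequential(vec![Box::new(Scale(2.0)), Box::new(Relu)]);
    assert_eq!(model.forward(&[-1.0, 3.0]), vec![0.0, 6.0]);
    println!("{:?}", model.forward(&[-1.0, 3.0]));
}
```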
* Add some options to make layer-norm more configurable.
* Add the rms-norm variant.
* Replace the RmsNorm with the shared bits.
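RmsNorm is layer-norm without the mean-centering and without a bias: it normalizes by the root-mean-square alone and applies a learned per-element weight. That is why the two can share most of their code behind a couple of configuration flags. A std-only sketch of the RmsNorm computation:

```rust
// Sketch of RmsNorm: y = x / sqrt(mean(x^2) + eps) * weight.
// Unlike LayerNorm there is no mean subtraction and no bias term.
fn rms_norm(xs: &[f32], weight: &[f32], eps: f32) -> Vec<f32> {
    let mean_sq = xs.iter().map(|x| x * x).sum::<f32>() / xs.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    xs.iter().zip(weight).map(|(x, w)| x * inv_rms * w).collect()
}

fn main() {
    let xs = [3.0, -4.0]; // rms = sqrt((9 + 16) / 2) = sqrt(12.5)
    let ys = rms_norm(&xs, &[1.0, 1.0], 1e-5);
    // With unit weights the output has (approximately) unit RMS.
    let rms_out = (ys.iter().map(|y| y * y).sum::<f32>() / 2.0).sqrt();
    assert!((rms_out - 1.0).abs() < 1e-3);
    println!("{ys:?}");
}
```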
* Add the accelerate feature.
* Ffi tweaks.
* Move the vision datasets to a separate crate.
* Move the batcher bits.
* Update the readme.
* Move the tiny-stories bits.
---------
Co-authored-by: Jane Doe <jane.doe@example.org>
* Rework the var-builder to handle initializations.
* Add some helper functions for layer creation.
* Improve the layer initializations.
* Get initialized variables.
* Precompute the rotary embeddings when training llamas.
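The idea behind a var-builder that handles initialization is that the same `get` interface serves both inference (return the stored weight) and training (create the weight from an init spec when it is missing). A toy sketch of that idea; every name here is illustrative, not the candle API:

```rust
use std::collections::HashMap;

// Sketch of a var-builder with initialization support: `get_or_init`
// returns the stored variable if present, otherwise creates it from an
// `Init` spec. A real builder would hold tensors on a device and support
// random inits (e.g. a normal distribution for Kaiming-style schemes).
enum Init {
    Const(f32),
}

struct VarBuilder {
    vars: HashMap<String, Vec<f32>>,
}

impl VarBuilder {
    fn new() -> Self {
        Self { vars: HashMap::new() }
    }

    /// Get a variable of `len` elements, creating it from `init` if missing.
    fn get_or_init(&mut self, name: &str, len: usize, init: Init) -> &Vec<f32> {
        self.vars.entry(name.to_string()).or_insert_with(|| match init {
            Init::Const(c) => vec![c; len],
        })
    }
}

fn main() {
    let mut vb = VarBuilder::new();
    // First access initializes; later accesses return the stored values.
    let w = vb.get_or_init("layer1.weight", 4, Init::Const(0.5)).clone();
    assert_eq!(w, vec![0.5; 4]);
    let again = vb.get_or_init("layer1.weight", 4, Init::Const(9.0)).clone();
    assert_eq!(again, vec![0.5; 4]); // not re-initialized
    println!("{again:?}");
}
```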
* Rework the commands and run inference by default.
* Add the training module and load the training dataset.
* Random dataset iterator.
* Proper valid-loss computation.
* Compute the evaluation loss.
* Add more substance to the training loop.
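The valid-loss computation above is an average cross-entropy over held-out batches. A minimal sketch of that computation with a numerically stable log-softmax (a generic formulation, not the candle training-loop code):

```rust
// Sketch of an evaluation loss: mean cross-entropy over positions, using
// log-sum-exp with max subtraction so large logits do not overflow.
// `logits` is one row of scores per position, `targets` the next-token ids.
fn cross_entropy(logits: &[Vec<f32>], targets: &[usize]) -> f32 {
    let mut total = 0.0;
    for (row, &t) in logits.iter().zip(targets) {
        let max = row.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let log_sum_exp = max + row.iter().map(|x| (x - max).exp()).sum::<f32>().ln();
        total += log_sum_exp - row[t]; // -log p(target)
    }
    total / targets.len() as f32
}

fn main() {
    // Uniform logits over 4 classes -> loss = ln(4) regardless of target.
    let logits = vec![vec![0.0; 4], vec![1.0; 4]];
    let loss = cross_entropy(&logits, &[2, 0]);
    assert!((loss - 4f32.ln()).abs() < 1e-5);
    println!("valid loss: {loss}");
}
```

Running this over batches drawn from the validation split, with gradients disabled, gives the evaluation loss tracked during training.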
* Add an eval mode to llama2-c.
* Encode line by line.
* Get the eval to run.
* Support more models in llama2-c.
* Add a prompt.
* Softmax numerical stability.
* Fix the flash-attn test.
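The standard numerical-stability fix for softmax is to subtract the row maximum before exponentiating: softmax is invariant to that shift, and the largest term becomes exp(0) = 1, so nothing overflows. A sketch:

```rust
// Numerically stable softmax: shift by the max before exponentiating.
// Without the shift, exp(1000.0) overflows f32 to infinity and the
// division then yields NaNs.
fn softmax(xs: &[f32]) -> Vec<f32> {
    let max = xs.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = xs.iter().map(|x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

fn main() {
    // Logits far outside exp's safe range still produce finite outputs.
    let ys = softmax(&[1000.0, 1000.0]);
    assert!((ys[0] - 0.5).abs() < 1e-6 && ys[0].is_finite());
    println!("{ys:?}");
}
```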
* Use the binary decoder for llama2.c.
* Add the temperature.
* Formatting tweak.
* Fix the rotary embeddings.
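The binary decoding here targets the llama2.c checkpoint layout: a header of seven little-endian i32 config fields (dim, hidden_dim, n_layers, n_heads, n_kv_heads, vocab_size, seq_len) followed by raw little-endian f32 weights. A sketch of the decoding against an in-memory buffer; helper names are illustrative:

```rust
// Sketch of decoding a llama2.c-style checkpoint from a byte buffer:
// fixed-width little-endian integers for the config, then raw f32 weights.
fn read_i32(buf: &[u8], pos: &mut usize) -> i32 {
    let v = i32::from_le_bytes(buf[*pos..*pos + 4].try_into().unwrap());
    *pos += 4;
    v
}

fn read_f32s(buf: &[u8], pos: &mut usize, n: usize) -> Vec<f32> {
    (0..n)
        .map(|_| {
            let v = f32::from_le_bytes(buf[*pos..*pos + 4].try_into().unwrap());
            *pos += 4;
            v
        })
        .collect()
}

fn main() {
    // Build a tiny fake checkpoint: a 7-field header, then 3 f32 weights.
    let mut bytes = Vec::new();
    for v in [8i32, 32, 2, 4, 4, 256, 128] {
        bytes.extend_from_slice(&v.to_le_bytes());
    }
    for w in [0.5f32, -1.0, 2.0] {
        bytes.extend_from_slice(&w.to_le_bytes());
    }

    let mut pos = 0;
    let header: Vec<i32> = (0..7).map(|_| read_i32(&bytes, &mut pos)).collect();
    let weights = read_f32s(&bytes, &mut pos, 3);
    assert_eq!(header, vec![8, 32, 2, 4, 4, 256, 128]);
    assert_eq!(weights, vec![0.5, -1.0, 2.0]);
    println!("dim={} vocab={}", header[0], header[5]);
}
```

In practice the weight sections are sized from the header fields (e.g. the token embedding is vocab_size * dim floats) and read in the order the exporter wrote them.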
* Start adding llama2.c.
* Model loading.
* Add the llama-v2 model.
* Start converting the weights.
* Rotary embedding tweaks.
* Get the model to generate some tokens.
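The rotary embedding mentioned in these commits rotates consecutive pairs of the head dimension by a position-dependent angle, theta_i = pos * 10000^(-2i/d), so relative position shows up as a phase difference in attention dot products. A self-contained sketch of applying it to one head vector:

```rust
// Sketch of rotary position embeddings (RoPE): each consecutive pair
// (x[i], x[i+1]) is rotated by pos * 10000^(-i/d) radians. Rotation
// preserves each pair's norm; only the phase encodes the position.
fn apply_rope(x: &mut [f32], pos: usize, head_dim: usize) {
    for i in (0..head_dim).step_by(2) {
        let freq = 1.0 / 10000f32.powf(i as f32 / head_dim as f32);
        let theta = pos as f32 * freq;
        let (sin, cos) = theta.sin_cos();
        let (x0, x1) = (x[i], x[i + 1]);
        x[i] = x0 * cos - x1 * sin;
        x[i + 1] = x0 * sin + x1 * cos;
    }
}

fn main() {
    // At position 0 the rotation angle is zero: the input is unchanged.
    let mut x = vec![1.0, 2.0, 3.0, 4.0];
    apply_rope(&mut x, 0, 4);
    assert_eq!(x, vec![1.0, 2.0, 3.0, 4.0]);

    // At position > 0 each pair keeps its norm while its phase changes.
    let mut y = vec![1.0, 0.0, 1.0, 0.0];
    apply_rope(&mut y, 3, 4);
    let norm = (y[0] * y[0] + y[1] * y[1]).sqrt();
    assert!((norm - 1.0).abs() < 1e-5);
    println!("{y:?}");
}
```

Because the angles depend only on position and dimension, they can be precomputed once into cos/sin tables, which is what "precompute the rotary embeddings" refers to in the training commit above.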