Commit log
|
* feat: Add unipc multistep scheduler
* chore: Clippy and formatting
* chore: Update comments
* chore: Avoid unsafety in float ordering
* refactor: Update Scheduler::step mutability requirements
* fix: Corrector img2img
* chore: Update unipc ref link to latest diffusers release
* chore: Deduplicate float ordering
* fix: Panic when running with dev profile
|
* onnx: fix pad, unsqueeze
Both implementations have off-by-one errors:
- The Pad 'reflect' cycle for e.g. `dim==3` is `[0,1,2,1]`, which has
  length 4 (i.e. `dim*2 - 2`), not 5 (the current code uses `dim*2 - 1`).
- Unsqueeze(-1) for a tensor with `dim==3` should insert at axis 3
  (i.e. `dim+index+1`), not 2 (the current `dim+index`).
In addition, Pad miscalculates the starting padding: to pad 2 elements
at the start with an index cycle of length 6, we should skip 4 elements
of the cycle, but the current code skips only 2. A more visual
representation of what's going on is below:
```
pad_start: 2
data: [a,b,c,d]
indices: [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 2, 1, 0, ..] // zigzag between 0..4
actual: skip [ c d| c b a b]
expected: ~ skip ~ [ c b| a b c d]
```
The values between `[` and `|` are padding, and the values between
`|` and `]` should match the original data being padded. The corrected
index computation is sketched after this entry.
* Fix clippy lints.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
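
A minimal sketch of the corrected reflect-pad index computation described above, as standalone Rust rather than the actual candle-onnx code (`reflect_pad_indices` is a hypothetical helper):
```rust
/// Indices into `data` (of length `dim`) that produce reflect padding.
/// Assumes dim >= 2.
fn reflect_pad_indices(dim: usize, pad_start: usize, pad_end: usize) -> Vec<usize> {
    // Zigzag cycle, e.g. dim == 4 -> [0, 1, 2, 3, 2, 1];
    // its period is dim * 2 - 2 (the buggy code used dim * 2 - 1).
    let cycle: Vec<usize> = (0..dim).chain((1..dim - 1).rev()).collect();
    let period = cycle.len();
    // Starting `pad_start` positions early in the cycle means skipping
    // `period - pad_start` elements (the buggy code skipped `pad_start`).
    let skip = (period - pad_start % period) % period;
    cycle
        .into_iter()
        .cycle()
        .skip(skip)
        .take(pad_start + dim + pad_end)
        .collect()
}

fn main() {
    // data [a, b, c, d], pad_start = 2: indices [2, 1, 0, 1, 2, 3],
    // i.e. c b | a b c d, matching the `expected` row above.
    assert_eq!(reflect_pad_indices(4, 2, 0), vec![2, 1, 0, 1, 2, 3]);
}
```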
|
* Use the batches in Stable Diffusion that were already there but went unused.
Also factor out the `save_image` function.
* Clippy + cosmetic fixes.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
|
* Add a --seed argument to the stable-diffusion example.
* When no seed is specified, leave it unset and use the engine's default. This makes the CPU engine work again when no --seed is given, and bails out when a seed is passed, since the engine does not currently support it. A sketch of this handling follows this entry.
---------
Co-authored-by: niklas <niklas@appli.se>
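
A minimal sketch of the optional-seed handling with clap; `set_seed` is a stand-in for whatever seeding hook the selected engine actually exposes:
```rust
use clap::Parser;

#[derive(Parser)]
struct Args {
    /// RNG seed; when omitted, the engine's default behaviour is kept.
    #[arg(long)]
    seed: Option<u64>,
}

// Stand-in for an engine hook: an engine without seeding support would
// return an error here, producing the bailout described above.
fn set_seed(_seed: u64) -> Result<(), String> {
    Err("this engine does not support seeding yet".to_string())
}

fn main() -> Result<(), String> {
    let args = Args::parse();
    if let Some(seed) = args.seed {
        set_seed(seed)?; // only touch the RNG when --seed was given
    }
    Ok(())
}
```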
|
Move the --sd-version flag out of the prompt.
|
* Add support for SD Turbo
* Set Leading as the default timestep spacing in euler_ancestral discrete
* Use the appropriate default values for n_steps and guidance_scale.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
|
- clippy::needless-borrows-for-generic-args
- clippy::reserve-after-initialization
|
* Override the repo for SDXL f16 vae weights.
* Slightly simpler change.
|
* Stable Diffusion readme.
* Fix the image path.
* Move the assets.
* Resize the sample image.
* Lower resolution.
|
* img2img pipeline for stable diffusion.
* Rename the arguments + fix.
* Fix for zero strength.
* Another fix.
* Another fix.
* Revert.
* Include the backtrace.
* Noise scaling (see the sketch after this entry).
* Fix the height/width.
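
A scalar sketch of the strength logic behind the pipeline above: how the starting step is chosen and how the encoded init image is noised. The DDPM-style schedule here is an assumption, not candle's actual scheduler API:
```rust
// strength == 1.0 -> start from pure noise (step 0);
// strength == 0.0 -> start at step n_steps, i.e. skip denoising
// entirely, which is the zero-strength case fixed above.
fn img2img_start_step(n_steps: usize, strength: f64) -> usize {
    assert!((0.0..=1.0).contains(&strength));
    n_steps - (n_steps as f64 * strength) as usize
}

// Noise the encoded init image for the chosen timestep, where
// `alpha_cumprod` is the cumulative alpha at that timestep.
fn add_noise(latent: f64, noise: f64, alpha_cumprod: f64) -> f64 {
    alpha_cumprod.sqrt() * latent + (1.0 - alpha_cumprod).sqrt() * noise
}
```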
|
* Add a custom softmax implementation (a scalar sketch follows this entry).
* Add softmaxlastdim to the benchmarks.
* And add a test.
* Support more dtypes.
* Polish the code.
* Use the slow implementation on cuda.
* Add a todo for the cuda kernel.
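
A scalar sketch of the numerically stable softmax-over-last-dim such an op implements; candle's actual kernels are vectorized and dtype-generic, this only shows the max-subtraction trick:
```rust
/// Softmax over the last dimension of a flattened tensor whose last
/// dimension has size `dim`. Subtracting the row max keeps exp() from
/// overflowing without changing the result.
fn softmax_last_dim(data: &mut [f32], dim: usize) {
    for row in data.chunks_mut(dim) {
        let max = row.iter().copied().fold(f32::NEG_INFINITY, f32::max);
        let mut sum = 0f32;
        for x in row.iter_mut() {
            *x = (*x - max).exp();
            sum += *x;
        }
        for x in row.iter_mut() {
            *x /= sum;
        }
    }
}
```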
|
* Simplify usage of the pool functions.
* Small tweak.
* Attempt at using apply to simplify the convnet definition.
|
* Add the dilation parameter (the output-size formula is sketched after this entry).
* Restore the basic optimizer example.
* Dilation support in cudnn.
* Use the dilation parameter in the cpu backend.
* More dilation support.
* No support for dilation in transposed convolutions.
* Add dilation to a test.
* Remove a print.
* Helper function.
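
How dilation enters the output-size computation: a dilated kernel spans `dilation * (k - 1) + 1` input positions. A small sketch with a hypothetical helper, not the exact candle function:
```rust
fn conv2d_out_dim(in_dim: usize, k: usize, padding: usize, stride: usize, dilation: usize) -> usize {
    // A dilated kernel behaves like a larger kernel with holes.
    let effective_k = dilation * (k - 1) + 1;
    (in_dim + 2 * padding - effective_k) / stride + 1
}

fn main() {
    // 32-wide input, 3x3 kernel, dilation 2 acts like a 5x5 kernel.
    assert_eq!(conv2d_out_dim(32, 3, 0, 1, 2), 28);
}
```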
|
* Use multiple transformer layers in the same cross-attn blocks.
* Make the context contiguous if required.
|
* Preliminary support for SDXL.
* More SDXL support.
* More SDXL.
* Use the proper clip config.
* Querying for existing tensors.
* More robust test.
|
* Remove some dead-code annotations.
* More dead code removal.
* One more.
* CI fix.
|
* Trace the softmax op.
* Inline the sum.
* Add min/max vec operations.
|
* Add some group parameter to convolutions.
* Avoid some unnecessary groups checks.
* Move the tensor convolution bits.
* Proper handling of groups (shape constraints sketched after this entry).
* Bump the crate version.
* And add a changelog.
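
A sketch of the shape constraints that grouped convolutions introduce; `check_conv_groups` is a hypothetical helper, not candle's validation code:
```rust
/// Grouped convolution splits the channels into `groups` independent
/// convolutions, so both channel counts must divide evenly.
fn check_conv_groups(c_in: usize, c_out: usize, groups: usize) -> Result<(), String> {
    if c_in % groups != 0 || c_out % groups != 0 {
        return Err(format!(
            "c_in ({c_in}) and c_out ({c_out}) must both be divisible by groups ({groups})"
        ));
    }
    // With groups, the weight tensor has shape (c_out, c_in / groups, k_h, k_w):
    // each output channel only sees c_in / groups input channels.
    Ok(())
}
```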
|
* Skeleton files for neon support of quantization.
* SIMD version for q4 vecdot (a scalar reference of the block dot product follows this entry).
* Also simdify the q6k multiplication.
* Add some timings to stable-diffusion.
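
A scalar reference of the q4 dot product that the NEON code vectorizes, assuming a ggml-style Q4_0 block of 32 weights stored as 16 nibble pairs plus one scale; the exact block layout is an assumption here:
```rust
struct BlockQ4 {
    scale: f32,        // per-block scale (f16 in the real format)
    nibbles: [u8; 16], // 32 quantized weights, two per byte
}

/// Dot product of quantized blocks against f32 activations.
/// Requires ys.len() >= blocks.len() * 32.
fn vec_dot_q4_f32(blocks: &[BlockQ4], ys: &[f32]) -> f32 {
    let mut acc = 0f32;
    for (ib, block) in blocks.iter().enumerate() {
        let base = ib * 32;
        for (i, byte) in block.nibbles.iter().enumerate() {
            // Low nibble holds weight i, high nibble weight i + 16,
            // both stored with a bias of 8.
            let lo = (byte & 0x0f) as i32 - 8;
            let hi = (byte >> 4) as i32 - 8;
            acc += block.scale * (lo as f32 * ys[base + i] + hi as f32 * ys[base + i + 16]);
        }
    }
    acc
}
```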
|
* Load the image from disk and convert it to a tensor (sketched after this entry).
* Tweak the function name.
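
A sketch of the loading step with the `image` crate; the result is left as a plain `Vec<f32>` in CHW order rather than constructing a candle tensor:
```rust
fn load_image_chw(path: &str) -> Result<(Vec<f32>, usize, usize), Box<dyn std::error::Error>> {
    let img = image::open(path)?.to_rgb8();
    let (width, height) = img.dimensions();
    let raw = img.into_raw(); // HWC u8 layout
    let (w, h) = (width as usize, height as usize);
    // Rearrange HWC -> CHW and scale to [0, 1].
    let mut chw = vec![0f32; 3 * w * h];
    for y in 0..h {
        for x in 0..w {
            for c in 0..3 {
                chw[c * h * w + y * w + x] = raw[(y * w + x) * 3 + c] as f32 / 255.;
            }
        }
    }
    Ok((chw, h, w))
}
```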
|
* Start adding the module trait (its shape is sketched after this entry).
* Use the module trait.
* Implement module for qmatmul.
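
The rough shape of such a Module trait, with placeholder types standing in for candle's `Tensor` and error type:
```rust
// Placeholders for the real candle types.
struct Tensor;
type Result<T> = std::result::Result<T, Box<dyn std::error::Error>>;

trait Module {
    fn forward(&self, xs: &Tensor) -> Result<Tensor>;
}

// Any layer implementing the trait (linear, qmatmul, ...) can then be
// called uniformly, which is what enables apply-style chaining.
struct Identity;
impl Module for Identity {
    fn forward(&self, _xs: &Tensor) -> Result<Tensor> {
        Ok(Tensor)
    }
}
```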
|
* F16 support for stable diffusion.
* Keep the attention bits in F32 (see the precision note after this entry).
* Keep more of the attention bits in F32.
* More mixed precision support.
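
Why the attention softmax stays in F32: `exp` overflows F16's maximum of roughly 65504 for quite small logits, so the pattern is upcast, softmax, downcast. A small demonstration using the `half` crate (assumed here; candle's F16 representation may differ):
```rust
use half::f16;

fn main() {
    let logit = 12.0f32;
    let in_f32 = logit.exp();                // ~162754.8, fine in f32
    let in_f16 = f16::from_f32(logit.exp()); // overflows to +inf in f16
    println!("f32: {in_f32}, f16: {in_f16}");
    assert!(in_f16.is_infinite());
}
```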
|
* Add flash-attention for the stable-diffusion example.
* Change the dtype.
* Silly fix.
* Another fix.
* Revert the dtype back to the query dtype after applying flash-attn.
|
* Track the conv2d operations in stable-diffusion.
* Add more tracing to stable-diffusion (the span pattern is sketched after this entry).
* Also trace the resnet bits.
* Trace the attention blocks.
* Also trace the attention inner part.
* Small tweak.
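
The instrumentation pattern these commits apply, sketched with the `tracing` crate; the struct and span names are illustrative:
```rust
use tracing::{span, Level};

struct Conv2dTraced {
    span: tracing::Span,
}

impl Conv2dTraced {
    fn new() -> Self {
        // Create the span once at construction time.
        Self { span: span!(Level::TRACE, "conv2d") }
    }

    fn forward(&self) {
        // The span is recorded from here until the guard is dropped.
        let _enter = self.span.enter();
        // ... the actual conv2d work would go here ...
    }
}
```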
|
* Retrieve the model files from the HF hub in the stable diffusion example (sketched after this entry).
* Add to the readme.
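
A minimal sketch of fetching weights with the `hf-hub` crate; the repo id and filename here are illustrative, not necessarily what the example uses:
```rust
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api = hf_hub::api::sync::Api::new()?;
    let repo = api.model("runwayml/stable-diffusion-v1-5".to_string());
    // Downloads (or reuses the local cache) and returns the file path.
    let weights = repo.get("unet/diffusion_pytorch_model.safetensors")?;
    println!("{}", weights.display());
    Ok(())
}
```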
|
* Fix the stable-diffusion vae.
* Fix for saving images.
|
* Use the image crate to write the generated images (sketched after this entry).
* Make the dependency optional.
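
A sketch of the write path with the `image` crate, assuming the pixels are already `u8` RGB in HWC order:
```rust
fn save_image(path: &str, pixels: &[u8], width: u32, height: u32) -> Result<(), image::ImageError> {
    // Encodes based on the file extension (png, jpg, ...).
    image::save_buffer(path, pixels, width, height, image::ColorType::Rgb8)
}
```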
|
* Fixes for the stable diffusion example.
* Bugfix.
* Another fix.
* Fix for group-norm.
* More fixes to get SD to work.
|
* Some CLIP fixes for stable diffusion.
* Add the avg-pool2d operation on cpu.
|
* Skeleton for the avg-pool2d and upsample-nearest2d ops (the index mapping is sketched after this entry).
* Preliminary conv2d support.
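
The index mapping behind upsample-nearest2d: each output pixel reads the nearest input pixel by integer scaling. A scalar sketch of the core loop, not the actual kernel:
```rust
fn upsample_nearest2d(src: &[f32], h: usize, w: usize, oh: usize, ow: usize) -> Vec<f32> {
    let mut dst = vec![0f32; oh * ow];
    for oy in 0..oh {
        for ox in 0..ow {
            let iy = (oy * h) / oh; // nearest source row
            let ix = (ox * w) / ow; // nearest source column
            dst[oy * ow + ox] = src[iy * w + ix];
        }
    }
    dst
}
```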
|
* Simple pad support.
* Fix the tensor indexing when padding.
|
* Implement group-norm (the normalization is sketched after this entry).
* Add some testing for group-norm.
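
A scalar reference of group-norm for one batch element in CHW layout, without the learned scale and shift; it assumes `channels` is divisible by `groups`:
```rust
fn group_norm(x: &mut [f32], channels: usize, groups: usize, hw: usize, eps: f32) {
    let group_c = channels / groups;
    // Each chunk covers one group's channels over all spatial positions.
    for group in x.chunks_mut(group_c * hw) {
        let n = group.len() as f32;
        let mean = group.iter().sum::<f32>() / n;
        let var = group.iter().map(|v| (v - mean) * (v - mean)).sum::<f32>() / n;
        let inv_std = 1.0 / (var + eps).sqrt();
        for v in group.iter_mut() {
            *v = (*v - mean) * inv_std;
        }
    }
}
```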
|
* Add the recip unary op.
* Fix the cuda kernel.
* Use the recip op in sigmoid (see the sketch after this entry).
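
The composition this enables, in scalar form: `sigmoid(x) = recip(1 + exp(-x))`. A sketch mirroring the tensor version:
```rust
fn recip(x: f32) -> f32 {
    1.0 / x
}

fn sigmoid(x: f32) -> f32 {
    recip(1.0 + (-x).exp())
}

fn main() {
    assert!((sigmoid(0.0) - 0.5).abs() < 1e-6);
}
```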
|
* Start adding a stable-diffusion example.
* Proper computation of the causal mask (sketched after this entry).
* Add the chunk operation.
* Work in progress: port the attention module.
* Add some dummy modules for conv2d and group-norm, get the attention module to compile.
* Re-enable the 2d convolution.
* Add the embeddings module.
* Add the resnet module.
* Add the unet blocks.
* Add the unet.
* And add the variational auto-encoder.
* Use the pad function from utils.
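
A scalar sketch of the causal-mask computation mentioned above: position i may attend to positions j <= i, and later positions get negative infinity before the softmax. Not the candle tensor version:
```rust
fn causal_mask(seq_len: usize) -> Vec<f32> {
    let mut mask = vec![0f32; seq_len * seq_len];
    for i in 0..seq_len {
        for j in (i + 1)..seq_len {
            // Future positions are masked out of the attention.
            mask[i * seq_len + j] = f32::NEG_INFINITY;
        }
    }
    mask
}
```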