Quantized version of flux. (#2500)

author	Laurent Mazare <laurent.mazare@gmail.com>	2024-09-26 10:23:43 +0200
committer	GitHub <noreply@github.com>	2024-09-26 10:23:43 +0200
commit	10d47183c088ce449da13d74f07171c8106cd6dd (patch)
tree	b91b0398fcb314e998b9f7f3b23877f63462b232 /.github
parent	d01207dbf3fb0ad614e7915c8f5706fbc09902fb (diff)
download	candle-10d47183c088ce449da13d74f07171c8106cd6dd.tar.gz candle-10d47183c088ce449da13d74f07171c8106cd6dd.tar.bz2 candle-10d47183c088ce449da13d74f07171c8106cd6dd.zip

* Quantized version of flux.
* More generic sampling.
* Hook the quantized model.
* Use the newly minted gguf file.
* Fix for the quantized model.
* Default to avoid the faster cuda kernels.

Diffstat (limited to '.github')

0 files changed, 0 insertions, 0 deletions

