author | Laurent Mazare <laurent.mazare@gmail.com> | 2023-12-15 19:16:06 -0600
---|---|---
committer | GitHub <noreply@github.com> | 2023-12-15 19:16:06 -0600
commit | 30a958e5dd6152da0d9e4cf5ce338bd2dd6a0ec4 (patch) |
tree | aa505d27e9f68f0e37042ff0eca02ea0486ec537 /README.md |
parent | 614842b311a12ac5aba130e165763f997d8ff324 (diff) |
Quantized mixtral model (#1442)
* Add the Mixtral model.
* Add more of the Mixtral layers.
* Add the final layers for Mixtral.
* Sketch the expert selection.
* Add some expert routing logic.
* Hopefully finish the routing logic for Mixtral.
* Add the Mixtral example.
* Fix the weight filenames.
* Bugfix.
* Another fix.
* Yet another fix + remove the unused pragma.
* Shape fix.
* Support for quantized Mixtral.
* Support Mixtral in the quantized example.
* MLP or MoE type.
* Fix the expert field names.
* Refactor the mlp bit.
* More MoE logic.
* Add the MoE quantized logic.
* Fix the experts length.
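The expert-selection commits above refer to Mixtral-style top-2 routing: for each token, the router scores every expert, keeps the two highest-scoring ones, and normalizes just those two scores with a softmax to get the mixing weights. The commit itself implements this on candle tensors; the following is only a minimal plain-Rust sketch of the idea, with `top2_routing` being a hypothetical helper name, not a function from the candle codebase.

```rust
// Sketch of Mixtral-style top-2 expert routing for a single token.
// `router_logits` holds one score per expert; the result pairs each of the
// two selected expert indices with its gate weight (the weights sum to 1).
fn top2_routing(router_logits: &[f32]) -> Vec<(usize, f32)> {
    // Rank expert indices by descending logit.
    let mut idx: Vec<usize> = (0..router_logits.len()).collect();
    idx.sort_by(|&a, &b| router_logits[b].partial_cmp(&router_logits[a]).unwrap());
    let (i0, i1) = (idx[0], idx[1]);

    // Softmax over only the two selected logits (max-subtracted for stability).
    let (l0, l1) = (router_logits[i0], router_logits[i1]);
    let m = l0.max(l1);
    let (e0, e1) = ((l0 - m).exp(), (l1 - m).exp());
    let z = e0 + e1;
    vec![(i0, e0 / z), (i1, e1 / z)]
}

fn main() {
    // Four experts; expert 1 scores highest, expert 3 second highest.
    let logits = [0.1, 2.0, -1.0, 1.0];
    let routed = top2_routing(&logits);
    assert_eq!(routed[0].0, 1);
    assert_eq!(routed[1].0, 3);
    let weight_sum: f32 = routed.iter().map(|&(_, w)| w).sum();
    assert!((weight_sum - 1.0).abs() < 1e-6);
    println!("{:?}", routed);
}
```

In the full model, each selected expert's MLP output is multiplied by its gate weight and the two results are summed; the quantized variant from this commit applies the same routing over quantized expert weights.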
Diffstat (limited to 'README.md')
0 files changed, 0 insertions, 0 deletions