Add the cuda dequantize f16 kernels. (#2137)

author	Laurent Mazare <laurent.mazare@gmail.com>	2024-04-28 20:05:05 +0200
committer	GitHub <noreply@github.com>	2024-04-28 20:05:05 +0200
commit	eb26e2467eb4cb5ca507324cc3245600c104f219 (patch)
tree	7aa8fead605a786c38d0b6d2835342240e80c9a2 /candle-nn
parent	c68ed8963fb6fc842f20d84baa07ff97b56aedb4 (diff)

* Add the cuda dequantize f16 kernels.
* Expose the cuda kernels.
* Add some testing + fix.
* Test the other cases too.
* A few more tests.
* Add an environment variable to enable the dequantize f16 + matmul behavior.

Diffstat (limited to 'candle-nn')

0 files changed, 0 insertions, 0 deletions

