forks/candle.git (branch: main)
path: /candle-examples/examples/quantized
| Commit message | Author | Date | Files | Lines |
|---|---|---|---|---|
| Add the SmolLM2 models. (#2595) | Laurent Mazare | 2024-11-03 | 1 | -1/+24 |
| Force the revision for the phi3-llama quantized models. (#2159) | Laurent Mazare | 2024-05-04 | 1 | -2/+11 |
| Add a toggle for F16/BF16 accumulation in gemm. (#2141) | Laurent Mazare | 2024-04-29 | 1 | -0/+3 |
| Add the phi-v3 quantized model. (#2118) | Laurent Mazare | 2024-04-24 | 1 | -24/+35 |
| Add support for llama3 on the quantized example (#2086) | Thomas Santerre | 2024-04-18 | 1 | -8/+23 |
| Include topk sampling in the quantized example. (#2005) | Laurent Mazare | 2024-04-04 | 1 | -7/+19 |
| Switch the default to using the faster kernels. (#1978) | Laurent Mazare | 2024-04-01 | 1 | -3/+3 |
| More ggml cuda kernels (#1977) | Laurent Mazare | 2024-04-01 | 1 | -0/+8 |
| Add a flag to force running the quantized model on CPUs. (#1778) | Laurent Mazare | 2024-02-28 | 1 | -1/+5 |
| Add an option to split the prompt. (#1766) | Laurent Mazare | 2024-02-27 | 1 | -1/+14 |
| Quantized GGUF style (#1523) | Nicolas Patry | 2024-01-17 | 1 | -7/+9 |
| Support mistral instruct v0.2. (#1475) | Laurent Mazare | 2023-12-23 | 1 | -4/+15 |
| Mixtral quantized instruct. (#1447) | Laurent Mazare | 2023-12-16 | 1 | -0/+11 |
| Update the readme to mention mixtral. (#1443) | Laurent Mazare | 2023-12-15 | 1 | -0/+13 |
| Quantized mixtral model (#1442) | Laurent Mazare | 2023-12-15 | 1 | -1/+12 |
| Add the leo models to the quantized examples. (#1398) | Laurent Mazare | 2023-12-03 | 1 | -31/+46 |
| Add quantized Starling, fix open-chat prompt (#1393) | Lucas de Ávila Martins | 2023-12-02 | 1 | -6/+36 |
| Fix OpenChat 3.5 tokenizer (#1347) | Lucas de Ávila Martins | 2023-11-19 | 1 | -1/+3 |
| Add OpenChat 3.5 to quantized examples (#1346) | Lucas de Ávila Martins | 2023-11-19 | 1 | -7/+39 |
| Fix quantized zephyr chat prompt (#1314) (#1317) | Michael Leandersson | 2023-11-11 | 1 | -2/+7 |
| Quantized model small tweaks (#1290) | Laurent Mazare | 2023-11-07 | 1 | -39/+54 |
| Adds check for 7b-zephyr and uses correct template (#1283) | DTJ11235 | 2023-11-06 | 1 | -3/+6 |
| Add support for Zephyr-7b in the quantized model. (#1124) | Laurent Mazare | 2023-10-18 | 1 | -2/+12 |
| Fix the prompt for mistral when using instruct/interactive mode. (#1013) | Laurent Mazare | 2023-10-01 | 1 | -12/+31 |
| Integrate TheBloke quantized mistral weights. (#1012) | Laurent Mazare | 2023-09-30 | 1 | -2/+26 |
| Add a gif to the quantized readme. (#833) | Laurent Mazare | 2023-09-13 | 2 | -0/+2 |
| Add more example readmes. (#828) | Laurent Mazare | 2023-09-12 | 1 | -1/+1 |
| Implement top_p / nucleus sampling (#819) | Juarez Bochi | 2023-09-12 | 1 | -1/+5 |
| Add a small readme for the quantized example. (#823) | Laurent Mazare | 2023-09-12 | 1 | -0/+35 |
| Move more models to candle-transformers (#796) | Laurent Mazare | 2023-09-10 | 2 | -372/+1 |
| Tweak some quantized args (#692) | Laurent Mazare | 2023-08-31 | 1 | -5/+14 |
| Interactive mode for the quantized model. (#690) | Laurent Mazare | 2023-08-31 | 2 | -55/+109 |
| Neon optimized vecdot (#666) | Laurent Mazare | 2023-08-29 | 2 | -364/+371 |
| Remove some dead-code annotations. (#629) | Laurent Mazare | 2023-08-27 | 1 | -11/+0 |
| Add some optional repeat penalty. (#623) | Laurent Mazare | 2023-08-27 | 1 | -17/+5 |
| Generic implementation of vecdot for q80. (#596) | Laurent Mazare | 2023-08-25 | 1 | -5/+23 |
| Get the rms epsilon from GGUF. (#565) | Laurent Mazare | 2023-08-23 | 1 | -8/+10 |
| Fix the quantized example. (#564) | Laurent Mazare | 2023-08-23 | 1 | -2/+2 |
| add chat models in quantized example (#551) | cksac | 2023-08-23 | 1 | -0/+18 |
| GGUF support in the quantized model. (#559) | Laurent Mazare | 2023-08-23 | 1 | -45/+143 |
| GQA support in the quantized model. (#555) | Laurent Mazare | 2023-08-22 | 1 | -5/+31 |
| Add some llama-v2 variants. (#545) | Laurent Mazare | 2023-08-22 | 1 | -3/+22 |
| Add some optional repeat penalty. (#535) | Laurent Mazare | 2023-08-21 | 1 | -0/+33 |
| Add a yolo-v3 example. (#528) | Laurent Mazare | 2023-08-20 | 1 | -0/+6 |
| Line up the llama.cpp implementation with the candle one. (#518) | Laurent Mazare | 2023-08-19 | 1 | -40/+78 |
| Add a simple Module trait and implement it for the various nn layers (#500) | Laurent Mazare | 2023-08-18 | 1 | -1/+1 |
| Q6K quantization (#495) | Laurent Mazare | 2023-08-17 | 1 | -0/+8 |
| Add the whisper small model. (#490) | Laurent Mazare | 2023-08-17 | 1 | -1/+1 |
| Add a verbose-prompt mode, similar to llama.cpp. (#489) | Laurent Mazare | 2023-08-17 | 1 | -5/+13 |
| Layer norm tweaks (#482) | Laurent Mazare | 2023-08-17 | 1 | -18/+4 |