path: root/candle-examples/examples/moondream/main.rs
Commit message | Author | Age | Files | Lines
* onnx: fix pad, unsqueeze (#2317) | shua | 2024-07-23 | 1 | -1/+1
  * onnx: fix pad, unsqueeze

    both implementations have off-by-one errors:
    - Pad 'reflect' cycle for eg `dim==3` is `[0,1,2,1]` which has length of 4 (or `dim*2 - 2`) not 5 (current code `dim*2 - 1`)
    - Unsqueeze(-1) for tensor with `dim==3` should be 3 (ie `dim+index+1`) not 2 (ie currently `dim+index`)

    in addition, Pad is incorrectly calculating the starting padding. If we want to pad out 2 elements to the start, and we have this cycle of indices of length 6, then we should skip 4 elements, but currently we skip 2. A more visual representation of what's going on is below:

    ```
    pad_start: 2
    data:      [a,b,c,d]
    indices:   [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 2, 1, 0, ..] // zigzag between 0..4
    actual:    skip [ c d| c b a b]
    expected:  ~ skip ~ [ c b| a b c d]
    ```

    The values between `[` and `|` are padding and the values between `|` and `]` in the example should match the original data being padded.

  * Fix clippy lints.

  ---------

  Co-authored-by: Laurent <laurent.mazare@gmail.com>
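The two off-by-one fixes described in the commit message above can be illustrated with a small standalone sketch. This is not the code from the candle ONNX evaluator; the function names `reflect_cycle`, `reflect_pad_indices`, and `normalize_unsqueeze_axis` are hypothetical and only restate the arithmetic in the message: a reflect cycle of length `dim*2 - 2`, a start offset of `cycle_len - pad_start`, and a negative Unsqueeze axis resolved against the output rank `dim + 1`.

```rust
/// One period of the "reflect" index cycle for an axis of length `dim`.
/// For `dim == 3` this is `[0, 1, 2, 1]`, i.e. `dim * 2 - 2` entries.
/// (Hypothetical helper, not part of candle.)
fn reflect_cycle(dim: usize) -> Vec<usize> {
    if dim <= 1 {
        // Degenerate axes: nothing to reflect across.
        return vec![0; dim];
    }
    // Walk up 0..dim, then back down without repeating either endpoint.
    (0..dim).chain((1..dim - 1).rev()).collect()
}

/// Source indices for reflect-padding an axis of length `dim` with
/// `pad_start` leading and `pad_end` trailing elements.
fn reflect_pad_indices(dim: usize, pad_start: usize, pad_end: usize) -> Vec<usize> {
    let cycle = reflect_cycle(dim);
    let period = cycle.len();
    if period == 0 {
        return Vec::new();
    }
    // Skip far enough into the cycle that, after emitting `pad_start`
    // padding elements, the next index is 0 (the start of the data).
    // For period 6 and pad_start 2 this skips 4, as the commit message says.
    let skip = (period - pad_start % period) % period;
    cycle
        .into_iter()
        .cycle()
        .skip(skip)
        .take(pad_start + dim + pad_end)
        .collect()
}

/// Resolve a possibly negative Unsqueeze axis for an input of rank `rank`.
/// Negative axes count from the end of the *output* shape, which has
/// rank `rank + 1`, so -1 on a rank-3 input resolves to 3.
fn normalize_unsqueeze_axis(rank: usize, axis: i64) -> Option<usize> {
    let out_rank = rank as i64 + 1;
    let idx = if axis < 0 { axis + out_rank } else { axis };
    if (0..out_rank).contains(&idx) {
        Some(idx as usize)
    } else {
        None
    }
}
```

With the example from the message, `reflect_pad_indices(4, 2, 0)` yields `[2, 1, 0, 1, 2, 3]`, which maps `[a,b,c,d]` to `[c b| a b c d]`, and `normalize_unsqueeze_axis(3, -1)` yields `Some(3)`.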
* Pin the revision used by moondream. (#2340) | Laurent Mazare | 2024-07-18 | 1 | -7/+15
* Optimize copy-2d for metal. (#2024) | Laurent Mazare | 2024-04-07 | 1 | -1/+1
  * Optimize copy-2d for metal.
  * Add a hacky stopping rule for moondream.
* Add flag to run Moondream in f16 precision (#2015) | Santiago Medina | 2024-04-05 | 1 | -1/+10
  * moondream implementation
  * add moondream example
  * change config default activation
  * Add assets and integrate phi mixformer with example
  * Make use of kv cache and fix seq_len bug; Clean up example code
  * Add README link to example
  * Remove pos_embed scaling; Remove assets; Add to README; Expand VisionConfig
  * Delete image
  * Use apply instead of forward
  * Use latest release special token; Fix token/s accuracy; Use GeluPytorchTanh in VisionConfig v2
  * Add flag to use f16
  * Avoid breaking the quantized version on cuda.

  ---------

  Co-authored-by: laurent <laurent.mazare@gmail.com>
* Use F16 for moondream on cuda. (#2013) | Laurent Mazare | 2024-04-04 | 1 | -3/+9
* Match Moondream's latest release (#1997) | Santiago Medina | 2024-04-02 | 1 | -14/+14
  * moondream implementation
  * add moondream example
  * change config default activation
  * Add assets and integrate phi mixformer with example
  * Make use of kv cache and fix seq_len bug; Clean up example code
  * Add README link to example
  * Remove pos_embed scaling; Remove assets; Add to README; Expand VisionConfig
  * Delete image
  * Use apply instead of forward
  * Use latest release special token; Fix token/s accuracy; Use GeluPytorchTanh in VisionConfig v2
* Quantized moondream implementation and BOS token (#1980) | Santiago Medina | 2024-04-01 | 1 | -16/+77
  * moondream implementation
  * add moondream example
  * change config default activation
  * Add assets and integrate phi mixformer with example
  * Make use of kv cache and fix seq_len bug; Clean up example code
  * Add README link to example
  * Remove pos_embed scaling; Remove assets; Add to README; Expand VisionConfig
  * Delete image
  * Use apply instead of forward
  * Pass bos token at the beginning of tensor.
  * Quantize moondream.
  * Forward with image bos token.
  * Clippy.
  * Use q4_0 quantization.
  * Add pointers for sequence and tokens; Remove seq_len conditional
* Add options to use local files + specify a custom repo or branch. (#1973) | Laurent Mazare | 2024-03-31 | 1 | -3/+25
* Clippy fix. (#1972) | Laurent Mazare | 2024-03-31 | 1 | -1/+1
* Add Moondream transformer implementation and example (#1970) | Santiago Medina | 2024-03-31 | 1 | -0/+245
  * moondream implementation
  * add moondream example
  * change config default activation
  * Add assets and integrate phi mixformer with example
  * Make use of kv cache and fix seq_len bug; Clean up example code
  * Add README link to example
  * Remove pos_embed scaling; Remove assets; Add to README; Expand VisionConfig
  * Delete image
  * Use apply instead of forward