| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | onnx: fix pad, unsqueeze (#2317) | shua | 2024-07-23 | 1 | -1/+1 |
| | * onnx: fix pad, unsqueeze. Both implementations have off-by-one errors: - Pad 'reflect' cycle for eg `dim==3` is `[0,1,2,1]` which has length of 4 (or `dim*2 - 2`) not 5 (current code `dim*2 - 1`) - Unsqueeze(-1) for tensor with `dim==3` should be 3 (ie `dim+index+1`) not 2 (ie currently `dim+index`). In addition, Pad is incorrectly calculating the starting padding. If we want to pad out 2 elements to the start, and we have this cycle of indices of length 6, then we should skip 4 elements, but currently we skip 2. A more visual representation of what's going on: `pad_start: 2`; `data: [a,b,c,d]`; `indices: [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 2, 1, 0, ..]` (zigzag between 0..4); `actual: skip [ c d\| c b a b]`; `expected: ~ skip ~ [ c b\| a b c d]`. The values between `[` and `\|` are padding and the values between `\|` and `]` in the example should match the original data being padded. * Fix clippy lints. Co-authored-by: Laurent <laurent.mazare@gmail.com> | ||||
* | Pin the revision used by moondream. (#2340) | Laurent Mazare | 2024-07-18 | 1 | -7/+15 |
* | Optimize copy-2d for metal. (#2024) | Laurent Mazare | 2024-04-07 | 1 | -1/+1 |
| | * Optimize copy-2d for metal. * Add a hacky stopping rule for moondream. | ||||
* | Add flag to run Moondream in f16 precision (#2015) | Santiago Medina | 2024-04-05 | 1 | -1/+10 |
| | * moondream implementation * add moondream example * change config default activation * Add assets and integrate phi mixformer with example * Make use of kv cache and fix seq_len bug; Clean up example code * Add README link to example * Remove pos_embed scaling; Remove assets; Add to README; Expand VisionConfig * Delete image * Use apply instead of forward * Use latest release special token; Fix token/s accuracy; Use GeluPytorchTanh in VisionConfig v2 * Add flag to use f16 * Avoid breaking the quantized version on cuda. Co-authored-by: laurent <laurent.mazare@gmail.com> | ||||
* | Use F16 for moondream on cuda. (#2013) | Laurent Mazare | 2024-04-04 | 1 | -3/+9 |
* | Match Moondream's latest release (#1997) | Santiago Medina | 2024-04-02 | 1 | -14/+14 |
| | * moondream implementation * add moondream example * change config default activation * Add assets and integrate phi mixformer with example * Make use of kv cache and fix seq_len bug; Clean up example code * Add README link to example * Remove pos_embed scaling; Remove assets; Add to README; Expand VisionConfig * Delete image * Use apply instead of forward * Use latest release special token; Fix token/s accuracy; Use GeluPytorchTanh in VisionConfig v2 | ||||
* | Quantized moondream implementation and BOS token (#1980) | Santiago Medina | 2024-04-01 | 1 | -16/+77 |
| | * moondream implementation * add moondream example * change config default activation * Add assets and integrate phi mixformer with example * Make use of kv cache and fix seq_len bug; Clean up example code * Add README link to example * Remove pos_embed scaling; Remove assets; Add to README; Expand VisionConfig * Delete image * Use apply instead of forward * Pass bos token at the beginning of tensor. * Quantize moondream. * Forward with image bos token. * Clippy. * Use q4_0 quantization. * Add pointers for sequence and tokens; Remove seq_len conditional | ||||
* | Add options to use local files + specify a custom repo or branch. (#1973) | Laurent Mazare | 2024-03-31 | 1 | -3/+25 |
* | Clippy fix. (#1972) | Laurent Mazare | 2024-03-31 | 1 | -1/+1 |
* | Add Moondream transformer implementation and example (#1970) | Santiago Medina | 2024-03-31 | 1 | -0/+245 |
| | * moondream implementation * add moondream example * change config default activation * Add assets and integrate phi mixformer with example * Make use of kv cache and fix seq_len bug; Clean up example code * Add README link to example * Remove pos_embed scaling; Remove assets; Add to README; Expand VisionConfig * Delete image * Use apply instead of forward | ||||
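
The "onnx: fix pad, unsqueeze (#2317)" message describes two index calculations that are easy to get wrong by one. The sketch below works through both in Python as an illustration only; it is not the project's actual code, and the function names `reflect_indices` and `normalize_unsqueeze_axis` are hypothetical. It assumes the reflect cycle repeats with period `dim*2 - 2`, and that a negative Unsqueeze axis is resolved against the output rank (input rank plus one), as the commit message states. Note that the message uses `dim` for the axis length in the Pad case and for the number of dimensions in the Unsqueeze case.

```python
def reflect_indices(dim: int, pad_start: int, pad_end: int) -> list[int]:
    """Source indices along one axis for 'reflect' padding (illustrative sketch)."""
    if dim < 2:
        raise ValueError("reflect padding needs at least 2 elements")
    # One zigzag period: 0, 1, ..., dim-1, dim-2, ..., 1 (length dim*2 - 2).
    cycle = list(range(dim)) + list(range(dim - 2, 0, -1))
    period = len(cycle)
    # Skip far enough into the cycle that output position `pad_start`
    # lands on index 0, i.e. the original data follows the start padding.
    skip = (-pad_start) % period
    total = pad_start + dim + pad_end
    return [cycle[(skip + i) % period] for i in range(total)]


def normalize_unsqueeze_axis(axis: int, rank: int) -> int:
    """Resolve a possibly negative Unsqueeze axis for an input of `rank` dims."""
    out_rank = rank + 1  # the output has one more dimension than the input
    if not -out_rank <= axis < out_rank:
        raise ValueError(f"axis {axis} out of range for output rank {out_rank}")
    # Negative axes count from the end of the OUTPUT shape: rank + axis + 1.
    return axis + out_rank if axis < 0 else axis


data = ["a", "b", "c", "d"]
padded = [data[i] for i in reflect_indices(len(data), pad_start=2, pad_end=0)]
print(padded)
print(normalize_unsqueeze_axis(-1, 3))
```

For the message's example (`data: [a,b,c,d]`, `pad_start: 2`), the cycle has length 6 and the sketch skips 4 entries, which gives the "expected" row `[c b| a b c d]` rather than the "actual" row produced by skipping only 2.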