author | ivarflakstad <69173633+ivarflakstad@users.noreply.github.com> | 2024-03-07 09:42:34 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-03-07 09:42:34 +0100 |
commit | 0c09d10f320df7c23fc231f7f400967b03d9b9da (patch) | |
tree | 194297a74d4f1cf695a32a9bee6c518e607adfe1 /candle-nn | |
parent | 8a99cf7dd2e0d2ff9cb18232272dad380d887f2d (diff) | |
Improve metal buffer usage (#1807)
* Improve metal buffer usage
* Clone cpu storage when loading to reduce wait_until_complete calls
* Use powers of two for buffer sizes so reuse is more likely.
* Select best available buffer by size.
* Add count to MetalStorage -> can use buffer with different size
Co-authored-by: Chris Fleetwood <christopher.fleetwood@huggingface.co>
* Simplify new buffer creation without blit copy. Revert &[] -> Vec
* Add documentation on newBufferWithBytes safety / synchronization
* Drop unused buffers after command buffer is done syncing.
---------
Co-authored-by: Chris Fleetwood <christopher.fleetwood@huggingface.co>
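The two allocation ideas in the message above (rounding buffer sizes up to powers of two, then selecting the best available buffer by size) can be sketched as follows. This is a minimal illustration and not the code from this commit: `Buffer`, `bucket_size`, and `best_fit` are hypothetical names, and a plain struct stands in for a real Metal buffer.

```rust
/// Hypothetical stand-in for a pooled GPU buffer (not a candle or Metal type).
#[derive(Debug, Clone)]
struct Buffer {
    id: usize,
    capacity: usize, // allocated size in bytes
    in_use: bool,    // true while some tensor still holds the buffer
}

/// Round a requested byte size up to the next power of two, so that
/// requests of similar size land in the same bucket and can share buffers.
fn bucket_size(requested: usize) -> usize {
    requested.max(1).next_power_of_two()
}

/// Pick the smallest free buffer that is large enough for the request.
/// Returns `None` when no free buffer fits and a new one must be allocated.
fn best_fit(pool: &[Buffer], requested: usize) -> Option<&Buffer> {
    pool.iter()
        .filter(|b| !b.in_use && b.capacity >= requested)
        .min_by_key(|b| b.capacity)
}
```

With bucketed sizes, a 1000-byte request and a 1024-byte request both map to a 1024-byte buffer, which is why reuse becomes more likely. The trade-off is that a buffer can be up to roughly twice as large as the data it holds.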
Diffstat (limited to 'candle-nn')
-rw-r--r-- | candle-nn/src/ops.rs | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/candle-nn/src/ops.rs b/candle-nn/src/ops.rs
index aaec8b56..fdd67142 100644
--- a/candle-nn/src/ops.rs
+++ b/candle-nn/src/ops.rs
@@ -238,7 +238,8 @@ impl candle::CustomOp1 for SoftmaxLastDim {
             &output,
         )
         .unwrap();
-        let newstorage = candle::MetalStorage::new(output, device.clone(), storage.dtype());
+        let newstorage =
+            candle::MetalStorage::new(output, device.clone(), elem_count, storage.dtype());
         Ok((newstorage, layout.shape().clone()))
     }
 }
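The change above passes an explicit `elem_count` when constructing the storage. One plausible reading, given the commit message, is that a reused buffer may be larger than the data it holds, so the logical element count has to be tracked separately from the buffer length. The sketch below illustrates that idea only; `Storage` and its fields are hypothetical and do not reproduce `candle::MetalStorage`.

```rust
/// Hypothetical simplified storage (not `candle::MetalStorage`): the backing
/// buffer may be larger than the data, so the element count is stored too.
#[derive(Debug)]
struct Storage {
    buffer_bytes: usize, // allocated capacity, possibly rounded up
    count: usize,        // number of valid elements in the buffer
    elem_size: usize,    // size of one element in bytes
}

impl Storage {
    /// Build a storage, returning `None` if `count` elements of `elem_size`
    /// bytes each would not fit in the buffer (or the product overflows).
    fn new(buffer_bytes: usize, count: usize, elem_size: usize) -> Option<Self> {
        let needed = count.checked_mul(elem_size)?;
        if needed > buffer_bytes {
            return None;
        }
        Some(Self { buffer_bytes, count, elem_size })
    }

    /// Bytes that actually hold data; anything past this is unused padding.
    fn valid_bytes(&self) -> usize {
        self.count * self.elem_size
    }

    /// Unused bytes at the end of the buffer.
    fn padding_bytes(&self) -> usize {
        self.buffer_bytes - self.valid_bytes()
    }
}
```

For example, 1000 `f32` values stored in a 4096-byte buffer occupy 4000 valid bytes and leave 96 bytes of padding. Code that derived the element count from the buffer length alone would read 1024 elements instead of 1000.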