path: root/src/passes/MemoryPacking.cpp
Commit message | Author | Age | Files | Lines
* Rename indexType -> addressType. NFC (#7060)Sam Clegg2024-11-071-4/+4
| | | See https://github.com/WebAssembly/memory64/pull/92
* [NFC] Standardize Super:: over super:: (#6920)Alon Zakai2024-09-101-1/+1
| | | | As the name of a class, uppercase seems better here.
* Respect the Web limitation on Table size (#6567)Alon Zakai2024-05-011-0/+1
| | | | | Without this the fuzzer can error on differences in behavior between V8 and us. Also move the limit constants to their own header.
* MemoryPacking: Handle non-empty trapping segments (#6261)Alon Zakai2024-02-011-9/+68
| | | Followup to #6243 which handled empty ones.
* MemoryPacking: Ignore empty segments (#6243)Alon Zakai2024-01-251-0/+7
| | | | | | They might trap. Leave that for RemoveUnusedModuleElements. Fixes #6230
* Remove empty _ARRAY/_VECTOR defines (NFC) (#6182)Heejin Ahn2023-12-141-3/+0
| | | | | | | `_VECTOR` or `_ARRAY` defines in `wasm-delegations-fields.def` are supposed to be defined in terms of their non-vector/array counterparts when undefined. This removes empty `_VECTOR`/`_ARRAY` defines when including `wasm-delegations-fields.def`, while adding definitions for `DELEGATE_GET_FIELD` in case it is missing.
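
The defaulting mechanism described above can be illustrated with a small standalone sketch. The macro names here (FIELD, FIELD_VECTOR, DEF_FIELDS) are hypothetical stand-ins, not Binaryen's real DELEGATE_* macros: the point is only that when the including file leaves the vector variant undefined, the listing falls back to the scalar macro.

    #include <iostream>

    // The including file only defines the scalar macro...
    #define FIELD(name) std::cout << "field: " #name "\n";

    // ...and the ".def"-style listing supplies a default for the vector
    // variant when it was left undefined, then lists the fields.
    #ifndef FIELD_VECTOR
    #define FIELD_VECTOR(name) FIELD(name)
    #endif
    #define DEF_FIELDS   \
      FIELD(offset)      \
      FIELD_VECTOR(children)

    int main() {
      DEF_FIELDS  // prints both fields via the scalar macro
    }
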
* Rename multimemory flag (#5890)Ashley Nelson2023-08-211-1/+1
| | | Renaming the multimemory flag in Binaryen to match its naming in LLVM.
* MemoryPacking: memory.init fixes for 64 bit (#5809)Arthur Islamov2023-07-181-12/+15
| | | | | | Fixes emscripten-core/emscripten#17485 This allows emscripten to compile code with MEMORY64 + PTHREADS by using the proper pointer type in the MemoryPacking pass.
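
A rough standalone illustration of the pointer-type issue; the types below are hypothetical stand-ins, not Binaryen's API. Offsets into a 64-bit memory need i64-width constants, so the pass must consult the memory's address type rather than defaulting to i32.

    #include <cstdint>
    #include <iostream>

    // Hypothetical stand-ins for illustration only.
    enum class AddressType { i32, i64 };

    struct MemoryInfo {
      AddressType addressType;
    };

    // Build the byte-offset constant with the width the memory expects.
    // Under memory64 the value may exceed 32 bits, so truncating to
    // uint32_t (as an i32-only pass would do) silently corrupts it.
    uint64_t makeOffsetConst(const MemoryInfo& mem, uint64_t offset) {
      if (mem.addressType == AddressType::i32) {
        return static_cast<uint32_t>(offset); // i32 constant
      }
      return offset; // i64 constant
    }

    int main() {
      MemoryInfo mem64{AddressType::i64};
      std::cout << makeOffsetConst(mem64, 5'000'000'000ull) << "\n"; // kept intact
    }
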
* [NFC] Track the kinds of items that names refer to in wasm-delegations-fields (#5690)Alon Zakai2023-05-051-12/+29
| | | | | | | | | | | | This makes delegations-fields track Kinds. That is, rather than say a field is just a Name, we can say it is a name of kind Function. This allows users to track references to functions, tables, memories, etc., in a simple and generic way, avoiding duplicated code which we have at the moment. (In particular this will help wasm-merge in the future.) This also uses that functionality in two small places to show the benefits (see memory-utils.cpp and MemoryPacking.cpp).
* [NFC] Refactor each of ArrayNewSeg and ArrayInit into subclasses for Data/Elem (#5692)Alon Zakai2023-05-041-15/+9
| | | | | | | | | | | ArrayNewSeg => ArrayNewSegData, ArrayNewSegElem; ArrayInit => ArrayInitData, ArrayInitElem. Basically we remove the opcode and use the class type to differentiate them. This adds some code, but it makes the representation simpler and more compact in memory, and it will help with #5690.
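
A minimal sketch of the shape of this refactor. ArrayNewSegData and ArrayNewSegElem are the class names from the commit; the base class and its method here are hypothetical, only showing how the variant moves from an opcode field into the type itself.

    #include <iostream>
    #include <memory>

    // Instead of one class carrying an opcode field, each variant becomes
    // its own class, so the variant is encoded in the type itself.
    struct ArrayNewSegBase {
      virtual ~ArrayNewSegBase() = default;
      virtual const char* what() const = 0;
    };
    struct ArrayNewSegData : ArrayNewSegBase {
      const char* what() const override { return "array.new_data"; }
    };
    struct ArrayNewSegElem : ArrayNewSegBase {
      const char* what() const override { return "array.new_elem"; }
    };

    int main() {
      std::unique_ptr<ArrayNewSegBase> e = std::make_unique<ArrayNewSegElem>();
      std::cout << e->what() << "\n"; // the class, not an opcode, says which
    }
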
* Fix MemoryPacking handling of array.init_data (#5644)Thomas Lively2023-04-071-5/+7
| | | | | | | | Do not optimize out or split segments that are referred to by array.init_data instructions. Fixes a bug where segments could get optimized out, producing invalid modules. Doing the work to actually split segments used by array.init_data is left for the future. Also fix a latent UBSan failure revealed by the new test case.
* Fix a crash in MemoryPacking due to an unreachable pointer (#5623)Thomas Lively2023-04-051-1/+1
| | | | | | | | Previously, the pointer type for newly emitted instructions was determined by the type of the destination pointer on a memory.init instruction, but that did not take into account that the destination pointer may be unreachable. Properly look up the pointer type on the memory instead to fix the problem. Fixes #5620.
* Use Names instead of indices to identify segments (#5618)Thomas Lively2023-04-041-49/+38
| | | | | | | | | | All top-level Module elements are identified and referred to by Name, but for historical reasons element and data segments were referred to by index instead. Fix this inconsistency by using Names to refer to segments from expressions that use them. Also parse and print segment names like we do for other elements. The C API is partially converted to use names instead of indices, but there are still many functions that refer to data segments by index. Finishing the conversion can be done in the future once it becomes necessary.
* Support memory64 in MemoryPacking (#5605)Thomas Lively2023-03-291-2/+5
| | | | | | Fix the relevant pointer and size expressions produced by MemoryPacking to be i64s when working with 64-bit memories. Fixes #5578.
* Update MemoryPacking for array.new_data (#5229)Thomas Lively2022-11-081-26/+46
| | | | | | | | | | | | | | | | | | | * Update MemoryPacking for array.new_data The MemoryPacking pass looks at all instructions that reference memory segments to determine how they can be optimized. #5214 introduced a new instruction that references memory segments, array.new_data, but did not update MemoryPacking accordingly. This omission meant that MemoryPacking could produce invalid or misoptimized modules in the presence of array.new_data. Fix the problem by making MemoryPacking aware of array.new_data. Consider array.new_data when determining whether a segment is used and update array.new_data to reflect the new, optimized segment numberings afterward. To keep things simple, do not try to split any segment that is referred to by an array.new_data instruction. * fix * Add test explanations
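
A standalone sketch of the analysis this describes, using toy types rather than the pass's real data structures: every instruction kind that names a data segment, now including array.new_data, must both count as a use and be rewritten to the packed segment numbering afterwards.

    #include <cstdint>
    #include <iostream>
    #include <unordered_map>
    #include <vector>

    // Hypothetical instruction model: anything that names a segment by index.
    struct SegmentUser {
      enum Kind { MemoryInit, DataDrop, ArrayNewData } kind;
      uint32_t segment;
    };

    int main() {
      std::vector<SegmentUser> users = {
          {SegmentUser::MemoryInit, 0},
          {SegmentUser::ArrayNewData, 2}, // must count as a use too
      };

      // Pass 1: which segments are referenced at all?
      std::vector<bool> used(3, false);
      for (auto& u : users) used[u.segment] = true;

      // Pass 2: renumber surviving segments and rewrite every user,
      // including array.new_data, to the new indices.
      std::unordered_map<uint32_t, uint32_t> remap;
      uint32_t next = 0;
      for (uint32_t i = 0; i < used.size(); ++i)
        if (used[i]) remap[i] = next++;
      for (auto& u : users) u.segment = remap.at(u.segment);

      for (auto& u : users) std::cout << u.segment << "\n"; // 0, 1
    }
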
* Make `Name` a pointer, length pair (#5122)Thomas Lively2022-10-111-2/+1
| | | | | | | | | | | | | | | | | | | | | | | With the goal of supporting null characters (i.e. zero bytes) in strings. Rewrite the underlying interned `IString` to store a `std::string_view` rather than a `const char*`, reduce the number of map lookups necessary to intern a string, and present a more immutable interface. Most importantly, replace the `c_str()` method that returned a `const char*` with a `toString()` method that returns a `std::string`. This new method can correctly handle strings containing null characters. A `const char*` can still be had by calling `data()` on the `std::string_view`, although this usage should be discouraged. This change is NFC in spirit, although not in practice. It does not intend to support any particular new functionality, but it is probably now possible to use strings containing null characters in at least some cases. At least one parser bug is also incidentally fixed. Follow-on PRs will explicitly support and test strings containing nulls for particular use cases. The C API still uses `const char*` to represent strings. As strings containing nulls become better supported by the rest of Binaryen, this will no longer be sufficient. Updating the C and JS APIs to use pointer, length pairs is left as future work.
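
A minimal standalone sketch of the idea, not the real IString/Name classes: storing a pointer-plus-length view and converting it with a toString()-style method preserves embedded null bytes that a c_str()-based interface would cut off.

    #include <iostream>
    #include <string>
    #include <string_view>
    #include <unordered_set>

    // Minimal sketch (not the real IString): interned strings keep a
    // string_view into stable storage, so embedded '\0' bytes survive,
    // unlike a c_str()-style interface that stops at the first null.
    struct InternedString {
      std::string_view str;
      // Copies the full pointer+length range, nulls included.
      std::string toString() const { return std::string(str); }
    };

    int main() {
      static const std::unordered_set<std::string>* storage =
          new std::unordered_set<std::string>{std::string("ab\0cd", 5)};
      InternedString name{*storage->begin()};
      std::cout << name.toString().size() << "\n"; // 5, not 2
    }
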
* Refactor interaction between Pass and PassRunner (#5093)Thomas Lively2022-09-301-21/+20
| | | | | | | | | | | | | | Previously only WalkerPasses had access to the `getPassRunner` and `getPassOptions` methods. Move those methods to `Pass` so all passes can use them. As a result, the `PassRunner` passed to `Pass::run` and `Pass::runOnFunction` is no longer necessary, so remove it. Also update `Pass::create` to return a unique_ptr, which is more efficient than having it return a raw pointer only to have the `PassRunner` wrap that raw pointer in a `unique_ptr`. Delete the unused template `PassRunner::getLast()`, which looks like it was intended to enable retrieving previous analyses and has been in the code base since 2015 but is not implemented anywhere.
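
A minimal sketch of the ownership point, with hypothetical classes rather than Binaryen's real Pass/PassRunner: create() hands back a unique_ptr directly, so the runner never needs to wrap a raw pointer after the fact.

    #include <iostream>
    #include <memory>

    struct Pass {
      virtual ~Pass() = default;
      virtual void run() = 0;
    };

    struct MyPass : Pass {
      void run() override { std::cout << "running\n"; }
      // Returning unique_ptr makes ownership explicit from the start.
      static std::unique_ptr<Pass> create() { return std::make_unique<MyPass>(); }
    };

    int main() {
      auto pass = MyPass::create();
      pass->run();
    }
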
* [Wasm GC] Support non-nullable locals in the "1a" form (#4959)Alon Zakai2022-08-311-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An overview of this is in the README in the diff here (conveniently, it is near the top of the diff). Basically, we fix up nn locals after each pass, by default. This keeps things easy to reason about - what validates is what is valid wasm - but there are some minor nuances as mentioned there, in particular, we ignore nameless blocks (which are commonly added by various passes; ignoring them means we can keep more locals non-nullable). The key addition here is LocalStructuralDominance which checks which local indexes have the "structural dominance" property of 1a, that is, that each get has a set in its block or an outer block that precedes it. I optimized that function quite a lot to reduce the overhead of running that logic after each pass. The overhead is something like 2% on J2Wasm and 0% on Dart (0%, because in this mode we shrink code size, so there is less work actually, and it balances out). Since we run fixups after each pass, this PR removes logic to manually call the fixup code from various places we used to call it (like eh-utils and various passes). Various passes are now marked as requiresNonNullableLocalFixups => false. That lets us skip running the fixups after them, which we normally do automatically. This helps avoid overhead. Most passes still need the fixups, though - any pass that adds a local, or a named block, or moves code around, likely does. This removes a hack in SimplifyLocals that is no longer needed. Before we worked to avoid moving a set into a try, as it might not validate. Now, we just do it and let fixups happen automatically if they need to: in the common code they probably don't, so the extra complexity seems not worth it. Also removes a hack from StackIR. That hack tried to avoid roundtrip adding a nondefaultable local. But we have the logic to fix that up now, and opts will likely keep it non-nullable as well. Various tests end up updated here because now a local can be non-nullable - previous fixups are no longer needed. Note that this doesn't remove the gc-nn-locals feature. That has been useful for testing, and may still be useful in the future - it basically just allows nn locals in all positions (that can't read the null default value at the entry). We can consider removing it separately. Fixes #4824
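
A toy standalone sketch of the "1a" structural-dominance property described above, not the real LocalStructuralDominance code: walking blocks in order, a local.get passes only if a local.set of the same index was already seen in the current block or an enclosing one, and sets inside a closed inner block do not count outside it.

    #include <cstdint>
    #include <iostream>
    #include <vector>

    // Toy IR: a block is a list of items; each item is either an instruction
    // touching a local index, or a nested block.
    struct Item;
    struct Block { std::vector<Item> items; };
    struct Item {
      enum Kind { Set, Get, Sub } kind;
      uint32_t index = 0;   // for Set/Get
      Block sub;            // for Sub
    };

    // Returns false if some get is not structurally dominated by a set.
    // 'setsPerScope' has one entry per open block; a set is visible to gets
    // that come later in the same block or in blocks nested inside it.
    bool check(const Block& block, std::vector<std::vector<bool>>& setsPerScope,
               uint32_t numLocals) {
      setsPerScope.emplace_back(numLocals, false);
      bool ok = true;
      for (auto& item : block.items) {
        if (item.kind == Item::Set) {
          setsPerScope.back()[item.index] = true;
        } else if (item.kind == Item::Get) {
          bool seen = false;
          for (auto& scope : setsPerScope) seen = seen || scope[item.index];
          ok = ok && seen;
        } else {
          ok = check(item.sub, setsPerScope, numLocals) && ok;
        }
      }
      setsPerScope.pop_back(); // sets in this block don't count outside it
      return ok;
    }

    int main() {
      // (block (local.set 0) (block (local.get 0)))  -- ok
      Block inner{{{Item::Get, 0}}};
      Block outer{{{Item::Set, 0}, {Item::Sub, 0, inner}}};
      std::vector<std::vector<bool>> scopes;
      std::cout << (check(outer, scopes, 1) ? "ok" : "bad") << "\n";
    }
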
* Multi-Memories Support in IR (#4811)Ashley Nelson2022-08-171-14/+22
| | | | | | | This PR removes the single memory restriction in IR, adding support for a single module to reference multiple memories. To support this change, a new memory name field was added to 13 memory instructions in order to identify the memory for the instruction. It is a goal of this PR to maintain backwards compatibility with existing text and binary wasm modules, so memory indexes remain optional for memory instructions. Similarly, the JS API makes assumptions about which memory is intended when only one memory is present in the module. Another goal of this PR is that existing tests' behavior be unaffected. That said, tests must now explicitly define a memory before invoking memory instructions or exporting a memory, and memory names are now printed for each memory instruction in the text format. There remain quite a few places where a hardcoded reference to the first memory persists (memory flattening, for example, will return early if more than one memory is present in the module). Many of these call-sites, particularly within passes, will require us to rethink how the optimization works in a multi-memories world. Other call-sites may necessitate more invasive code restructuring to fully convert away from relying on a globally available, single memory pointer.
* First class Data Segments (#4733)Ashley Nelson2022-06-211-55/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Updating wasm.h/cpp for DataSegments * Updating wasm-binary.h/cpp for DataSegments * Removed link from Memory to DataSegments and updated module-utils, Metrics and wasm-traversal * checking isPassive when copying data segments to know whether to construct the data segment with an offset or not * Removing memory member var from DataSegment class as there is only one memory rn. Updated wasm-validator.cpp * Updated wasm-interpreter * First look at updating Passes * Updated wasm-s-parser * Updated files in src/ir * Updating tools files * Last pass on src files before building * added visitDataSegment * Fixing build errors * Data segments need a name * fixing var name * ran clang-format * Ensuring a name on DataSegment * Ensuring more datasegments have names * Adding explicit name support * Fix fuzzing name * Outputting data name in wasm binary only if explicit * Checking temp dataSegments vector to validateBinary because it's the one with the segments before we processNames * Pass on when data segment names are explicitly set * Ran auto_update_tests.py and check.py, success all around * Removed an errant semi-colon and corrected a counter. Everything still passes * Linting * Fixing processing memory names after parsed from binary * Updating the test from the last fix * Correcting error comment * Impl kripken@ comments * Impl tlively@ comments * Updated tests that remove data print when == 0 * Ran clang format * Impl tlively@ comments * Ran clang-format
* Fix MemoryPacking bug (#4579)Thomas Lively2022-04-051-2/+1
| | | | | | | | 247f4c20a1 introduced a bug that caused expressions that refer to data segments to be associated with the wrong segments in the presence of other segments that have no referring expressions at all. Fixes #4569. Fixes #4571.
* [MemoryPacking] Remove workaround for old V8 bug (#4558)Thomas Lively2022-03-311-10/+1
| | | | | | | V8 used to incorrectly parse the segment index on bulk memory instructions as a signed LEB rather than an unsigned LEB, which meant it only worked correctly with up to 63 data segments. That bug has been fixed for over two years now, so lift the maximum number of segments generated by memory packing from 63 to the specified max, 100k.
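
A small standalone illustration of why exactly 63 was the old limit: for a single LEB byte, unsigned and signed decoding agree for 0..63 but diverge at 64, where the signed decoder sign-extends bit 6.

    #include <cstdint>
    #include <iostream>

    // One-byte LEB128 decoders (enough for indices < 128).
    uint32_t decodeUnsigned(uint8_t byte) { return byte & 0x7f; }

    int32_t decodeSigned(uint8_t byte) {
      int32_t value = byte & 0x7f;
      if (value & 0x40) value |= ~0x3f; // sign-extend bit 6
      return value;
    }

    int main() {
      for (uint8_t b : {63, 64}) {
        std::cout << "byte " << int(b) << ": unsigned=" << decodeUnsigned(b)
                  << " signed=" << decodeSigned(b) << "\n";
      }
      // byte 63: unsigned=63 signed=63
      // byte 64: unsigned=64 signed=-64  <- the old misparse
    }
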
* [NFC][MemoryPacking] Avoid unnecessarily enormous memory allocations (#4556)Thomas Lively2022-03-301-26/+33
| | | | | | | | | | | The memory packing pass previously used vectors with a slot for each data segment to easily map segment indices to lists of "referrers," i.e. bulk memory expressions that referred to the segments. The parallel analysis pass to collect these referrers allocated one of these vectors for each function in a module. Unfortunately, for a particularly large module with over 200k functions and over 75k data segments, this resulted in hundreds of gigabytes of memory allocations. The vast majority of functions contain no referrers, so using a std::unordered_map makes much more sense than using a vector to perform this mapping.
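
A rough sketch of the arithmetic behind this change, with hypothetical type names: a dense per-function vector pays for every segment slot whether or not it is used, while a per-function map only pays for segments that actually have referrers.

    #include <cstdint>
    #include <iostream>
    #include <unordered_map>
    #include <vector>

    struct Expression;  // stand-in for a referring instruction

    // Dense: one slot per segment, per function -- mostly empty vectors.
    using DenseReferrers = std::vector<std::vector<Expression*>>;

    // Sparse: only segments that are actually referenced get an entry.
    using SparseReferrers = std::unordered_map<uint32_t, std::vector<Expression*>>;

    int main() {
      const size_t numFunctions = 200'000, numSegments = 75'000;
      // The dense layout allocates numFunctions * numSegments inner vectors;
      // each is roughly 3 pointers of bookkeeping on common implementations,
      // before storing anything useful.
      double denseGiB = double(numFunctions) * numSegments * 3 * sizeof(void*) /
                        (1024.0 * 1024.0 * 1024.0);
      std::cout << "dense bookkeeping alone: ~" << denseGiB << " GiB\n"; // ~335 GiB
      // The sparse map starts empty per function and grows only with real uses.
    }
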
* Support 64-bit data segment init-exps in Memory64 (#3593)Wouter van Oortmerssen2021-02-251-3/+8
| | | This is a consequence of https://reviews.llvm.org/D95651
* MemoryPacking: Preserve segment names (#3458)Sam Clegg2020-12-181-4/+29
| | | | | Also, avoid packing builtin llvm segment names so that segments such as `__llvm_covfun` (used by llvm-cov) are preserved in the final output.
* MemoryPacking: Properly notice zeroFilledMemory (#3306)Alon Zakai2020-11-021-9/+19
| | | We can only pack memory if we know it is zero-filled before us.
* Avoid name collisions in MemoryPacking (#3265)Alon Zakai2020-10-201-4/+3
| | | | | | | | | Such a collision can happen if we run the pass twice, and somehow it finds more to optimize. To make this easy, add a general utility for getting a unique name based on a root + a numeric suffix to avoid collisions. Fixes the second testcase in #3225
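
A minimal sketch of the root-plus-numeric-suffix idea; the helper name here is hypothetical and not necessarily the exact Binaryen utility: try the root, then root_0, root_1, ... until no collision remains.

    #include <iostream>
    #include <string>
    #include <unordered_set>

    // Hypothetical helper: returns `root` if unused, otherwise the first
    // `root_<n>` that does not collide with an existing name.
    std::string getUniqueName(const std::string& root,
                              const std::unordered_set<std::string>& used) {
      if (!used.count(root)) return root;
      for (unsigned i = 0;; ++i) {
        std::string candidate = root + "_" + std::to_string(i);
        if (!used.count(candidate)) return candidate;
      }
    }

    int main() {
      std::unordered_set<std::string> used = {"seg", "seg_0"};
      std::cout << getUniqueName("seg", used) << "\n"; // seg_1
    }
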
* Warn on memory segment overlaps (#3257)Alon Zakai2020-10-201-0/+2
| | | | | We may fix this eventually, but it appears to not be urgent. For now at least show a warning so toolchains have a chance to see there is something they should fix.
* Fuzz fix for MemoryPacking on trampled data (#3222)Alon Zakai2020-10-151-3/+66
| | | | | | | | | | | | | I believe originally wasm did not allow overlapping segments, that is, where one memory segment tramples the data from a previous one. But then the spec changed its mind and we allowed it. Binaryen seems to have assumed the original case, and not checked for trampling. If there is a chance of trampling, we cannot optimize out zeros - the zero may have an effect if it tramples data from a previous segment. This does not occur in practice in LLVM output, which is why this wasn't a problem so far, I think. An existing testcase hit this issue, so I split it up.
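
A standalone sketch of the trampling hazard with made-up numbers: if a later segment overlaps an earlier one, its zero bytes are meaningful overwrites and cannot be stripped, and a simple interval check detects the situation.

    #include <cstdint>
    #include <iostream>
    #include <vector>

    struct Seg { uint64_t offset, size; };

    // True if [b.offset, b.offset+b.size) overlaps any earlier segment,
    // i.e. b's bytes (zeros included) may overwrite earlier data.
    bool tramplesEarlier(const std::vector<Seg>& earlier, const Seg& b) {
      for (auto& a : earlier) {
        if (b.offset < a.offset + a.size && a.offset < b.offset + b.size) {
          return true;
        }
      }
      return false;
    }

    int main() {
      std::vector<Seg> earlier = {{0, 16}};      // writes bytes [0,16)
      Seg later = {8, 16};                       // writes bytes [8,24), overlaps
      std::cout << (tramplesEarlier(earlier, later)
                        ? "keep zeros: they overwrite earlier data\n"
                        : "safe to strip zeros\n");
    }
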
* [MemoryPacking] Emit the correct segment indices on memory.init (#3239)Thomas Lively2020-10-141-1/+6
| | | | | | | | | | | This PR fixes a bug in which the segment index of a memory.init instruction was incorrect in some circumstances. Specifically, the first segment index used in output memory.init instructions was always the index of the first segment created from splitting up the corresponding input segment. This was incorrect when the input memory.init had an offset that caused it to skip over that first emitted segment so that the first output memory.init should have referred to a subsequent output segment. Fixes #3225.
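
A worked standalone sketch of the indexing fix with made-up split points: a memory.init whose offset lands inside a later piece of a split segment must name that piece's output index, not the index of the first piece produced from the input segment.

    #include <cstdint>
    #include <iostream>
    #include <vector>

    // Each piece of a split segment: where it started in the original data,
    // and the index it was assigned in the packed output.
    struct Piece { uint64_t start, size; uint32_t newIndex; };

    // Find the output segment a memory.init with this offset should name.
    uint32_t segmentForOffset(const std::vector<Piece>& pieces, uint64_t offset) {
      for (auto& p : pieces) {
        if (offset >= p.start && offset < p.start + p.size) return p.newIndex;
      }
      return pieces.back().newIndex; // simplistic fallback for the sketch
    }

    int main() {
      // Input segment split into two pieces (a zero gap between them dropped).
      std::vector<Piece> pieces = {{0, 100, 4}, {200, 50, 5}};
      // An init reading from original offset 210 skips piece 4 entirely...
      std::cout << segmentForOffset(pieces, 210) << "\n"; // ...and must use 5
    }
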
* Refactor naming convention for functions handling tuples (#3196)Max Graey2020-10-091-2/+1
| | | When there are two versions of a function, one handling tuples and the other handling non-tuple values, the previous naming convention was to have "Single" in the name of the non-tuple handling function. This PR simplifies the convention and shortens function names by making the names plural for the tuple-handling version and singular for the non-tuple-handling version.
* Initial implementation of "Memory64" proposal (#3130)Wouter van Oortmerssen2020-09-181-10/+12
| | | Also includes a lot of new spec tests that eventually need to go into the spec repo
* Refactor Host expression to MemorySize and MemoryGrow (#3137)Daniel Wirtz2020-09-171-3/+2
| | | Aligns the internal representations of `memory.size` and `memory.grow` with other more recent memory instructions by removing the legacy `Host` expression class and adding separate expression classes for `MemorySize` and `MemoryGrow`. Simplifies related APIs, but is also a breaking API change.
* Add a builder.makeConst helper template (#2971)Alon Zakai2020-07-211-11/+9
|
* Update Precompute to handle tuples (#2687)Thomas Lively2020-03-101-1/+2
| | | | | | This involves replacing `Literal::makeZero` with `Literal::makeZeroes` and `Literal::makeSingleZero` and updating `isConstantExpression` to handle constant tuples as well. Also makes `Literals` its own struct and adds convenience methods on it.
* Limit the number of passive segments to work around a Chrome bug (#2613)Thomas Lively2020-01-221-1/+9
| | | | | | Chrome is currently decoding the segment indices as signed numbers, so some ranges of indices greater than 63 do not work. As a temporary workaround, limit the number of segments produced by MemoryPacking to 63 when bulk-memory is enabled.
* Optimize passive segments in memory-packing (#2426)Thomas Lively2020-01-151-111/+608
| | | | | | | | | When memory is packed and there are passive segments, bulk memory operations that reference those segments by index need to be updated to reflect the new indices and possibly split into multiple instructions that reference multiple split segments. For some bulk-memory operations, it is necessary to introduce new globals to explicitly track the drop state of the original segments, but this PR is careful to only add globals where necessary.
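
A rough standalone sketch of the drop-state idea, using hypothetical shapes rather than the pass's actual output: once one passive segment becomes several output segments, a data.drop of the original has to drop every piece, and an added global flag can stand in for the original segment's drop state where code depends on it.

    #include <cstdint>
    #include <iostream>
    #include <string>
    #include <vector>

    // Pieces that one original passive segment was split into.
    std::vector<uint32_t> pieces = {7, 8, 9};

    // Hypothetical global introduced to remember the original segment's
    // drop state across the whole module.
    bool origDropped = false;

    // Lowering of `data.drop $orig`: drop each replacement piece and record it.
    std::vector<std::string> lowerDataDrop() {
      std::vector<std::string> out;
      for (uint32_t p : pieces) out.push_back("data.drop " + std::to_string(p));
      origDropped = true; // corresponds to a global.set of the tracking flag
      return out;
    }

    int main() {
      for (auto& inst : lowerDataDrop()) std::cout << inst << "\n";
      std::cout << "origDropped=" << origDropped << "\n";
    }
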
* Fix memory size calculation in MemoryPacking (#2548)Heejin Ahn2019-12-201-1/+6
| | | | Because `memory.size` returns the size in number of pages, we have to multiply the size by the page size when converting `memory.init`.
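
Concretely, as a tiny standalone sketch: wasm pages are 64 KiB, so a byte count derived from memory.size has to be the page count times 65536.

    #include <cstdint>
    #include <iostream>

    constexpr uint64_t WasmPageSize = 65536; // bytes per wasm page

    int main() {
      uint64_t memorySizeInPages = 3;        // what memory.size reports
      uint64_t memorySizeInBytes = memorySizeInPages * WasmPageSize;
      std::cout << memorySizeInBytes << "\n"; // 196608
    }
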
* Fix trapping and dangling insts in memory packing (#2540)Heejin Ahn2019-12-191-4/+14
| | | | | | | | | | | This does two things: - Restore the `visitDataDrop` handler deleted in #2529, but now convert invalid `data.drop`s to `nop` rather than `unreachable`. This conforms to the revised spec, which says `data.drop` on an active segment can be treated as a nop. - Make `visitMemoryInit` trap if the offset or size is not equal to 0 or if the dest address is out of bounds; otherwise drop all its arguments. Fixes #2535.
* Implement 0-len/drop spec changes in bulk memory (#2529)Heejin Ahn2019-12-161-6/+0
| | | | | | | | | | | | | | | | | | | | | This implements recent bulk memory spec changes (WebAssembly/bulk-memory-operations#126) in Binaryen. Now `data.drop` is equivalent to shrinking a segment size to 0, and dropping already dropped segments or active segments (which are thought to be dropped in the beginning) is treated as a no-op. And all bounds checking is performed in advance, so partial copying/filling/initializing does not occur. I tried to implement `visitDataDrop` in the interpreter as `segment.data.clear();`, which is exactly what the revised spec says. I didn't end up doing that because this also deletes all contents from active segments, and there are cases where we shouldn't do that: - `wasm-ctor-eval` shouldn't delete active segments, because it will store the changed contents back into segments - When `--fuzz-exec` is given to `wasm-opt`, it runs the module and compares the execution call results before and after transformations. But if running a module will nullify all active segments, applying any transformation to the module or re-running it does not make any sense.
* Re-land #2235 with fixes (#2245)Thomas Lively2019-07-201-4/+42
| | | | #2242 had exposed the bug that the `Trapper` pass was defining `walkFunction` when it should have been defining `doWalkFunction`.
* Revert "Remove bulk memory instructions refering to active segments (#2235)" ↵Thomas Lively2019-07-191-40/+4
| | | | | (#2244) This reverts commit 72c52ea7d4eb61b95cf8a5164947cb760fe42e9c, which was causing test failures after it merged.
* Remove bulk memory instructions referring to active segments (#2235)Thomas Lively2019-07-191-4/+40
| | | | This prevents those instructions from becoming invalid due to memory packing optimizations and is also a code size win. Fixes #2227.
* Loosen conditions on MemoryPacking (#2205)Thomas Lively2019-07-031-4/+12
| | | | Allow MemoryPacking to run when there are no passive segments, even if bulk memory is enabled.
* clang-tidy braces changes (#2075)Alon Zakai2019-05-011-1/+2
| | | Applies the changes in #2065, and temporarily disables the hook since it's too slow to run on a change this large. We should re-enable it in a later commit.
* Apply format changes from #2048 (#2059)Alon Zakai2019-04-261-7/+9
| | | Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
* Move features from passOptions to Module (#2001)Thomas Lively2019-04-121-1/+1
| | | | | This allows us to emit a (potentially modified) target features section and conditionally emit other sections such as the DataCount section based on the presence of features.
* Move segment merging to fit web limits into its own pass (#1980)Thomas Lively2019-04-081-35/+69
| | | | | | It was previously part of writing a binary, but changing the number of segments at such a late stage would not work in the presence of bulk memory's datacount section. Also updates the memory packing pass to respect the web's limits on the number of data segments.
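
A rough standalone sketch of the merging step with a made-up limit and layout: when packing produces more active segments than allowed, the surplus tail can be folded into one segment spanning the covered range, zero-filling the gaps.

    #include <algorithm>
    #include <cstdint>
    #include <iostream>
    #include <vector>

    struct Seg { uint64_t offset; std::vector<uint8_t> data; };

    // Fold all segments from index `limit - 1` onward into one segment whose
    // data spans from the first folded offset to the last folded byte,
    // zero-filling the gaps. Assumes segments are sorted and non-overlapping.
    void mergeTail(std::vector<Seg>& segs, size_t limit) {
      if (segs.size() <= limit) return;
      size_t first = limit - 1;
      uint64_t start = segs[first].offset;
      uint64_t end = start;
      for (size_t i = first; i < segs.size(); ++i) {
        end = std::max(end, segs[i].offset + segs[i].data.size());
      }
      Seg merged{start, std::vector<uint8_t>(end - start, 0)};
      for (size_t i = first; i < segs.size(); ++i) {
        std::copy(segs[i].data.begin(), segs[i].data.end(),
                  merged.data.begin() + (segs[i].offset - start));
      }
      segs.resize(first);
      segs.push_back(merged);
    }

    int main() {
      std::vector<Seg> segs = {{0, {1}}, {10, {2}}, {20, {3}}};
      mergeTail(segs, 2); // pretend the platform only allows 2 segments
      std::cout << segs.size() << " segments, last covers "
                << segs.back().data.size() << " bytes\n"; // 2 segments, 11 bytes
    }
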
* Passive segments (#1976)Thomas Lively2019-04-051-2/+2
| | | | | Adds support for the bulk memory proposal's passive segments. Uses a new (data passive ...) s-expression syntax to mark sections as passive.
* Stack IR (#1623)Alon Zakai2018-07-301-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a new IR, "Stack IR". This represents wasm at a very low level, as a simple stream of instructions, basically the same as wasm's binary format. This is unlike Binaryen IR, which is structured and in a tree format. This gives some small wins on binary sizes, less than 1% in most cases, usually 0.25-0.50% or so. That's not much by itself, but looking forward this prepares us for multi-value, which we really need an IR like this to be able to optimize well. Also, it's possible there is more we can do already - currently there are just a few stack IR optimizations implemented: DCE; local2stack - check if a set_local/get_local pair can be removed, which keeps the set's value on the stack, which, if the stars align, can be popped instead of the get; block removal - remove any blocks with no branches, as they are valid in wasm binary format. Implementation-wise, the IR is defined in wasm-stack.h. A new StackInst is defined, representing a single instruction. Most are simple reflections of Binaryen IR (an add, a load, etc.), and just pointers to them. Control flow constructs are expanded into multiple instructions, like a block turns into a block begin and end, and we may also emit extra unreachables to handle the fact that Binaryen IR has unreachable blocks/ifs/loops but wasm does not. Overall, all the Binaryen IR differences with wasm vanish on the way to stack IR. Where this IR lives: each Function now has a unique_ptr to stack IR, that is, a function may have stack IR alongside the main IR. If the stack IR is present, we write it out during binary writing; if not, we do the same binaryen IR => wasm binary process as before (this PR should not affect speed there). This design lets us use normal Passes on stack IR; in particular this PR defines 3 passes: generate stack IR, optimize stack IR (might be worth splitting out into separate passes eventually), and print stack IR for debugging purposes. Having these as normal passes is convenient as then they can run in parallel across functions and all the other conveniences of our current Pass system. However, a downside of keeping the second IR as an option on Functions, and using normal Passes to operate on it, is that we may get out of sync: if you generate stack IR, then modify binaryen IR, then the stack IR may no longer be valid (for example, maybe you removed locals or modified instructions in place etc.). To avoid that, Passes now define if they modify Binaryen IR or not; if they do, we throw away the stack IR. Miscellaneous notes: just writing Stack IR, then writing to binary - no optimizations - is 20% slower than going directly to binary, which is one reason why we still support direct writing. This does lead to some "fun" C++ template code to make that convenient: there is a single StackWriter class, templated over the "mode", which is either Binaryen2Binary (direct writing), Binaryen2Stack, or Stack2Binary. This avoids a lot of boilerplate as the 3 modes share a lot of code in overlapping ways. Stack IR does not support source maps / debug info. We just don't use that IR if debug info is present. A tiny text format comment (if emitting non-minified text) indicates stack IR is present, if it is ((; has Stack IR ;)). This may help with debugging, just in case people forget. There is also a pass to print out the stack IR for debug purposes, as mentioned above. The sieve binaryen.js test was actually not validating all along - these new opts broke it in a more noticeable manner. Fixed. Added extra checks in pass-debug mode, to verify that if stack IR should have been thrown out, it was. This should help avoid any confusion with the IR being invalid. Added a comment about the possible future of stack IR as the main IR, depending on optimization results, following some discussion earlier today.
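
A small standalone sketch of the "flat stream" shape described above, with toy types rather than the real wasm-stack.h: structured constructs become explicit begin/end markers in a linear list of instructions.

    #include <iostream>
    #include <string>
    #include <vector>

    // Toy stack-IR instruction: either a plain operation or a control-flow
    // boundary marker that a structured construct was expanded into.
    struct StackInst {
      enum Kind { Op, BlockBegin, BlockEnd } kind;
      std::string name; // e.g. "local.get 0", "i32.add"
    };

    int main() {
      // (block (i32.add (local.get 0) (local.get 1))) flattened into a stream:
      std::vector<StackInst> stream = {
          {StackInst::BlockBegin, "block"},
          {StackInst::Op, "local.get 0"},
          {StackInst::Op, "local.get 1"},
          {StackInst::Op, "i32.add"},
          {StackInst::BlockEnd, "end"},
      };
      for (auto& inst : stream) std::cout << inst.name << "\n";
    }
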