forks/binaryen.git -

	Commit message	Author	Age	Files	Lines
*	[Wasm GC] Fix GlobalStructInference on unrefined globals (#5338)	Alon Zakai	2022-12-12	1	-2/+20
\| \| \| \| \| \| \|	If a global's type is not fully refined, then when --gsi replaces a reference with a global.get, we end up with a type that might not be good enough. For example, if the type is any then it is not a subtype of eq and we can't do ref.eq on it, which this pass requires. We also can't just do struct.get on it if it is a too-distant parent or such.
*	Add Atomics support to Multi-Memory Lowering Pass (#5339)	Ashley Nelson	2022-12-12	1	-0/+41
\| \| \| \| \|	This PR adds support for Atomic instructions in the multi-memory lowering pass. It also includes optional bounds checks per the wasm spec guidelines (visitAtomicRMW, visitAtomicCmpxchg, visitAtomicWait, visitAtomicNotify). Note: The latter two instructions, memory.atomic.wait and memory.atomic.notify, have browser engine implementations that predate the still-in-progress threads spec, and whether or not atomic.notify should trap for out-of-bounds addresses remains an open issue. For now, this PR is using the same semantics as v8, which is to bounds check all Atomic instructions the same way and trap for out-of-bounds.
*	Add SIMD support to Multi-Memory Lowering Pass (#5336)	Ashley Nelson	2022-12-12	1	-6/+34
\| \| \|	This PR adds support for SIMD instructions in the multi-memory lowering pass. It also includes optional bounds checks per the wasm spec guidelines (SIMDLoad, SIMDLoadSplat, SIMDLoadExtend, SIMDLoadZero, SIMDLoadStoreLane load and store).
*	Adds bounds checks to Load/Store in Multi-Memories Lowering Pass (#5256)	Ashley Nelson	2022-12-09	3	-70/+116
\| \| \|	Per the wasm spec guidelines for Load (rule 10) & Store (rule 12), this PR adds an option for bounds checking, producing a runtime error if the instruction exceeds the bounds of the particular memory within the combined memory.
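
The bounds check described in this commit can be modelled outside of wasm. The following is a minimal Python sketch, not the pass's actual code: it assumes each original memory occupies a contiguous region of the combined memory, and the names `Trap`, `lower_address`, `mem_base`, and `mem_size` are hypothetical.

```python
class Trap(Exception):
    """Stands in for a wasm runtime trap (hypothetical name)."""


def lower_address(addr, static_offset, access_bytes, mem_base, mem_size):
    """Translate an address in one original memory into the combined memory.

    mem_base: where the original memory starts inside the combined memory.
    mem_size: size in bytes of the original memory (assumed layout).
    """
    # The access covers [addr + static_offset, addr + static_offset + access_bytes).
    end = addr + static_offset + access_bytes
    if end > mem_size:
        # Without this check the access would silently spill into the
        # region belonging to the next memory in the combined memory.
        raise Trap("out-of-bounds access in lowered memory")
    return mem_base + addr + static_offset
```

For example, with a second memory of one 64 KiB page placed at base 65536, a 4-byte load at address 65532 maps to 131068, while the same load at address 65533 traps instead of reading past the end of that memory.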
*	Use non-nullable ref.cast for non-nullable input (#5335)	Thomas Lively	2022-12-09	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \|	We switched from emitting the legacy `ref.cast_static` instruction to emitting `ref.cast null` in #5331, but that wasn't quite correct. The legacy instruction had polymorphic typing so that its output type was nullable if and only if its input type was nullable. In contrast, `ref.cast null` always has a nullable output type. Fix our output by instead emitting non-nullable `ref.cast` if the output should be non-nullable. Parse `ref.cast` in binary and text forms as well. Since the IR can only represent the legacy polymorphic semantics, disallow unsupported casts from nullable to non-nullable references or vice versa for now.
*	Add standard versions of WasmGC casts (#5331)	Thomas Lively	2022-12-07	1	-5/+5
\| \| \| \| \| \| \|	We previously supported only the non-standard cast instructions introduced when we were experimenting with nominal types. Parse the names and opcodes of their standard counterparts and switch to emitting the standard names and opcodes. Port all of the tests to use the standard instructions, but add additional tests showing that the non-standard versions are still parsed correctly.
*	[Wasm GC] Add array support to TypeMerging (#5329)	Alon Zakai	2022-12-07	1	-15/+14
\|
*	[Wasm GC] Add TypeMerging pass (#5321)	Alon Zakai	2022-12-07	5	-0/+274
\| \| \| \| \| \| \| \|	This finds types that can be merged into their super: types that add no fields, and are not used in casts, etc. - so we might as well use the super. This complements TypeSSA, in that it can merge back the new types that TypeSSA created, if we never found a use for them. Without this, TypeSSA can bloat binary size quite a lot (I see 10-20%).
*	[Wasm GC] Add array support to TypeSSA (#5327)	Alon Zakai	2022-12-07	1	-37/+79
\| \| \|	Previously it only handled structs.
*	Fix Asyncify assertions after #5293 (#5328)	Alon Zakai	2022-12-07	1	-4/+49
\| \| \| \| \| \| \| \| \| \| \|	Followup to #5293, this fixes a small regression there regarding assertions. We do have a need to visit non-instrumented functions if we want assertions, as we assert on some things there, namely that such functions do not change the state (if they changed it, we'd need to instrument them to handle that properly). This moves that logic into a new pass. We run that pass when assertions are enabled. Test diff basically undoes part of the test diff from that earlier PR for that one file.
*	Fix an Inlining bug with a name collision in a br nested in a call param (#5323)	Alon Zakai	2022-12-06	1	-5/+19
\|
*	Optimize Asyncify to not flatten/optimize unnecessarily (#5293)	Alexander Guryanov	2022-12-06	1	-2/+50
\| \| \| \| \| \| \| \| \|	Add a way to proxy passes and the addition of passes in pass runners. With that we can make Asyncify only modify functions it actually needs to. On a project that Asyncify only needs to modify a few functions on, this can save a huge amount of time as it avoids flattening+optimizing the majority of the module. Fixes #4822
*	[Wasm GC] Add TypeSSA pass (#5299)	Alon Zakai	2022-12-02	4	-0/+280
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This creates new nominal types for each (interesting) struct.new. That then allows type-based optimizations to be more precise, as those optimizations will track separate info for each struct.new, in effect. That is kind of like SSA, however, we do not handle merges. For example: x = struct.new $A (5); print(x.value); y = struct.new $A (11); print(y.value); => x = struct.new $A.x (5); print(x.value); y = struct.new $A.y (11); print(y.value); After the pass runs, each of those struct.new creates a unique type, and type-based analysis can see that 5 or 11 are the only values written in that type (if nothing else writes there). This bloats the type section with the new subtypes, so it is best used with a pass to merge unneeded duplicate types, which a later PR will add. That later PR will exactly merge back in the types created here, which are nominally different but indistinguishable otherwise. This pass is not enabled by default. It's not clear yet where the best place to run it is, as it must be balanced by type merging, but it might be better to do multiple rounds of optimization between the two. Needs more investigation.
*	Use C++17's [[maybe_unused]]. NFC (#5309)	Sam Clegg	2022-12-02	3	-14/+7
\|
*	Do not special case ref.null in `LUBFinder` (#5307)	Thomas Lively	2022-12-01	5	-38/+21
\| \| \| \| \| \| \| \| \| \| \| \|	Before we implemented bottom heap types, `ref.null` had to be annotated with specific types. The `LUBFinder` utility ignored these types so that it could find the best LUB from all considered non-null expressions, then go back and update the type annotations on the nulls to match that LUB. Now that we have bottom types, however, none of that is necessary, and in fact ignoring nulls can miss possible refinements to bottom types. Update and simplify `LUBFinder` so that it is a simple wrapper around the underlying `Type::getLeastUpperBound` utility with no additional logic. Update tests to account for the more powerful optimizations.
*	[Wasm GC] Implement closed-world flag (#5303)	Alon Zakai	2022-11-30	2	-13/+36
\| \| \| \| \| \| \| \| \| \| \| \| \|	With this change we default to an open world, that is, we do the safe thing by default: we no longer assume a closed world. Users that want a closed world must pass --closed-world. Atm we just do not run passes that assume a closed world. (We might later refine them to find which types don't escape and only optimize those.) The RemoveUnusedModuleElements is an exception in that the closed-world flag influences one part of its operation, but not the rest. Fixes #5292
*	[NFC] Avoid unneeded work in GTO (#5304)	Alon Zakai	2022-11-30	1	-1/+3
\| \| \| \| \|	As noticed in #5303, the test changes here are because we did unnecessary work which created a new rec group, which then led to a rec group being printed out.
*	[NFC] Add a TODO in ReorderFunctions (#5205)	Alon Zakai	2022-11-30	1	-0/+2
\|
*	Update comment in OptimizeAddedConstants.cpp (#5283)	Alon Zakai	2022-11-29	1	-2/+2
\|
*	Fix validation and inlining bugs (#5301)	Thomas Lively	2022-11-29	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Inlining had a bug where it gave return_calls in inlined callees concrete types even when they should have remained unreachable. This bug flew under the radar because validation had a bug where it allowed expressions to have concrete types when they should have been unreachable. The fuzzer found this bug by adding another pass after inlining where the unexpected types caused an assertion failure. Fix the bugs and add a test that would have triggered the inlining bug. Unfortunately the test would have also passed before this change due to the validation bug, but it's better than nothing. Fixes #5294.
*	[NFC] Print type names in BINARYEN_PRINT_FULL mode (#5296)	Alon Zakai	2022-11-29	1	-3/+28
\|
*	Remove equirecursive typing (#5240)	Thomas Lively	2022-11-23	9	-38/+2
\| \| \| \|	Equirecursive typing is no longer standards track and its implementation is extremely complex. Remove it.
*	Change the default type system to isorecursive (#5239)	Thomas Lively	2022-11-23	1	-9/+6
\| \| \| \| \| \| \| \| \| \|	This makes Binaryen's default type system match the WasmGC spec. Update the way type definitions without supertypes are printed to reduce the output diff for MVP tests that do not involve WasmGC. Also port some type-builder.cpp tests from test/example to test/gtest since they needed to be rewritten to work with isorecursive types anyway. A follow-on PR will remove equirecursive types completely.
*	Dump only the binary in pass-debug mode (#5290)	Alon Zakai	2022-11-22	1	-4/+3
\| \| \| \| \|	Dumping the text is nice sometimes, but on huge testcases the wat can be 1 GB in size (!), and so dumping one per pass can lead to using 20 GB or so for the full optimization pipeline. Emit just the binary to avoid that.
*	[Wasm GC] Fix CoalesceLocals on tees that receive a refined type (#5289)	Alon Zakai	2022-11-22	1	-3/+22
\| \| \|	Same testcase as in #5287 but in another pass.
*	Rename UserSection -> CustomSection. NFC (#5288)	Sam Clegg	2022-11-22	2	-11/+11
\| \| \|	This reflects the naming used in the spec.
*	Code Pushing: Ignore unreachable sets (#5284)	Alon Zakai	2022-11-21	1	-0/+20
\| \| \| \| \|	Normally we ignore them anyhow (unreachability is an effect, either a trap or a control flow switch), but in traps-never-happen mode we can ignore a trap, so we need to check this manually.
*	Add post-emscripten-side-module pass argument (#5274)	Sam Clegg	2022-11-18	1	-3/+10
\| \| \| \| \| \| \|	In this mode we don't remove the start/stop_em_asm symbols or data. This is because with side modules we read this information directly from the wasm binary at runtime. See https://github.com/emscripten-core/emscripten/pull/18228
*	Add `hasArgument` helper to pass options. NFC (#5278)	Sam Clegg	2022-11-17	3	-15/+8
\|
*	Fix inverted logic bug with asyncify-ignore-indirect (#5275)	Sam Clegg	2022-11-17	1	-4/+4
\|
*	[Wasm GC] Start an OptimizeCasts pass and reuse cast values there (#5263)	Alon Zakai	2022-11-17	5	-0/+237
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(some.operation (ref.cast .. (local.get $ref)) (local.get $ref) ) => (some.operation (local.tee $temp (ref.cast .. (local.get $ref)) ) (local.get $temp) ) This can help cases where we cast for some reason but happen to not use the cast value in all places. This occurs in j2wasm in itable calls sometimes: The this pointer is refined, but the itable call may be done with an unrefined pointer, which is less optimizable. So far this is just inside basic blocks, but that is enough for the case of itable calls and other common patterns I see.
*	GlobalStructInference: Handle the case of just 1 value (#5259)	Alon Zakai	2022-11-15	1	-9/+11
\| \| \| \| \| \| \| \| \| \| \| \|	#5253 handled the case of just one possible global. It is also possible we have multiple globals but just one value. This handles that case. (It slightly overlaps with other passes, but as this pass actually identifies the creations of the objects in globals, it has a guarantee of success that the others don't, and it is very easy to just do given all the work done to handle the case of 2 values). Also fix a minor bug in #5253 - we need to trap if the old reference was null. That is, we know the reference must point to the only object ever created of that type, but that is only if it is not null; if it's null we need to trap.
*	Switch from `typedef` to `using` in C++ code. NFC (#5258)	Sam Clegg	2022-11-15	12	-12/+12
\| \| \| \|	This is more modern and (IMHO) easier to read than that old C typedef syntax.
*	GlobalStructInference: Handle cases with just 1 global too (#5253)	Alon Zakai	2022-11-15	1	-6/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Expand GlobalStructInference to operate on cases with a single possible global, and not just 2 or more. Even the case of a single global is useful, it turns out, as we can alter the reference in places like this: (struct.get $type 0 (..ref..) ) No matter what ref is, if there is a single global it must refer to, we can switch to this: (struct.get $type 0 (global.get $global) ) That can unlock further opts later. Note that we can do this even if we don't know what the value actually is - we may not know what the struct.get returns, but we do know what it reads from.
*	Add a pass to lower sign-ext operations to MVP (#5254)	Alon Zakai	2022-11-15	4	-0/+80
\| \| \| \|	Fixes #5250
*	Fix a trivial CodePushing bug with looking at the wrong index (#5252)	Alon Zakai	2022-11-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Pretty simple logic bug, but it ended up causing us to not optimize sometimes. Sadly the original tests happened to not have anything that depended on the index in isolation. Fix + add comprehensive tests for using that index properly. Also test the call.without.effects intrinsic, which is orthogonal to this, but also worth testing as it is a big use case here.
*	Fix arithmetic in interpretation of ArrayNewSeg (#5251)	Thomas Lively	2022-11-14	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \|	The offset and size were previously being sign extended from 32 to 64 bits, which meant that negative sizes could make the bounds check pass and cause an exception to be thrown by an overly large allocation. Switch to using uint64_t from the start rather than mixing sizes and signs, and update the tests to reproduce the error more robustly in the absence of the fix. Also fix a bug in RemoveUnusedModuleElements triggered by the new test. Fixes #5249.
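
To make the arithmetic problem concrete, here is a small Python model of the two ways of widening 32-bit values before a bounds check. It is an illustrative sketch only: the function names are hypothetical and the real interpreter code is C++, but the failure mode is the one described in this commit.

```python
MASK32 = 0xFFFFFFFF


def sign_extend_32_to_64(bits):
    """Interpret the low 32 bits as a signed value (the old, buggy widening)."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits


def zero_extend_32_to_64(bits):
    """Interpret the low 32 bits as an unsigned value (like uint64_t from the start)."""
    return bits & MASK32


def in_bounds_sign_extended(offset_bits, size_bits, segment_len):
    # A size of 0xFFFFFFFF becomes -1 here, so offset + size shrinks
    # and the check passes even though the request is enormous.
    offset = sign_extend_32_to_64(offset_bits)
    size = sign_extend_32_to_64(size_bits)
    return offset + size <= segment_len


def in_bounds_unsigned(offset_bits, size_bits, segment_len):
    # Treating both values as unsigned keeps the sum large, so the
    # check correctly rejects the request.
    offset = zero_extend_32_to_64(offset_bits)
    size = zero_extend_32_to_64(size_bits)
    return offset + size <= segment_len
```

With a 10-byte segment, `in_bounds_sign_extended(0, 0xFFFFFFFF, 10)` accepts the request, whereas `in_bounds_unsigned(0, 0xFFFFFFFF, 10)` rejects it; ordinary requests such as offset 2, size 8 pass both checks.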
*	[Wasm GC] Add Monomorphize pass (#5238)	Alon Zakai	2022-11-11	4	-0/+254
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Monomorphization finds cases where we send more refined types to a function than it declares. In such cases we can copy the function and refine the parameters: // B is a subtype of A foo(new B()); function foo(x : A) { ..} => foo_B(new B()); // call redirected to refined copy function foo(x : A) { ..} // unchanged function foo_B(x : B) { ..} // refined copy This increases code size so it may not be worth it in all cases. This initial PR is hopefully enough to start experimenting with this on performance, and so it does not enable the pass by default. This adds two variations of monomorphization, one that always does it, and the default which is "careful": it sees whether monomorphizing lets the refined function actually be better than the original (say, by removing a cast). If there is no improvement then we do not make any changes. This saves a significant amount of code size - on j2wasm the careful version increases by 13% instead of 20% - but it does run more slowly obviously.
*	Handles memory.grow failure in MultiMemoryLowering Pass (#5241)	Ashley Nelson	2022-11-11	1	-4/+9
\| \| \|	Per the wasm spec, memory.grow instructions should return -1 when there is a failure to allocate enough memory. This PR adds support for returning this error code.
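
As a reminder of the semantics being matched, memory.grow returns the previous size in pages on success and -1 on failure. A minimal Python model follows; the function name and the single `max_pages` limit are assumptions for illustration, and the sketch does not model how the lowering pass rearranges the combined memory.

```python
def memory_grow(current_pages, delta_pages, max_pages):
    """Model of memory.grow: returns (new_size_in_pages, result_value)."""
    new_pages = current_pages + delta_pages
    if new_pages > max_pages:
        # Failure: the size is unchanged and the instruction yields -1.
        return current_pages, -1
    # Success: the instruction yields the size before growing.
    return new_pages, current_pages
```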
*	Fix a fuzz bug with incremental unreachability in OptimizeInstructions (#5237)	Alon Zakai	2022-11-09	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \|	OptimizeInstructions in rare cases can add unreachability. We propagate it out at the end all at once. The fuzzer was smart enough to find a very special combination of code + passes that can hit an issue, see the testcase. As mentioned in the TODO, we should perhaps avoid adding unreachability in OptimizeInstructions at all. If this happens again that might be worth the effort. But also checking the type of the child as in this PR doesn't add much complexity in the code.
*	Update MemoryPacking for array.new_data (#5229)	Thomas Lively	2022-11-08	1	-26/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The MemoryPacking pass looks at all instructions that reference memory segments to determine how they can be optimized. #5214 introduced a new instruction that references memory segments, array.new_data, but did not update MemoryPacking accordingly. This omission meant that MemoryPacking could produce invalid or misoptimized modules in the presence of array.new_data. Fix the problem by making MemoryPacking aware of array.new_data. Consider array.new_data when determining whether a segment is used and update array.new_data to reflect the new, optimized segment numberings afterward. To keep things simple, do not try to split any segment that is referred to by an array.new_data instruction. Also add explanations to the tests.
*	Add arguments to control which imports/exports are JSPI'd. (#5217)	Brendan Dahl	2022-11-08	1	-6/+54
\| \| \| \| \| \| \| \| \| \|	Instead of automatically determining which exports will be async, they will be explicitly set by the user. We'll rely on the runtime trapping if they are incorrectly set. Two new arguments behave similarly to asyncify-imports: jspi-imports and jspi-exports.
*	Implement `array.new_data` and `array.new_elem` (#5214)	Thomas Lively	2022-11-07	2	-1/+41
\| \| \| \| \| \| \| \| \|	In order to test them, fix the binary and text parsers to accept passive data segments even if a module has no memory. In addition to parsing and emitting the new instructions, also implement their validation and interpretation. Test the interpretation directly with wasm-shell tests adapted from the upstream spec tests. Running the upstream spec tests directly would require fixing too many bugs in the legacy text parser, so it will have to wait for the new text parser to be ready.
*	Multi-Memories Asyncify (#5222)	Ashley Nelson	2022-11-07	1	-41/+78
\| \| \|	Adds support for the Asyncify pass to use Multi-Memories. This is specified by passing the flag --asyncify-in-secondary-memory. Another flag, --asyncify-secondary-memory-size, is used to specify the initial and max size of the secondary memory.
*	[Wasm GC] RSE: Switch local.get to use a more refined type when possible (#5216)	Alon Zakai	2022-11-04	1	-28/+92
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to #5194 but for RedundantSetElimination. This has similar benefits in terms of using a more refined local in hopes of avoiding casts in followup opts, but unlike SimplifyLocals this will operate across basic blocks. To do this, we need to track not just local.set but also local.get in that pass. Then in each basic block we can track the equivalent locals and pick from them. I see a few dozen casts removed in the J2Wasm binary. Often stuff like this happens: y = cast(x); if (..) { foo(x); // this could use y }
*	[Wasm GC] Fix GUFA on externalize/internalize (#5220)	Alon Zakai	2022-11-04	1	-1/+1
\| \| \| \| \| \| \|	These operations emit a completely different type than their input, so they must be marked as roots, and not as things that flow values through them (because then we filter everything out as the types are not compatible). Fixes #5219
*	RedundantSetElimination: Look at fallthrough values (#5213)	Alon Zakai	2022-11-03	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \|	This can help in rare cases in MVP wasm, say for the return value of a block. But for wasm GC it is very important due to casts. Similar logic was added as part of #5194 for SimplifyLocals. It should probably have been in a separate PR then. This does the right thing for RedundantSetElimination, as a separate PR. Full tests will appear in that later PR (it is not really possible to test the GC side yet - we need the logic in the later PR that actually switches to a more refined local index when available).
*	SimplifyLocals: Fix handling of subtyping (#5210)	Alon Zakai	2022-11-02	1	-16/+20
\| \| \| \| \| \| \|	We just checked if the new type we prefer (when switching a local to a more refined one in #5194) is different than the old type. But that check at the end must check it is a subtype as well. Diff without whitespace is smaller.
*	ReorderGlobals pass (#4904)	Alon Zakai	2022-11-02	4	-1/+177
\| \| \| \| \| \| \| \| \|	This sorts globals by their usage (and respecting dependencies). If the module has very many globals then using smaller LEBs can matter. If there are fewer than 128 globals then we cannot reduce size, and the pass exits early (so this pass will not slow down MVP builds, which usually have just 1 global, the stack pointer). But with wasm GC it is common to use globals for vtables etc., and often there is a very large number of them.
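
The 128-global threshold follows from unsigned LEB128 encoding, where indices 0 to 127 fit in one byte. The Python sketch below illustrates the size effect under simplifying assumptions: it sorts purely by use count and ignores the dependency constraints the pass respects, and the helper names and the example module are hypothetical.

```python
def uleb128_size(value):
    """Number of bytes needed to encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 encodes non-negative integers only")
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


def total_index_bytes(use_counts):
    """Bytes spent on global indices if global i is referenced use_counts[i] times."""
    return sum(uleb128_size(i) * n for i, n in enumerate(use_counts))


# Hypothetical module: 200 globals, where only the last one is heavily used.
counts = [1] * 199 + [1000]
# Placing the most-used globals first gives them the one-byte indices.
by_usage = sorted(counts, reverse=True)

before = total_index_bytes(counts)
after = total_index_bytes(by_usage)
```

Here `before` exceeds `after` because the hot global moves from a two-byte index to a one-byte one. With fewer than 128 globals every index already takes a single byte, so no ordering can help.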
*	[Wasm GC] SimplifyLocals: Switch local.get to use a more refined type when possible (#5194)	Alon Zakai	2022-11-01	1	-16/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(local.set $refined (cast (local.get $plain))) .. .. (local.get $plain) .. ;; we can change this to read from $refined By using the more refined type we may be able to eliminate casts later. To do this, look at the fallthrough value (so we can look through a cast or a block value - this is the reason for the small wasm2js improvements in tests), and also extend the code that picks which local index to read to look at types (previously we just ignored any pairs of locals with different types).