forks/binaryen.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	Fix opt/shrink levels when running the optimizer multiple times, Part 2 (#5787)	Alon Zakai	2023-06-27	2	-8/+45
	This is a followup to #5333. That fixed the selection of which passes to run, but forgot to also fix the global state of the current optimize/shrink levels. This PR fixes that. As a result, running -O3 -Oz will now work as expected: the first -O3 will run the right passes (as #5333 fixed) and, while running them, the global optimize/shrinkLevels will be -O3 (and not -Oz), which this PR fixes. A specific result of this is that -O3 -Oz used to inline less, since the invocation of inlining during -O3 thought we were optimizing for size. The new test verifies that we do fully inline in the first -O3 now.
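A minimal sketch of the idea, as a toy Python model rather than Binaryen's actual C++ code. The names `Options`, `inlining_mode`, and `run_invocations` are hypothetical, and the inlining heuristic is an assumption made only to show why the levels seen during each invocation matter:

```python
from dataclasses import dataclass


@dataclass
class Options:
    # Global state, left at the levels of the last flag parsed.
    optimize_level: int = 0
    shrink_level: int = 0


def inlining_mode(options):
    # Hypothetical heuristic: inline conservatively when optimizing for size.
    return "conservative" if options.shrink_level > 0 else "full"


def run_invocations(invocations, options):
    """Run each (optimize_level, shrink_level) invocation in order.

    While an invocation runs, the global levels are set to that
    invocation's own levels, and they are restored afterwards.
    """
    observed = []
    for optimize_level, shrink_level in invocations:
        saved = (options.optimize_level, options.shrink_level)
        options.optimize_level = optimize_level
        options.shrink_level = shrink_level
        try:
            observed.append(inlining_mode(options))
        finally:
            options.optimize_level, options.shrink_level = saved
    return observed
```

With -O3 modeled as levels (3, 0) and -Oz as (2, 2), the first invocation sees a shrink level of 0 and so inlines fully in this model, even though the global state ends up at the -Oz levels.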
*	PostEmscripten: Preserve __em_js__ exports in side modules (#5780)	Sam Clegg	2023-06-23	2	-1/+49
*	[EH] Add pass to remove EH instructions (#5770)	Heejin Ahn	2023-06-15	1	-0/+117
	This pass strips all EH stuff, including EH instructions and tags, from the input module and disables the EH feature in the features section.
	1. This removes `catch` and `catch_all` blocks from the code. So
	```wast
	(try
	  (do
	    (some code)
	  )
	  (catch
	    ...
	  )
	)
	```
	becomes just `(some code)`. Note that all `rethrow`s will be removed with `catch`es.
	2. This converts `throw (...)` into `unreachable`.
	3. This removes all tags from the module, which are unused anyway after 1 and 2.
	4. This removes the exception handling feature from the features section.
	You can use the pass with
	```console
	$ wasm-opt --enable-exception-handling --strip-eh INPUT -o OUTPUT
	```
	This is not an optimization pass, so it is not run unless you specify the pass explicitly. This is in effect similar to Clang's `-fignore-exceptions`, in which you can throw but it will result in a crash and we compile away all landing pads. This can be used by people who don't (or can't) use `-fignore-exceptions` in their build settings or who want to compile away `catch` blocks later. Closes emscripten-core/emscripten#19585.
*	EffectAnalyzer: Assume we execute the two things whose effects we compare (#5764)	Alon Zakai	2023-06-13	2	-0/+145
	EffectAnalyzer::canReorder/invalidate now assume that the things from which we generated the effects both execute (or, rather, that if the first of them doesn't transfer control flow then they execute). If they both execute then we can do more work in TrapsNeverHappen mode, since we can then reorder this for example: (global.set ..) (i32.load ..) The load may trap, but in TNH mode we assume it won't. So we can reorder those two. However, if they did not both execute then we could be in this situation: (global.set ..) (br_if ..) (i32.load) Reordering the load and the set here would be invalid, because we could make the load execute when it didn't execute before, and it could now start to actually trap at runtime. This new assumption seems obvious, since we don't compare the effects of things unless they are adjacent and with no control flow between them - otherwise, why compare them? To be sure, I manually reviewed every single use of EffectAnalyzer::canReorder/invalidate in the entire codebase. I've also been fuzzing this for several days (hundreds of thousands of iterations), and have not seen any problem. This was motivated by seeing that #5744 should be able to do more work in TNH mode, but it wasn't. New tests show the benefits there in OptimizeCasts as well as in SimplifyLocals.
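The reordering rule can be sketched with a toy effect model. This is an illustrative Python sketch under simplifying assumptions, not Binaryen's `EffectAnalyzer`: the `Effects` fields and `can_reorder` are hypothetical names, and only global accesses, traps, and control-flow transfers are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Effects:
    writes_global: bool = False
    reads_global: bool = False
    may_trap: bool = False
    transfers_control: bool = False


def can_reorder(first, second, traps_never_happen):
    """Toy check for swapping two adjacent items that both execute."""
    # Nothing may move across something that can branch away.
    if first.transfers_control or second.transfers_control:
        return False
    # Conflicting accesses to global state must keep their order.
    if first.writes_global and (second.reads_global or second.writes_global):
        return False
    if second.writes_global and first.reads_global:
        return False
    # A possible trap is ordered against a global write, unless we
    # assume traps never happen (TNH).
    if not traps_never_happen:
        if first.may_trap and second.writes_global:
            return False
        if second.may_trap and first.writes_global:
            return False
    return True
```

In this model a `global.set` followed by an `i32.load` can be swapped only in TNH mode, while a `br_if` in between blocks the swap in either mode because the load cannot move across it.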
*	DeadArgumentElimination: Do not error on bottom types in result refining (#5763)	Alon Zakai	2023-06-12	1	-0/+36
	More generally, the LUB computation that code relies on did not handle bottom types properly.
*	ConstantFieldPropagation: Track copied values properly (#5761)	Alon Zakai	2023-06-12	1	-0/+70
	The logic ignored copied values, which was fine for struct.get operations but not for struct.new.
*	Update br_on_cast binary and text format (#5762)	Thomas Lively	2023-06-12	7	-36/+36
	The final versions of the br_on_cast and br_on_cast_fail instructions have two reference type annotations: one for the input type and one for the cast target type. In the binary format, this is represented as a flags byte followed by two encoded heap types. Upgrade all of the tests at once to use the new versions of the instructions and drop support for the old instructions from the text parser. Keep support in the binary parser to avoid breaking users, though. Drop some binary tests of deprecated instruction encodings that would be more effort to update than they're worth. Re-lands #5734 with fixes.
*	TypeRefining: Fix a bug with chains of StructGets (#5757)	Alon Zakai	2023-06-08	1	-0/+55
	If we have (struct.get $A (struct.get $B, then if both types end up refined we may have a problem. If the inner one is refined to emit nullref then the outer one no longer knows what type it is, since it depends on the type of the ref child for that in our IR. We can't just skip updating it, as the outside may depend on its new refined type to validate. To avoid errors here, just make this code that is effectively unreachable also actually unreachable.
*	Move casts which are immediate children of local.gets to earlier local.gets (#5744)	Bruce He	2023-06-06	1	-7/+842
	In the OptimizeCasts pass, it is useful to move more refined casts as early as possible without causing side effects. This will allow such casts to potentially trap earlier, and will allow the OptimizeCasts pass to use more refined casts earlier. This change allows a more refined cast to be duplicated at an earlier local.get expression. The later instance of the cast will then be eliminated in a later optimization pass. For example, if we have the following instructions: (drop (local.get $x)) (drop (ref.cast $A (local.get $x))) (drop (ref.cast $B (local.get $x))) where $B is a subclass of $A, we can convert this to: (drop (ref.cast $B (local.get $x))) (drop (ref.cast $A (local.get $x))) (drop (ref.cast $B (local.get $x))) Concretely, we will save the first cast to a local and use it in the other local.gets.
*	StackIR: Remove nops (#5746)	Alon Zakai	2023-05-30	2	-6/+8
	No nop instruction is necessary in wasm, so in StackIR we can simply remove them all. Fixes #5745
*	Revert "Update br_on_cast binary and text format (#5734)" (#5740)	Alon Zakai	2023-05-23	7	-36/+36
	This reverts commit b7b1d0df29df14634d2c680d1d2c351b624b4fbb. See comment at the end of #5734: It turns out that dropping the old opcodes causes problems for current users, so let's revert this for now, and later we can figure out how best to do the update.
*	TypeSSA: Handle collisions by adding a hash to ensure a fresh rec group (#5724)	Alon Zakai	2023-05-19	1	-0/+33
	Fixes #5720
*	Update br_on_cast binary and text format (#5734)	Thomas Lively	2023-05-19	7	-36/+36
	The final versions of the br_on_cast and br_on_cast_fail instructions have two reference type annotations: one for the input type and one for the cast target type. In the binary format, this is represented as a flags byte followed by two encoded heap types. Since these instructions have been in flux for a while, do not attempt to maintain backward compatibility with older versions of the instructions. Instead, upgrade all of the tests at once to use the new versions of the instructions. Drop some binary tests of deprecated instruction encodings that would be more effort to update than they're worth.
*	Vacuum code leading up to a trap in TrapsNeverHappen mode (#5228)	Alon Zakai	2023-05-17	2	-2/+436
	This adds two rules to vacuum in TNH mode: if (..) trap() => if (..) {}, and { stuff, trap() } => {}. That is, we assume traps never happen so an if will not branch to one, and code right before a trap can be assumed to not execute. Together, we should be removing practically all possible code in TNH mode (though we could also add support for br_if etc.).
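The two rules can be sketched on a toy tuple-based IR. This is an assumption-laden Python illustration, not the Vacuum pass itself: the node shapes and `vacuum_tnh` are hypothetical, and the "stuff" before a trap is assumed not to branch away (the br_if case the message mentions is not handled).

```python
TRAP = ("trap",)
NOP = ("nop",)


def vacuum_tnh(node):
    """Apply the two TNH rules to a toy IR.

    Nodes are tuples: ("if", cond, body), ("block", [items]),
    ("trap",), ("nop",), or any other opaque tuple such as ("call", "f").
    """
    kind = node[0]
    if kind == "if":
        _, cond, body = node
        body = vacuum_tnh(body)
        # Rule 1: if (..) trap()  =>  if (..) {}
        # The condition is kept, since it may have side effects.
        if body == TRAP:
            return ("if", cond, NOP)
        return ("if", cond, body)
    if kind == "block":
        items = [vacuum_tnh(item) for item in node[1]]
        # Rule 2: { stuff, trap() }  =>  {}
        if TRAP in items:
            return ("block", [])
        return ("block", items)
    return node
```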
*	Print function types on function imports in the text format (#5727)	Alon Zakai	2023-05-17	33	-115/+115
	The function type should be printed there just like for non-imported functions.
*	EffectAnalyzer: Do not clear break targets before walk()/visit() (#5723)	Alon Zakai	2023-05-17	1	-0/+41
	We depend on repeated calls to walk/visit accumulating effects, so this was a bug; if we want to clear stuff then we create a new EffectAnalyzer. Removing that fixes the attached testcase. Also added a unit test.
*	[Strings] Adopt new instruction binary encoding (#5714)	Jérôme Vouillon	2023-05-12	1	-4/+64
	See WebAssembly/stringref#46. This format is already adopted by V8: https://chromium-review.googlesource.com/c/v8/v8/+/3892695. The text format is left unchanged (see #5607 for a discussion on the subject). I have also added support for string.encode_lossy_utf8 and string.encode_lossy_utf8_array (by allowing the replace policy for Binaryen's string.encode_wtf8 instruction).
*	Extend drop.h and use it in Directize (#5713)	Alon Zakai	2023-05-10	1	-75/+12
	This adds an option to ignore effects in the parent in getDroppedChildrenAndAppend. With that, this becomes usable in more places, like Directize, basically in situations where we know we can ignore effects in the parent (since we've inferred they are not needed). This lets us get rid of some boilerplate code in Directize. Diff without whitespace is a lot smaller. A large other part of the diff is a rename of curr => parent, which I think makes it more readable, as then parent/children is a clear contrast, and then the new parameter "ignore/notice parent effects" is obviously connected to "parent". The top comment in drop.cpp is removed as it just duplicated the top comment in the header drop.h. This is basically NFC but using drop.h does bring the advantage of emitting less code, see the test changes, so it is noticeable in the IR. This is a refactoring PR in preparation for a larger improvement to Directize that will also benefit from this new drop capability.
*	Gate all partial inlining behind the partial-inlining-ifs flag (#5710)	Alon Zakai	2023-05-10	1	-0/+207
	#4191 meant to do that, I think, but only did so for "pattern B". This does it for all patterns, and adds assertions. In theory this could regress code that benefits from partial inlining of "pattern A" (since this PR stops doing it by default), but I did not see a significant difference on any benchmarks, and it is easy to re-enable that behavior by doing --partial-inlining-ifs=1.
*	Add a "mayNotReturn" effect (#5711)	Alon Zakai	2023-05-10	1	-3/+10
	This changes loops from having the effect "may trap (timeout)" to having "may not return." The only noticeable difference is in TrapsNeverHappen mode, which ignores the former but not the latter. So after this PR, in TNH mode we do not optimize away an infinite loop that seems to have no other side effects. We may also use this for other things in the future, like continuations/stack switching. There are upsides and downsides to allowing the optimizer to remove infinite loops (the C and C++ communities have had interesting discussions on that topic over the years...) but it seems safer to not optimize them out for now, to let the most code work properly. If a need comes up to optimize such code, we can look at various options then (like a flag to ignore infinite loops). See discussion in #5228
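The difference between the two effects in TNH mode can be shown with a very small model. This is a hedged Python sketch: the effect names and `can_remove_if_unused` are hypothetical labels for illustration, not Binaryen identifiers.

```python
def can_remove_if_unused(effects, traps_never_happen):
    """Toy rule: unused code is removable only if no effect remains.

    `effects` is a set of strings. In TNH mode a possible trap is
    ignored, but "may_not_return" is never ignored.
    """
    remaining = set(effects)
    if traps_never_happen:
        remaining.discard("may_trap")
    return not remaining
```

Under this model, a loop tagged `{"may_trap"}` (the old modeling) would be removable in TNH mode, while one tagged `{"may_not_return"}` (the new modeling) is kept.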
*	Remove TypeUpdater in Vacuum and always ReFinalize (#5707)	Alon Zakai	2023-05-09	2	-5/+63
	TypeUpdater::remove must be called after removing a thing from the tree. If not, then we can get confused by something like this: (block $b (br $b) ) If we first call TypeUpdater::remove then we see that the block's only br is going away, so it becomes unreachable. But when we then remove the br then the block should have type none. Removing the br first from the IR, and then calling TypeUpdater::remove, is the safe way to do it. However, changing that order in Vacuum is not trivial. After looking into this, I realized that it is just simpler to remove TypeUpdater entirely. Instead, we can ReFinalize at the end unconditionally. This has the downside that we do not propagate type updates "as we go", but that should be very rare. Another downside is that TypeUpdater tracks the number of brs, which can help remove code like in the one test that regresses here (see comment there). But I'm not sure that removal was valid - Vacuum should not really be doing it, and it looks like it's related to this bug actually. Instead, we have a dedicated pass for removing unused brs - RemoveUnusedBrs - so leave things for it. This PR's benefit, aside from now handling the fuzz testcase, is that it makes the code simpler and faster. I see a 10-25% speedup on the Vacuum pass on various binaries I tested on. (Vacuum is one of our faster passes anyhow, though, so the runtime of -O1 is not much improved.) Another minor benefit might be that running ReFinalize more often can propagate type info more quickly, thanks to #5704 etc. But that is probably very minor.
*	Port vacuum_all-features test to lit (#5708)	Thomas Lively	2023-05-09	1	-0/+1300
	Do the port automatically using the port_passes_tests_to_lit.py script.
*	Fix optimizeAddedConstants on GC-introduced unreachability (#5706)	Alon Zakai	2023-05-09	1	-0/+44
*	Generate unique block names when inlining (#5697)	Alon Zakai	2023-05-05	8	-121/+121
	Each time we inline we put the contents in a block. Before, we used the same name each time we inlined the same method, and as a result had many conflicts if a function was inlined many times. With this PR we emit a different name each time. This is not 100% NFC as it does change block names, which is observable in the IR (as can be seen in the test updates). This helps #5696 in speeding up UniqueNameMapper.
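One way to picture "a different name each time" is a per-callee counter. This Python sketch is only an illustration: the class `InlineBlockNamer` and the `__inlined$callee$N` naming scheme are assumptions, not the names Binaryen actually emits.

```python
import itertools
from collections import defaultdict


class InlineBlockNamer:
    """Hand out a fresh block name for every inlining of a callee."""

    def __init__(self):
        # One independent counter per callee, each starting at 0.
        self._counters = defaultdict(itertools.count)

    def block_name(self, callee):
        index = next(self._counters[callee])
        # Hypothetical naming scheme, chosen for illustration only.
        return f"__inlined${callee}${index}"
```

Because every name is already distinct, a later uniquifying step has no collisions to resolve for these blocks.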
*	[Wasm GC] Automatically make RefCast heap types more precise (#5704)	Alon Zakai	2023-05-05	7	-16/+60
	We already did this for nullability, and so for the same reasons we should do it for heap types as well. Also, I realized that doing so would solve #5703, which is the new test added for TypeRefining here. The fuzz bug solved here is that our analysis of struct gets/sets will skip copy operations - a read from a field that is written into it. And we skip fallthrough values while doing so, since it doesn't matter if the read goes through an if arm or a cast. An if would automatically get a more precise type during refinalize, so this PR does the same for a cast basically. Fixes #5703
*	Fix DeadArgumentElimination return value opts on nesting+recursion (#5701)	Alon Zakai	2023-05-04	1	-0/+34
	The testcase here has a recursive call that is also nested in itself, something like (return (call $me (return .. This found a bug in our return value removal logic. When we remove a return value we both modify call sites (to add drops) and modify returns (to remove their values). One of those uses pointers into the IR which the other invalidated, so the order of the two matters. This PR just reorders that code to fix the bug.
*	Fallback to direct inlining if the outline will be inlined. (#5698)	Goktug Gokdogan	2023-05-04	1	-36/+535
	This avoids extra work in the edge case where: the function is too big to fully inline; it is a candidate for partial inlining; and the outlined version becomes eligible for full inlining. In such a case, Binaryen would introduce a temporary state with partially inlined functions and later on inline them. J2CL hit this scenario for String literals, which resulted in significant regressions in compilation time. This patch updates the partial inlining analysis to identify the edge case and fall back to full inlining when that happens.
*	[Wasm GC] Always refinalize in SignatureRefining (#5694)	Alon Zakai	2023-05-01	1	-0/+38
	We used to refine only for result changes, but param changes can also lead to opportunities.
*	[Wasm GC] ReFinalize when needed in SimplifyGlobals (#5682)	Alon Zakai	2023-04-20	1	-0/+33
*	[Wasm GC] Fix a trapsNeverHappen corner case with if/select of a trapping arm (#5681)	Alon Zakai	2023-04-20	1	-23/+127
	The logic says that if an if/select has an arm that returns a null type, and the if/select goes into a cast, then we can ignore that arm in TNH mode (as it would trap, and we are ignoring the possibility of a trap). But it is not enough to return a null type - the null must actually flow out, rather than, say, a return being executed first. One existing test needed adjustment, as it used calls for "thing with effects". But a call can transfer control flow when EH is enabled, and this pass has -all. Rather than mess with the features, I switched the effects to be locals.
*	Disable the memory64 feature in Memory64Lowering.cpp (#5679)	Thomas Lively	2023-04-19	1	-0/+9
	For consistency with other feature lowering passes, disable memory64 in addition to lowering its use away. Although no other passes would introduce new uses of memory64 at the moment, this makes the lowering pass more robust against a future where memory64 might accidentally be reintroduced after being lowered away. Also update test/lit/passes/memory64-lowering-features.wast. Co-authored-by: Alon Zakai <azakai@google.com>
*	Disable sign extension in SignExtLowering.cpp (#5676)	Thomas Lively	2023-04-19	1	-0/+9
	The sign extension lowering pass would previously lower away the sign extension instructions, but it wouldn't disable the sign extension feature, so follow-on passes such as optimize-instructions could reintroduce sign extension instructions. Fix the pass to disable the sign extension feature to prevent sign extension instructions from being reintroduced later. Also update the pass description.
*	[Wasm GC] Fix GUFA on array.init of a bottom type (#5675)	Alon Zakai	2023-04-19	1	-0/+29
*	[Wasm GC] OptimizeInstructions: Don't turn ref.test into unreachable immediately (#5673)	Alon Zakai	2023-04-17	1	-1/+34
	Emit an unreachable, but guarded by a block as we do in other cases in this pass, to avoid having unreachable code that is not fully propagated during the pass (as we only do a full refinalize at the end). See existing comments starting with "Make sure to emit a block with the same type as us" in the pass. This is mostly not a problem with other casts, but ref.test returns an i32, which lots of our code tries to optimize.
*	[Wasm GC] Fix SignatureRefining on a call_ref to a bottom type (#5670)	Alon Zakai	2023-04-17	1	-0/+20
	Before this PR we hit the assert on the type not being basic. We could also look into fixing the caller to skip bottom types, but as bottom types trivially have no subtypes, it is more future-facing to simply handle it.
*	Remove the --hybrid and --nominal command line options (#5669)	Thomas Lively	2023-04-14	3	-3/+3
	After this change, the only type system usable from the tools will be the standard isorecursive type system. The nominal type system is still usable via the API, but it will be removed entirely in a follow-on PR.
*	Port the remaining test/lit/passes tests off of --nominal (#5668)	Thomas Lively	2023-04-14	13	-315/+409
*	Port a few more tests off of --nominal (#5666)	Thomas Lively	2023-04-14	8	-203/+271
*	Remove --nominal from more tests (#5664)	Thomas Lively	2023-04-13	16	-4023/+83
	These tests were easy to remove --nominal from because they already worked with the standard type system as well.
*	Convert some tests off of --nominal (#5660)	Thomas Lively	2023-04-13	2	-126/+151
	In preparation to remove the nominal type system, which is nonstandard and not usable for modules with nontrivial external linkage requirements, port an initial batch of tests to use the standard isorecursive type system. The port involves reordering input types to ensure that supertypes precede their subtypes and inserting rec groups to ensure that structurally identical types maintain their separate identities. More tests will be ported in future PRs before the nominal type system is removed entirely.
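The reordering half of the port (supertypes before subtypes) amounts to a simple ordering over the declared supertype relation. The Python sketch below is an illustration only, with the hypothetical helper `order_supertypes_first`; it does not cover the rec-group insertion step and is not the procedure actually used to port the tests.

```python
def order_supertypes_first(supertype_of):
    """Order type names so each supertype precedes its subtypes.

    `supertype_of` maps a type name to its declared supertype name,
    or to None when it has no declared supertype. The relation is
    assumed to be acyclic, as subtyping declarations are.
    """
    ordered = []
    seen = set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        parent = supertype_of[name]
        if parent is not None:
            visit(parent)  # emit the supertype first
        ordered.append(name)

    for name in supertype_of:
        visit(name)
    return ordered
```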
*	[Wasm GC] Casts of a non-nullable bottom type to non-null fail (#5645)	Alon Zakai	2023-04-12	2	-16/+157
	Casting (ref nofunc) to (ref func) seems like it can succeed based on the rule of "if it's a subtype, it can cast ok." But the fuzzer found a corner case where that leads to a validation error (see testcase). Refactor the cast evaluation logic to handle uninhabitable refs directly, and return Unreachable for them (since the cast cannot even be reached). Also reorder the rule checks there to always check for a non-nullable cast of a bottom type (which always fails).
*	Add a name hint to getValidName() (#5653)	Alon Zakai	2023-04-11	6	-72/+72
	Without the hint, we always look for a valid name using name$0, $1, $2, etc., starting from 0, and in some cases that can lead to quadratic behavior. Noticed on a testcase in the fuzzer that runs for over 24 seconds (I gave up at that point) but takes only 2 seconds with this.
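The quadratic behavior and the effect of a hint can be reproduced with a small probe counter. This Python sketch is a toy, assuming a hypothetical `get_valid_name(root, is_taken, hint)` signature and using the number of names already taken as the hint; it is not Binaryen's `getValidName()`.

```python
def get_valid_name(root, is_taken, hint=0):
    """Return root if it is free, else the first free root$N with N >= hint."""
    if not is_taken(root):
        return root
    index = hint
    while True:
        candidate = f"{root}${index}"
        if not is_taken(candidate):
            return candidate
        index += 1


def count_probes(n, use_hint):
    """Create n names from the same root and count is_taken() calls."""
    taken = set()
    probes = 0

    def is_taken(name):
        nonlocal probes
        probes += 1
        return name in taken

    for _ in range(n):
        # Hypothetical hint: how many names exist so far.
        hint = len(taken) if use_hint else 0
        taken.add(get_valid_name("x", is_taken, hint))
    return probes, len(taken)
```

Without the hint, the k-th request re-scans all earlier suffixes, so the total number of probes grows quadratically; with the hint it grows linearly in this model.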
*	[GUFA] Fix packed field filtering (#5652)	Alon Zakai	2023-04-10	1	-0/+64
	Technically we need to filter both before and after combining, that is, if a location's contents will be filtered by F() then if the new contents are x and old contents y then we need to end up with F(F(x) U F(y)). That is, filtering before is necessary to ensure the union of content does not end up unnecessarily large, and the filtering after is necessary to ensure the final result is properly filtered to fit. (If our representation were perfect then this would not be needed, but it is not, as the union of two exact types can end up as a very large cone, for example.) For efficiency we have been filtering afterwards. But that is not enough for packed fields, it turns out, where we must filter before. If we don't, then if inputs 0 and 0x100 arrive at an i8 field then combining them we get "unknown integer" (which is then filtered by 0xff, but it's too late). By filtering before, the actual values are both 0 and we end up with that as the only possible value. It turns out that filtering before is enough for such fields, so do only that.
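The 0 and 0x100 example can be traced with a three-state abstract value. This is a minimal Python sketch under stated assumptions: `None` stands for "no contents yet", an `int` for a single known constant, and `MANY` for "unknown integer"; the function names are hypothetical and the lattice is far simpler than GUFA's.

```python
MANY = "unknown integer"


def union(a, b):
    """Combine two abstract contents."""
    if a is None:
        return b
    if b is None:
        return a
    return a if a == b else MANY


def filter_i8(value):
    """F(): keep only what fits in a packed 8-bit field."""
    if value is None or value == MANY:
        return value
    return value & 0xFF


def combine_filter_after(old, new):
    # F(x U y): the information is already lost when the filter runs.
    return filter_i8(union(old, new))


def combine_filter_before(old, new):
    # F(x) U F(y): both inputs are truncated before they are merged.
    return union(filter_i8(old), filter_i8(new))
```

Filtering after the union leaves `MANY`, since masking an unknown value recovers nothing, while filtering before the union finds that both inputs are 0.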
*	[Wasm GC] Update struct.get types in TypeRefining (#5649)	Alon Zakai	2023-04-10	1	-0/+49
	We depended on ReFinalize doing it for us, and that usually works, but there is a corner case that depends on knowing all the type changes being done. So use our complete information to update those types in the pass.
*	Fix MemoryPacking handling of array.init_data (#5644)	Thomas Lively	2023-04-07	1	-0/+28
	Do not optimize out or split segments that are referred to by array.init_data instructions. Fixes a bug where segments could get optimized out, producing invalid modules. Doing the work to actually split segments used by array.init_data is left for the future. Also fix a latent UBSan failure revealed by the new test case.
*	Fix and simplify refinalization in OptimizeInstructions (#5642)	Alon Zakai	2023-04-07	1	-0/+89
	The fuzzer found another case we were missing. I realized that we can just check for this in replaceCurrent, at least for places that call that method, which is the common case. So this simplifies the code while fixing a bug.
*	[Wasm GC] Handle packed fields in GUFA and CFP (#5640)	Alon Zakai	2023-04-07	2	-0/+146
	The same bug was present in both: We ignored packing, so writing a larger value than fits in the field would lead to us propagating that original value.
*	[Wasm GC] Fix an assertion in array.set processing in OptimizeInstructions (#5641)	Alon Zakai	2023-04-07	1	-4/+57
*	[Wasm GC] Fix GUFA on ArrayInit (#5638)	Alon Zakai	2023-04-07	1	-0/+59
*	[Wasm GC] OptimizeInstructions: ref.as_non_null of null will trap (#5635)	Alon Zakai	2023-04-06	2	-112/+110
	Trivial peephole optimization. Some work was needed in the tests as some of them relied on that pattern for convenience, so I modified them to try to keep them testing the same thing as much as possible (for one, struct.set.null.fallthrough, I don't think we can actually keep testing the same, as the situation should not be possible any more).