forks/binaryen.git -

	Commit message	Author	Age	Files	Lines
*	Remove legacy WasmGC instructions (#5861)	Thomas Lively	2023-08-09	16	-336/+148
\| \| \| \| \|	Remove old, experimental instructions and type encodings that will not be shipped as part of WasmGC. Updating the encodings and text format to match the final spec is left as future work.
*	LinearExecutionWalker: Optionally connect blocks for Br and BrOn (#5869)	Alon Zakai	2023-08-09	4	-23/+81
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Br and BrOn can consider the code before and after them connected if it might be reached (which is the case if the Br has a condition, which BrOn always has).
	The wasm2js changes may look a little odd as some of them have this:
	    i64toi32_i32$1 = i64toi32_i32$2;
	    i64toi32_i32$1 = i64toi32_i32$2;
	I looked into that and the reason is that those outputs are not optimized, and also even in unoptimized wasm2js we do run simplify-locals once (to try to reduce the downsides of flatten). As a result, this PR makes a difference there, and that difference can lead to such odd duplicated code after other operations. However, there are no changes to optimized wasm2js outputs, so there is no actual problem.
	Followup to #5860.
*	OptimizeCasts: Connect adjacent blocks in LinearExecutionWalker (#5866)	Alon Zakai	2023-08-08	2	-21/+80
\| \| \| \| \| \| \|	Followup to #5860, this does the same for (part of) OptimizeCasts. As there, this is valid because it's ok if we branch away. This part of the pass picks a different local to get when it knows locals have the same values but one is more refined. It is ok to add a tee earlier even if it isn't used later.
*	LocalCSE: Connect adjacent blocks in LinearExecutionWalker (#5867)	Alon Zakai	2023-08-08	2	-4/+53
\| \| \| \| \| \| \|	Followup to #5860, this does the same for LocalCSE. As there, this is valid because it's ok if we branch away. This pass adds a local.tee of a reused value and then gets it later, and it's ok to add a tee even if we branch away and do not use it.
*	SimplifyGlobals: Connect adjacent blocks in LinearExecutionWalker (#5865)	Alon Zakai	2023-08-08	1	-0/+55
\| \| \| \| \| \| \|	Followup to #5860, this does the same for SimplifyGlobals as for SimplifyLocals. As there, this is valid because it's ok if we branch away. This part of the pass applies a global value to a global.get based on a dominating global.set, so any dominance is good enough for us.
*	LinearExecutionTraversal: Add an option to connect adjacent code, use in SimplifyLocals (#5860)	Alon Zakai	2023-08-08	5	-18/+116
\| \| \| \| \| \| \| \| \| \| \|	This addresses most of the minor regression from the correctness fix in #5857. That PR makes us consider calls as branching, but in some cases it is ok to ignore that branching (see the comment in the code here), which this PR allows as an option. This undoes one test change from that PR, showing it undoes the regression for SimplifyLocals. More tests are added to cover this specifically as well.
*	Fix LinearExecutionWalker on calls (#5857)	Alon Zakai	2023-08-07	4	-2/+205
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Calls were simply not handled there, so we could think we were still in the same basic block when we were not, affecting various passes (but somehow this went unnoticed until the TNHOracle #5850 ran on some particular Java code). One existing test was affected, and two new tests are added: one for TNHOracle where I detected this, and one in OptimizeCasts which is perhaps a simpler way to see the problem. All the cases but the TNH one, however, do not need this fix for correctness since they actually don't care if a call would throw. As a TODO, we should find a way to undo this minor regression. The regression only affects builds with EH enabled, though, so most users should be unaffected even in the interim.
*	Lattice to model Stack (#5849)	Bruce He	2023-08-03	1	-34/+117
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change introduces StackLattice, a lattice to model stack-related behavior. It is templated on a separate lattice whose elements model some property of individual values on the stack. The StackLattice allows users to access the top of the stack, push abstract values, and pop them. Comparisons and least upper bound operations are done on a value by value basis starting from the top of the stack and moving toward the bottom. This is because it allows stacks from different scopes to be joined easily. An application of StackLattice is to model the wasm value stack. The goal is to organize lattice elements representing individual stack values in a natural way which mirrors the wasm value stack. Transfer functions operate on each stack value individually. The stack lattice is an intermediate structure which is not intended to be directly operated on. Rather, it simulates the push and pop behavior of instructions.
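The top-down join described above can be sketched roughly as follows. This is a toy illustration, not Binaryen's StackLattice: `MaxLattice`, `ToyStack`, and the choice to keep unmatched lower values unchanged when the stacks have different heights are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element lattice: integers ordered by <=, joined with max.
struct MaxLattice {
  using Element = int;
  static Element join(Element a, Element b) { return std::max(a, b); }
};

// Toy stack lattice templated on an element lattice L. The back of the
// vector is the top of the stack.
template<typename L> struct ToyStack {
  std::vector<typename L::Element> values;

  void push(typename L::Element v) { values.push_back(v); }

  typename L::Element pop() {
    auto v = values.back();
    values.pop_back();
    return v;
  }

  const typename L::Element& top() const { return values.back(); }

  // Least upper bound, computed value by value from the top downward.
  // Assumption: where only one stack has a value at some depth, that value
  // is kept unchanged.
  static ToyStack join(const ToyStack& a, const ToyStack& b) {
    const ToyStack& longer = a.values.size() >= b.values.size() ? a : b;
    const ToyStack& shorter = a.values.size() >= b.values.size() ? b : a;
    ToyStack result = longer;
    size_t n = shorter.values.size();
    for (size_t i = 0; i < n; i++) {
      // i counts positions down from the top of each stack.
      auto& r = result.values[result.values.size() - 1 - i];
      r = L::join(r, shorter.values[n - 1 - i]);
    }
    return result;
  }
};
```

Aligning the stacks at the top rather than the bottom is what lets a stack from an inner scope be joined with a taller one from an outer scope.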
*	Fix a fuzz bug in TypeMapper (#5851)	Thomas Lively	2023-08-02	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TypeMapper is a utility used to globally rewrite types, mapping some eliminated source types into destination types they should be replaced with. This was previously done by first rewriting all the types in the IR according to the given mapping, then rewriting the type definitions and updating all the types in the IR again. Not only was doing the rewriting twice inefficient, it also introduced a subtle bug where the set of private types eligible to be rewritten could be inconsistent because updating types in the IR could change the types of control flow structures. The fuzzer found a case where this inconsistency caused the type rebuilding to fail. Fix the bug by first building the new types with the mapping applied and only then rewriting the IR a single time. Also add a `TypeBuilder::dump` utility for use in debugging. Fixes #5845.
*	GUFA: Infer using TrapsNeverHappen (#5850)	Alon Zakai	2023-08-02	3	-0/+3131
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a TrapsNeverHappen oracle that is used inside the main PossibleContents oracle of GUFA. The idea is that when traps never happen we can reason "backwards" from information to things that must be true before it:
	    temp = x.field;
	    x.cast_to<Y>(); // Y is a subtype of x's type X
	Here we cast x to a subtype. If we assume traps never happen then the cast must succeed, and that means we can assume we had a Y on the previous line, where perhaps that information lets us infer the value of x.field.
	This PR focuses on calls, which are the more interesting situation to optimize because other passes do some work already inside functions. Specifically, we look for things that will trap in the called function or the caller, such as if the called function always casts a param to some type, we can assume the caller passes such a type in. And if we have a call_ref then any target that would trap cannot be called (at least in a closed world).
	This has some benefits, in particular when combined with --gufa-cast-all since that casts more things, which lets us apply the inferences made here. I see 3.3% fewer call_ref instructions on a Kotlin testcase, for example. This helps more on -Os when we inline less.
*	[Wasm GC] Stop printing deprecated cast etc. instructions (#5852)	Thomas Lively	2023-08-02	7	-10/+10
\| \| \| \| \| \| \| \|	Stop printing `ref.as_i31`, `br_on_func`, etc. because they have been removed from the spec and are no longer supported by V8. #5614 already made this change for the binary format. Like that PR, leave reading unmodified in case someone is still using these instructions (even though they are useless). They will be fully removed in a future PR as we finalize things ahead of standardizing WasmGC.
*	Fix binary writing of strings without GC enabled (#5836)	Alon Zakai	2023-07-31	1	-0/+12
\|
*	GUFA: Add a version that casts all of our inferences (#5846)	Alon Zakai	2023-07-27	3	-0/+154
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GUFA refines existing casts, but does not add new casts for fear of increasing code size and adding more cast operations at runtime. This PR adds a version that does add all those casts, and it looks like code size improves rather than regresses, at least on J2Wasm and Kotlin. That is, this pass adds a lot more casts, but subsequent optimizations benefit enough to shrink overall code size.
	However, this may still not be worthwhile, as even if code size decreases we may end up doing more casts at runtime, and those casts might be hard to remove, e.g.:
	    (call $foo
	      (x) ;; inferred to be non-null
	    )
	    (func $foo (param (ref null $A)
	    =>
	    (call $foo
	      (ref.cast $A
	        (x) ;; add a cast here
	      )
	    )
	    (func $foo (param (ref $A) ;; later pass refines here
	That new cast cannot be removed after we refine the function parameter. If the function never benefits from the fact that the input is non-null, then the cast is wasted work (e.g. if the function only compares the input to another value).
	To use this new pass, try --gufa-cast-all rather than --gufa. As with normal GUFA, running the full optimizer afterwards is important, and even more important in order to get rid of as many of the new casts as possible.
*	[NFC] Port passes remove-unused-brs_all-features.wast to lit (#5843)	Thomas Lively	2023-07-27	1	-0/+225
\| \| \| \|	Port the test automatically using the port_passes_tests_to_lit.py script. As a drive-by, fix a typo in the script as well.
*	Fix a crash in TypeRefining on bottom types (#5842)	Alon Zakai	2023-07-27	1	-1/+43
\| \| \|	Followup to #5840
*	Add a Fuzzer for Lattice and Transfer Function Properties (#5831)	Bruce He	2023-07-26	1	-0/+1305
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change adds a fuzzer which checks the following properties in abstract interpretation static analyses.
	- Transfer Function Monotonicity
	- Lattice Element Reflexivity
	- Lattice Element Transitivity
	- Lattice Element Anti-Symmetry
	This is done by randomly generating a module and using its functions as transfer function inputs, along with randomly generated lattice elements (states). Lattice element properties are fuzzed from the randomly generated states also.
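As a rough illustration of the four properties, here is a minimal sketch over a tiny hypothetical lattice (subsets of a three-element set, stored as bit masks and ordered by inclusion). Unlike the fuzzer, it enumerates every element exhaustively rather than generating states at random, and none of the names below come from Binaryen.

```cpp
#include <cassert>
#include <functional>

// Hypothetical lattice: subsets of {0, 1, 2} as 3-bit masks, ordered by
// set inclusion.
constexpr int NumElements = 8;

// a <= b in the lattice iff every bit set in a is also set in b.
bool lessEq(int a, int b) { return (a & b) == a; }

// Reflexivity: a <= a for every element.
bool isReflexive() {
  for (int a = 0; a < NumElements; a++) {
    if (!lessEq(a, a)) return false;
  }
  return true;
}

// Anti-symmetry: a <= b and b <= a imply a == b.
bool isAntiSymmetric() {
  for (int a = 0; a < NumElements; a++) {
    for (int b = 0; b < NumElements; b++) {
      if (lessEq(a, b) && lessEq(b, a) && a != b) return false;
    }
  }
  return true;
}

// Transitivity: a <= b and b <= c imply a <= c.
bool isTransitive() {
  for (int a = 0; a < NumElements; a++) {
    for (int b = 0; b < NumElements; b++) {
      for (int c = 0; c < NumElements; c++) {
        if (lessEq(a, b) && lessEq(b, c) && !lessEq(a, c)) return false;
      }
    }
  }
  return true;
}

// Monotonicity of a transfer function f: a <= b implies f(a) <= f(b).
bool isMonotonic(const std::function<int(int)>& f) {
  for (int a = 0; a < NumElements; a++) {
    for (int b = 0; b < NumElements; b++) {
      if (lessEq(a, b) && !lessEq(f(a), f(b))) return false;
    }
  }
  return true;
}
```

A transfer function that only adds information (for example, setting a bit) passes the monotonicity check, while one that inverts the state does not.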
*	TypeRefining: Add casts when we must (#5840)	Alon Zakai	2023-07-26	1	-0/+109
\| \| \| \| \| \| \| \| \|	See the example in the code and test for a situation that requires this for validation. To fix validation we add a cast. That should practically always be removed by later optimizations, and the fact it took the fuzzer this long to even find such a situation also adds confidence that this won't be adding overhead (and in this situation, the optimizer will definitely remove the cast).
*	wasm-merge: Fix locals in merged start (#5837)	Alon Zakai	2023-07-26	3	-3/+31
\| \| \| \| \| \| \| \| \|	Start functions can have locals, which we previously ignored as we just concatenated the bodies together. This makes us copy the second start and call that, keeping them separate (the optimizer can then inline, if that makes sense). Fixes #5835
*	SimplifyLocals: Refinalize after removing redundant tees (#5830)	Alon Zakai	2023-07-21	1	-0/+24
\|
*	MemoryPacking: memory.init fixes for 64 bit (#5809)	Arthur Islamov	2023-07-18	1	-0/+38
\| \| \| \| \| \|	Fixes emscripten-core/emscripten#17485. This allows emscripten to compile code with MEMORY64 + PTHREADS by using the proper pointer type in the MemoryPacking pass.
*	wasm-merge: Error on import loops (#5820)	Alon Zakai	2023-07-17	1	-10/+4
\|
*	[wasm-merge] Handle chains of import/export (#5813)	Jérôme Vouillon	2023-07-17	5	-0/+52
\| \| \| \| \| \| \|	When a module item is imported and directly reexported by an intermediate module, we need to perform several name lookups and use its name in the initial module rather than the intermediate name when fusing imports and exports.
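The repeated name lookup can be pictured with a small sketch. This is an assumption-laden toy, not wasm-merge's code: `Name`, `Reexports`, and `resolve` are hypothetical, and the loop check only mirrors the spirit of the import-loop error from #5820.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// A (module name, item name) pair.
using Name = std::pair<std::string, std::string>;

// Hypothetical table: an export that merely re-exports an import maps to the
// import it forwards to. Exports of locally defined items are absent.
using Reexports = std::map<Name, Name>;

// Follows a chain of re-exports until reaching an item that is not itself a
// re-export, and returns that item's name in the module that defines it.
// Throws if the chain loops back on itself.
Name resolve(const Reexports& reexports, Name name) {
  std::set<Name> seen;
  while (true) {
    auto it = reexports.find(name);
    if (it == reexports.end()) {
      return name;
    }
    if (!seen.insert(name).second) {
      throw std::runtime_error("import loop");
    }
    name = it->second;
  }
}
```

With a chain of two intermediate modules, `resolve` returns the name used in the first module, which is the name the fused import should refer to.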
*	Add a pass to sort functions by name (#5811)	Alon Zakai	2023-07-12	3	-0/+113
\|
*	GUFA: Refine casts (#5805)	Alon Zakai	2023-07-07	2	-7/+48
\| \| \| \| \| \| \|	If we see (ref.cast $A) but we have inferred that a more refined type will be present there at runtime $B then we can refine the cast to (ref.cast $B). We could do the same even when a cast is not present, but that would increase code size. This optimization keeps code size constant.
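The decision itself is small. A minimal sketch, assuming a hypothetical acyclic hierarchy given as a map from each type name to its declared supertype (the helper names are invented for illustration and are not GUFA's API):

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical type hierarchy: each type name maps to its declared supertype.
// Assumed to be acyclic.
using Supertypes = std::map<std::string, std::string>;

// Returns true if sub is the same type as super or a descendant of it.
bool isSubType(const Supertypes& supers,
               std::string sub,
               const std::string& super) {
  while (true) {
    if (sub == super) return true;
    auto it = supers.find(sub);
    if (it == supers.end()) return false;
    sub = it->second;
  }
}

// Picks the target of an existing cast: the inferred type if it refines the
// current target, otherwise the current target unchanged.
std::string refineCastTarget(const Supertypes& supers,
                             const std::string& castTarget,
                             const std::string& inferred) {
  return isSubType(supers, inferred, castTarget) ? inferred : castTarget;
}
```

Only the target of a cast that already exists changes, so the number of instructions stays the same.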
*	Initial support for `final` types (#5803)	Thomas Lively	2023-07-06	4	-64/+138
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement support in the type system for final types, which are not allowed to have any subtypes. Final types are syntactically different from similar non-final types, so type canonicalization is made aware of finality. Similarly, TypeMerging and TypeSSA are updated to work correctly in the presence of final types as well. Implement binary and text parsing and emitting of final types. Use the standard text format to represent final types and interpret the non-standard "struct_subtype" and friends as non-final. This allows a graceful upgrade path for users currently using the non-standard text format, where they can update their code to use final types correctly at the point when they update to use the standard format. Once users have migrated to using the fully expanded standard text format, we can update Binaryen's parsers to interpret the MVP shorthands as final types to match the spec without breaking those users. To make it safe for V8 to independently start interpreting types declared without `sub` as final, also reserve that shorthand encoding only for types that have no strict subtypes.
*	[DeadArgumentElimination] Optimize function body after parameter refinement (#5799)	Jérôme Vouillon	2023-07-06	1	-0/+42
\| \| \| \| \| \|	It can be useful to optimize a function body after its parameters are refined, like we do for other parameter changes.
*	[NFC] Fix check lines in nominal-good.wast (#5802)	Thomas Lively	2023-07-06	1	-31/+29
\| \| \| \|	Delete old, unused "NOMINAL" check lines and replace the sole remaining check prefix, "HYBRID", with the standard "CHECK".
*	Print supertype declarations using the standard format (#5801)	Thomas Lively	2023-07-06	34	-259/+259
\| \| \| \| \| \|	Use the standard "(sub $super ...)" format instead of the non-standard "XXX_supertype ... $super" format. In a follow-on PR implementing final types, this will allow us to print and parse the standard text format for final types right away with a smaller diff.
*	Heap2Local: Add a test for params (#5798)	Alon Zakai	2023-07-05	1	-0/+87
\| \| \| \|	This already worked (thanks to LocalGraph integration), but add an explicit test to verify that just to be sure.
*	OptimizeInstructions: Loop on fallthrough values in RefTest (#5797)	Alon Zakai	2023-07-05	1	-0/+55
\| \| \| \| \|	This parallels the code in RefCast. Previously we only looked at the type reaching us, but intermediate fallthrough values can let us optimize too. In particular, we were not optimizing (ref.test (local.tee ..)) if the tee was to a less-refined type.
*	Limit printing of Literal[s] in a general way (#5792)	Alon Zakai	2023-06-28	1	-0/+43
\| \| \| \| \| \| \| \|	Previously we limited printing in a single Literals. But we can have infinitely recursive GC literals, or just huge graphs even without infinite recursion where no single Literals is that big (but we still get exponential blowup). This PR adds a general limit on how much we print once we start to print a Literal or Literals.
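One way to picture a limit that applies to the whole print rather than to a single container is a budget shared across the recursion. The sketch below is an illustration only: `Node`, `print`, and `printLimited` are hypothetical, and the `...` marker is an assumption rather than Binaryen's actual output.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical GC-like value: an integer payload plus references to other
// values, which may form cycles.
struct Node {
  int value;
  std::vector<const Node*> children;
};

// Prints a node and its children, spending one unit of a budget that is
// shared by the entire traversal. Once the budget runs out, each remaining
// node is replaced by "...".
void print(const Node& node, std::ostream& os, size_t& budget) {
  if (budget == 0) {
    os << "...";
    return;
  }
  budget--;
  os << '(' << node.value;
  for (const Node* child : node.children) {
    os << ' ';
    print(*child, os, budget);
  }
  os << ')';
}

// Prints at most `limit` nodes in total, starting from `node`.
std::string printLimited(const Node& node, size_t limit) {
  std::ostringstream os;
  print(node, os, limit);
  return os.str();
}
```

Because the budget is passed by reference, it bounds both infinitely recursive values and large shared graphs in which no single container is big.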
*	Fix opt/shrink levels when running the optimizer multiple times, Part 2 (#5787)	Alon Zakai	2023-06-27	2	-8/+45
\| \| \| \| \| \| \| \| \| \| \|	This is a followup to #5333. That fixed the selection of which passes to run, but forgot to also fix the global state of the current optimize/shrink levels. This PR fixes that. As a result, running -O3 -Oz will now work as expected: the first -O3 will run the right passes (as #5333 fixed) and while running them, the global optimize/shrinkLevels will be -O3 (and not -Oz), which this PR fixes. A specific result of this is that -O3 -Oz used to inline less, since the invocation of inlining during -O3 thought we were optimizing for size. The new test verifies that we do fully inline in the first -O3 now.
*	PostEmscripten: Preserve __em_js__ exports in side modules (#5780)	Sam Clegg	2023-06-23	2	-1/+49
\|
*	Fix pop assertion (#5777)	Alon Zakai	2023-06-20	1	-0/+28
\| \| \|	Subtypes are allowed as well, not just exact matches, in the pop value's type.
*	[EH] Add pass to remove EH instructions (#5770)	Heejin Ahn	2023-06-15	3	-0/+121
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This pass strips all EH stuff, including EH instructions and tags, from the input module and disables the EH feature from the features section.
	1. This removes `catch` and `catch_all` blocks from the code. So
	```wast
	(try
	  (do
	    (some code)
	  )
	  (catch ...
	  )
	)
	```
	becomes just `(some code)`. Note that all `rethrow`s will be removed with `catch`es.
	2. This converts `throw (...)` into `unreachable`.
	3. This removes all tags from the module, which are unused anyway after 1 and 2.
	4. This removes the exception handling feature from the features section.
	You can use the pass with
	```console
	$ wasm-opt --enable-exception-handling --strip-eh INPUT -o OUTPUT
	```
	This is not an optimization pass, so it is not run unless you specify the pass explicitly. This is in effect similar to Clang's `-fignore-exceptions`, in which you can throw but it will result in a crash and we compile away all landing pads. This can be used for people who don't (or can't) use `-fignore-exceptions` in their build settings or who want to compile away `catch` blocks later.
	Closes emscripten-core/emscripten#19585.
*	EffectAnalyzer: Assume we execute the two things whose effects we compare (#5764)	Alon Zakai	2023-06-13	2	-0/+145
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	EffectAnalyzer::canReorder/invalidate now assume that the things from which we generated the effects both execute (or, rather, that if the first of them doesn't transfer control flow then they execute). If they both execute then we can do more work in TrapsNeverHappen mode, since we can then reorder this for example:
	    (global.set ..)
	    (i32.load ..)
	The load may trap, but in TNH mode we assume it won't. So we can reorder those two. However, if they did not both execute then we could be in this situation:
	    (global.set ..)
	    (br_if ..)
	    (i32.load)
	Reordering the load and the set here would be invalid, because we could make the load execute when it didn't execute before, and it could now start to actually trap at runtime.
	This new assumption seems obvious, since we don't compare the effects of things unless they are adjacent and with no control flow between them - otherwise, why compare them? To be sure, I manually reviewed every single use of EffectAnalyzer::canReorder/invalidate in the entire codebase. I've also been fuzzing this for several days (hundreds of thousands of iterations), and have not seen any problem.
	This was motivated by seeing that #5744 should be able to do more work in TNH mode, but it wasn't. New tests show the benefits there in OptimizeCasts as well as in SimplifyLocals.
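A heavily simplified model of the reordering decision might look like the sketch below. It is not the real EffectAnalyzer: the `Effects` fields and the free function `canReorder` are hypothetical, and the rules cover only the global.set / i32.load / br_if cases discussed above.

```cpp
#include <cassert>

// Hypothetical, simplified effect summary of one expression.
struct Effects {
  bool writesGlobal = false;     // e.g. global.set
  bool readsGlobal = false;      // e.g. global.get
  bool mayTrap = false;          // e.g. i32.load
  bool transfersControl = false; // e.g. br_if
};

// Decides whether two adjacent expressions may be swapped. Assumes, as the
// commit message describes, that both expressions execute unless the first
// transfers control flow.
bool canReorder(const Effects& a, const Effects& b, bool trapsNeverHappen) {
  // Never move code across a possible transfer of control flow.
  if (a.transfersControl || b.transfersControl) {
    return false;
  }
  // Conflicting accesses to global state must keep their order.
  if ((a.writesGlobal && (b.readsGlobal || b.writesGlobal)) ||
      (b.writesGlobal && a.readsGlobal)) {
    return false;
  }
  // A possible trap must not cross an observable write, unless we assume
  // traps never happen.
  if (!trapsNeverHappen &&
      ((a.mayTrap && b.writesGlobal) || (b.mayTrap && a.writesGlobal))) {
    return false;
  }
  return true;
}
```

In this model a global.set and a possibly trapping load can be swapped only in TNH mode, and a br_if blocks reordering in either mode.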
*	DeadArgumentElimination: Do not error on bottom types in result refining (#5763)	Alon Zakai	2023-06-12	1	-0/+36
\| \| \| \|	More generally, the LUB computation that code relies on did not handle bottom types properly.
*	ConstantFieldPropagation: Track copied values properly (#5761)	Alon Zakai	2023-06-12	1	-0/+70
\| \| \| \|	The logic ignored copied values, which was fine for struct.get operations but not for struct.new.
*	Update br_on_cast binary and text format (#5762)	Thomas Lively	2023-06-12	11	-113/+44
\| \| \| \| \| \| \| \| \| \| \| \|	The final versions of the br_on_cast and br_on_cast_fail instructions have two reference type annotations: one for the input type and one for the cast target type. In the binary format, this is represented as a flags byte followed by two encoded heap types. Upgrade all of the tests at once to use the new versions of the instructions and drop support for the old instructions from the text parser. Keep support in the binary parser to avoid breaking users, though. Drop some binary tests of deprecated instruction encodings that would be more effort to update than they're worth. Re-land with fixes of #5734
*	TypeRefining: Fix a bug with chains of StructGets (#5757)	Alon Zakai	2023-06-08	1	-0/+55
\| \| \| \| \| \| \| \| \| \| \| \| \|	If we have (struct.get $A (struct.get $B then if both types end up refined we may have a problem. If the inner one is refined to emit nullref then the outer one no longer knows what type it is, since it depends on the type of the ref child for that in our IR. We can't just skip updating it, as the outside may depend on its new refined type to validate. To avoid errors here, just make this code that is effectively unreachable also actually unreachable.
*	[Strings] Fix non-nullable string emitting in the binary format (#5756)	Alon Zakai	2023-06-07	1	-0/+14
\| \| \|	Related to #5737 which did something similar for other types.
*	Move casts which are immediate children of local.gets to earlier local.gets (#5744)	Bruce He	2023-06-06	1	-7/+842
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the OptimizeCasts pass, it is useful to move more refined casts as early as possible without causing side-effects. This will allow such casts to potentially trap earlier, and will allow the OptimizeCasts pass to use more refined casts earlier. This change allows a more refined cast to be duplicated at an earlier local.get expression. The later instance of the cast will then be eliminated in a later optimization pass.
	For example, if we have the following instructions:
	    (drop
	      (local.get $x)
	    )
	    (drop
	      (ref.cast $A
	        (local.get $x)
	      )
	    )
	    (drop
	      (ref.cast $B
	        (local.get $x)
	      )
	    )
	Where $B is a subclass of $A, we can convert this to:
	    (drop
	      (ref.cast $B
	        (local.get $x)
	      )
	    )
	    (drop
	      (ref.cast $A
	        (local.get $x)
	      )
	    )
	    (drop
	      (ref.cast $B
	        (local.get $x)
	      )
	    )
	Concretely we will save the first cast to a local and use it in the other local.gets.
*	Fix emitting of function reference types without GC (#5737)	Thomas Lively	2023-06-05	1	-0/+61
\| \| \| \| \| \| \| \| \|	We previously had logic to emit GC types used in the IR as their corresponding top types when GC was not enabled (so e.g. nullfuncref would be emitted as funcref), but the logic was not robust enough and non-null function references were not properly emitted as funcref. Refactor the relevant code to be more robust and future-proof, and add a test demonstrating that the lowering works as intended.
*	StackIR: Remove nops (#5746)	Alon Zakai	2023-05-30	2	-6/+8
\| \| \| \| \| \| \|	No nop instruction is necessary in wasm, so in StackIR we can simply remove them all. Fixes #5745
*	wasm-merge: Preserve imports when copying module items (#5743)	Jérôme Vouillon	2023-05-26	6	-8/+72
\| \| \| \|	The import information of Tags and Memories was not preserved.
*	Revert "Update br_on_cast binary and text format (#5734)" (#5740)	Alon Zakai	2023-05-23	11	-44/+113
\| \| \| \| \| \| \|	This reverts commit b7b1d0df29df14634d2c680d1d2c351b624b4fbb. See comment at the end of #5734: It turns out that dropping the old opcodes causes problems for current users, so let's revert this for now, and later we can figure out how best to do the update.
*	TypeSSA: Handle collisions by adding a hash to ensure a fresh rec group (#5724)	Alon Zakai	2023-05-19	1	-0/+33
\| \| \|	Fixes #5720
*	Update br_on_cast binary and text format (#5734)	Thomas Lively	2023-05-19	11	-113/+44
\| \| \| \| \| \| \| \| \| \|	The final versions of the br_on_cast and br_on_cast_fail instructions have two reference type annotations: one for the input type and one for the cast target type. In the binary format, this is represented as a flags byte followed by two encoded heap types. Since these instructions have been in flux for a while, do not attempt to maintain backward compatibility with older versions of the instructions. Instead, upgrade all of the tests at once to use the new versions of the instructions. Drop some binary tests of deprecated instruction encodings that would be more effort to update than they're worth.
*	Vacuum code leading up to a trap in TrapsNeverHappen mode (#5228)	Alon Zakai	2023-05-17	2	-2/+436
\| \| \| \| \| \| \| \| \| \| \| \|	This adds two rules to vacuum in TNH mode:
	    if (..) trap() => if (..) {}
	    { stuff, trap() } => {}
	That is, we assume traps never happen so an if will not branch to one, and code right before a trap can be assumed to not execute. Together, we should be removing practically all possible code in TNH mode (though we could also add support for br_if etc.).
*	Print function types on function imports in the text format (#5727)	Alon Zakai	2023-05-17	38	-120/+120
\| \| \| \|	The function type should be printed there just like for non-imported functions.