forks/binaryen.git -

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[Outlining] Adds Outlining pass (#6110)	Ashley Nelson	2023-11-13	4	-26/+118
\| \| \|	Adds an outlining pass that performs outlining on a module end to end, and two tests.
*	[NFC] StackIR: Add comments on local2stack handling of tuples (#6092)	Alon Zakai	2023-11-09	1	-1/+138
\| \| \| \|	Also add testcases to be comprehensive and notice changes if we ever decide to modify that behavior.
*	Heap2Local: Fix an ordering issue with children having different ↵	Alon Zakai	2023-11-09	1	-0/+116
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	interactions with a parent (#6089) We had a simple rule that if we reach an expression twice then we give up, which makes sense for say a block: if one allocation flows out of it, then another can't - it would get mixed in with the other one, which is a case we don't optimize. However, there are cases where a parent has multiple children and different interactions with them, like a struct.set: the reference child does not escape, but the value child does. Before this PR if we reached the value child first, we'd mark the parent as seen, and then the reference child would see it isn't the first to get here, and not optimize. To fix this, reorder the code to handle this case. The manner of interaction between the child and the parent decides whether we mark the parent as seen and to be further avoided. Noticed by the determinism fuzzer, since the order of analysis mattered here.
*	[analysis] Add an experimental TypeGeneralizing optimization (#6080)	Thomas Lively	2023-11-08	1	-0/+277
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This new optimization will eventually weaken casts by generalizing (i.e. un-refining) their output types. If a cast is weakened enough that its output type is a supertype of its input type, the cast will be able to be removed by OptimizeInstructions. Unlike refining cast inputs, generalizing cast outputs can break module validation. For example, if the result of a cast is stored to a local and the cast is weakened enough that its output type is no longer a subtype of that local's type, then the local.set after the cast will no longer validate. To avoid this validation failure, this optimization would have to generalize the type of the local as well. In general, the more we can generalize the types of program locations, the more we can weaken casts of values that flow into those locations. This initial implementation only generalizes the types of locals and does not actually weaken casts yet. It serves as a proof of concept for the analysis required to perform the full optimization, though. The analysis uses the new analysis framework to perform a reverse analysis tracking type requirements for each local and reference-typed stack value in a function. Planned and potential future work includes: - Implementing the transfer function for all kinds of expressions. - Tracking requirements on the dynamic types of each location to generalize allocations as well. - Making the analysis interprocedural and generalizing the types of more program locations. - Optimizing tuple-typed locations. - Generalizing only those locations necessary to eliminate at least one cast (although this would make the anlysis bidirectional, so it is probably better left to separate passes).
*	Move --separate-data-segments into a pass so it can be run from wasm-opt (#6088)	Sam Clegg	2023-11-08	3	-0/+28
\| \| \| \| \| \| \| \|	Because we currently strip some data segments (i.e. EM_JS strings) during `--post-emscripten` this is too late as `--separate-data-segments` always runs in `wasm-emscripten-finalize`. Once emscripten switches over to using the pass directly we can remove the support from `wasm-emscripten-finalize`
*	LocalCSE: Do not optimize small things like global.get (#6087)	Alon Zakai	2023-11-08	2	-44/+39
\| \| \| \| \| \| \| \| \|	LocalCSE is nice for large expressions, but for small things it has always been of unclear benefit since VMs also do GVN/CSE anyhow. So we are likely not speeding anything up, but hopefully we are reducing code size at least. Doing LocalCSE on something small like a global.get is very possibly going to increase code size, however (since we add a tee, and since the local gets are of similar size to global gets - depends on LUB sizes). On real-world Java code that overhead is noticeable, so this PR makes us more careful, and we skip things of size 1 (no children).
*	[Parser] Parse `call` and `return_call` (#6086)	Thomas Lively	2023-11-07	1	-66/+92
\| \| \| \|	To support parsing calls, add support for parsing function indices and building calls with IRBuilder.
*	Update CFGWalker to generate consolidated exit blocks (#6079)	Thomas Lively	2023-11-06	1	-2/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously CFGWalker designated a particular block as the "exit" block, but it was just the block that happened to appear at the end of the function that returned values by implicitly flowing them out. That exit block was not tied in any way to other blocks that might end in returns, so analyses that needed to perform some action at the end of the function would have had to perform that action at the end of the designated exit block but also separately at any return instruction. Update CFGWalker to make the exit block a synthetic empty block that is a successor of all other blocks tthat implicitly or explicitly return from the function in case there are multiple such blocks, or to make the exit block the single returning block if there is only one. This means that analyses will only perform their end-of-function actions at the end of the exit block rather than additionally at every return instruction.
*	Implement table.copy (#6078)	Alon Zakai	2023-11-06	5	-40/+2324
\| \| \|	Helps #5951
*	[analysis] Allow joining a single vector element efficiently (#6071)	Thomas Lively	2023-11-02	1	-0/+43
\| \| \| \| \| \| \| \| \|	Previously, modifying a single vector element of a `Shared<Vector>` element required materializing a full vector to do the join. When there is just a single element to update, materializing all the other elements with bottom value is useless work. Add a `Vector<L>::SingletonElement` utility that represents but does not materialize a vector with a single non-bottom element and allow it to be passed to `Vector<L>::join`. Also update `Shared` and `Inverted` so that `SingletonElement` joins still work on vectors wrapped in those other lattices.
*	[analysis] Simplify the stack lattice (#6069)	Thomas Lively	2023-11-02	2	-97/+26
\| \| \| \| \| \| \|	Remove the ability to represent the top element of the stack lattice since it isn't necessary. Also simplify the element type to be a simple vector, update the lattice method implementations to be more consistent with implementations in other lattices, and make the tests more consistent with the tests for other lattices.
*	[analysis] Add a "Shared" lattice to represent shared state (#6067)	Thomas Lively	2023-11-02	1	-0/+108
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The analysis framework stores a separate lattice element for each basic block being analyzed to represent the program state at the beginning of the block. However, in many analyses a significant portion of program state is not flow-sensitive, so does not benefit from having a separate copy per block. For example, an analysis might track constraints on the types of locals that do not vary across blocks, so it really only needs a single copy of the constrains for each local. It would be correct to simply duplicate the state across blocks anyway, but it would not be efficient. To make it possible to share a single copy of a lattice element across basic blocks, introduce a `Shared<L>` lattice. Mathematically, this lattice represents a single ascending chain in the underlying lattice and its elements are ordered according to sequence numbers corresponding to positions in that chain. Concretely, though, the `Shared<L>` lattice only ever materializes a single, monotonically increasing element of `L` and all of its elements provide access to that shared underlying element. `Shared<L>` will let us get the benefits of having mutable shared state in the concrete implementation of analyses without losing the benefits of keeping those analyses expressible purely in terms of the monotone framework.
*	Fuzzer: Better handling of globals from initial content (#6072)	Alon Zakai	2023-11-01	1	-45/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously the fuzzer never added gets or sets of globals from initial content. That was an oversight, I'm pretty sure - it's just that the code that sets up the lists from which we pick globals for gets and sets was in another place. That is, any globals in the initial content file were never used in new random code the fuzzer generates (only new globals the fuzzer generated were used there). This PR allows us to use those globals, but also ignores them with some probability, to avoid breaking patterns like "once" globals (that we want to only be used from initial content, at least much of the time). Also simplify the code here: we don't need isInvalidGlobal just to handle the hang limit global, which is already handled by not being added to the lists we pick names from anyhow.
*	[Wasm64] Fix PostEmscripten::optimizeExceptions on invokes with an i64 ↵	Alon Zakai	2023-11-01	1	-0/+31
\| \| \| \| \| \| \| \|	argument (#6074) In wasm64, function pointers are 64-bit like all pointers. fixes #6073
*	NonNullable => !Defaultable in SimplifyLocals (#6070)	Alon Zakai	2023-11-01	1	-0/+24
\| \| \|	We handled references but not tuples in one place.
*	[analysis][NFC] Refactor lattice unit tests (#6065)	Thomas Lively	2023-11-01	1	-420/+145
\| \| \| \| \| \|	Many of the lattice tests were essentially copy-pasted from one lattice to the next because they all tested isomorphic subsets of the various lattices, specifically in the shape of a diamond. Refactor the code so that all lattices that have tests of this shape use the same utility test functions.
*	[analysis] Add a lattice for value types (#6064)	Thomas Lively	2023-11-01	1	-0/+112
\| \| \| \| \| \|	Add a lattice that is a thin wrapper around `wasm::Type` giving it the interface of a lattice. As usual, `Type::unreachable` is the bottom element, but unlike in the underlying API, we uniformly treat `Type::none` as the top type so that we have a proper lattice.
*	OnceReduction: Optimize bodies of trivial "once" functions (#6061)	Alon Zakai	2023-10-31	1	-50/+456
\| \| \| \|	In particular, if the body just calls another "once" function, then we can skip the early-exit logic.
*	[analysis] Add a tuple lattice (#6062)	Thomas Lively	2023-10-31	1	-0/+117
\| \| \| \| \| \| \|	This lattice combines any number of other lattices into a single lattice whose elements are tuples of elements of the other lattices. This will be one of the most important lattices in the analysis framework because it will be used to combine information about different parts of the program, e.g. locals and the value stack, into a single lattice.
*	[analysis] Implement a vector lattice (#6058)	Thomas Lively	2023-10-31	1	-0/+109
\| \| \| \| \| \| \| \| \| \|	The vector lattice is nearly identical to the array lattice, except that the size of the elements is specified at runtime when the lattice object is created rather than at compile time. The code and tests are largely copy-pasted and fixed up from the array implementation, but there are a couple differences. First, initializing vector elements does not need the template magic used to initialize array elements. Second, the obvious implementations of join and meet do not work for vectors of bools because they might be specialized to be bit vectors, so we need workarounds for that particular case.
*	[analysis] Implement an array lattice (#6057)	Thomas Lively	2023-10-31	1	-0/+109
\| \| \| \| \|	The elements of `Array<L, N>` lattice are arrays of length `N` of elements of `L`, compared pairwise with each other. This lattice is a concrete implementation of what would be written L^N with pen and paper.
*	Support one-line-one-function file format for asyncify lists (#6051)	Alexander Guryanov	2023-10-30	3	-0/+4
\| \| \| \| \| \| \|	If there are newlines in the list, then we split using them in a simple manner (that does not take into account nesting of any other delimiters). Fixes #6047 Fixes #5271
*	Allow rec groups of public function types in closed world (#6053)	Alon Zakai	2023-10-26	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Closed-world mode allows function types to escape if they are on exported functions, because that has been possible since wasm MVP and cannot be avoided. But we need to also allow all types in those type's rec groups as well. Consider this case: (module (rec (type $0 (func)) (type $1 (func)) ) (func "0" (type $0) (nop) ) (func "1" (type $1) (nop) ) ) The two exported functions make the two types public, so this module validates in closed world mode. Now imagine that metadce removes one export: (module (rec (type $0 (func)) (type $1 (func)) ) (func "0" (type $0) (nop) ) ;; The export "1" is gone. ) Before this PR that no longer validates, because it only marks the type $0 as public. But when a type is public that makes its entire rec group public, so $1 is errored on. To fix that, this PR allows all types in a rec group of an exported function's type, which makes that last module validate.
*	Fix Alpine CI by removing old node flag (#6054)	Thomas Lively	2023-10-26	1	-8/+8
\| \| \| \| \|	Apparently the version of node on the Alpine runner was updated and no longer recognizes the --experimental-wasm-threads option. Delete this option out of the test that was using it.
*	[analysis] Implement a Lift lattice (#6040)	Thomas Lively	2023-10-25	1	-0/+77
\| \| \| \|	This lattice "lifts" another lattice by inserting a new bottom element underneath it.
*	Precompute: Check defaultability, not nullability (#6052)	Alon Zakai	2023-10-25	1	-8/+33
\| \| \|	Followup to #6048, we did not handle nondefaultable tuples because of this.
*	[analysis] Implement Flat lattice (#6039)	Thomas Lively	2023-10-25	1	-0/+103
\| \| \| \| \|	Given a type `T`, `Flat<T>` is the lattice where none of the values of `T` are comparable except with themselves, but they are all greater than a common bottom element not in `T` and less than a common top element also not in `T`.
*	[analysis] Add a FullLattice concept and Inverted lattice (#6038)	Thomas Lively	2023-10-25	1	-3/+113
\| \| \| \| \| \|	The FullLattice concept extends the base Lattice with `getTop` and `meet` operations. The `Inverted` lattice uses these operations to reverse the order of an arbitrary full lattice, for example to create a lattice of integers ordered by `>` rather than by `<`.
*	[analysis] Implement an Int lattice (#6037)	Thomas Lively	2023-10-25	1	-0/+36
\| \| \|	Implement a generic lattice template for integral types ordered by `<`.
*	[analysis] Implement a Bool lattice (#6036)	Thomas Lively	2023-10-25	2	-0/+51
\| \| \| \| \|	This is a lattice with two elements: `false` is bottom and `true` is top. Add a new gtest file for testing lattices.
*	[analysis][NFC] Rename `makeLeastUpperBound` to `join` and move it to ↵	Thomas Lively	2023-10-25	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	lattice (#6035) In general, the operation of taking the least upper bound of two lattice elements may depend on some state stored in the lattice object rather than in the elements themselves. To avoid forcing the elements to be larger and more complicated in that case (by storing a parent pointer back to the lattice), move the least upper bound operation to make it a method of the lattice rather than the lattice element. This is also more consistent with where we put e.g. the `compare` method. While we are at it, rename `makeLeastUpperBound` to `join`, which is much shorter and nicer. Usually we avoid using esoteric mathematical jargon like this, but "join" as a normal verb actually describes the operation nicely, so I think it is ok in this case.
*	[analysis] Simplify core analysis code (#6034)	Thomas Lively	2023-10-25	1	-169/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Simplify the monotone analyzer by replacing all the state it used to store in `BlockState` with a simple vector of lattice elements. Use simple indices to refer to both blocks and their associated states in the vector. Remove the ability for transfer functions to control the initial enqueued order of basic blocks since that was a leaky abstraction. Replace the worklist with a UniqueDeferredQueue since that has generally proven to be more efficient in smiilarly contexts, and more importantly, it has a nicer API. Make miscellaneous simplifications to other code as well. Delete a few unit tests that exposed the order in which blocks were analyzed because they printed intermediate results. These tests should be replaced with tests of analyses' public APIs in the future.
*	Typed Continuations: Add cont type (#5998)	Frank Emrich	2023-10-24	3	-0/+92
\| \| \| \| \| \| \| \| \|	This PR is part of a series that adds basic support for the [typed continuations proposal](https://github.com/wasmfx/specfx). This PR adds continuation types, of the form `(cont $foo)` for some function type `$foo`. The only notable changes affecting existing code are the following: - This is the first `HeapType` which has another `HeapType` (rather than, say, a `Type`) as its immediate child. This required fixes to certain traversals that have a flag for being at the toplevel of a type. - Some shared logic for parsing `HeapType`s has been factored out.
*	Fix unreachable code in LocalGraph by making it imprecise there (#6048)	Alon Zakai	2023-10-24	1	-8/+83
\| \| \| \| \| \| \| \| \| \|	Followup to #6046 - the fuzzer found we missed handling the case of the entry itself being unreachable, or of an unreachable loop later. Properly identifying unreachable code requires a flow analysis, unfortunately, so this PR gives up on that and instead allows LocalGraph to be imprecise in unreachable code. That avoids adding any overhead, but does mean the IR may be slightly confusing when debugging. It does not have any optimization downsides, however, as it only affects unreachable code. Also add a dump() impl in that file which helps debugging.
*	Fix handling of exported imported functions (#6044)	Alon Zakai	2023-10-24	2	-0/+28
\| \| \| \| \| \| \| \|	Two trivial places did not handle that case, and assumed an exported function was actually defined (and not imported). Also add some const stuff to fix compilation after this change. This was discovered by #6026
*	[NFC] LocalGraph: Optimize params with no sets (#6046)	Alon Zakai	2023-10-24	1	-6/+39
\| \| \| \| \| \| \| \| \| \| \|	If a local index has no sets, then all gets of that index read from the entry block (a param, or a zero for a local). This is actually a common case, where a param has no other set, and so it is worth optimizing, which this PR does by avoiding any flowing operation at all for that index: we just skip and write the entry block as the source of information for such gets. #6042 on precompute-propagate goes from 3 minutes to 2 seconds with this (!). But that testcase is rather special in that it is a huge function with many, many gets in it, so the overhead we remove is very noticeable there.
*	Partially revert #6026 (#6043)	Alon Zakai	2023-10-23	13	-42/+139
\| \| \|	That optimization uncovered some LLVM and Binaryen bugs.
*	Fix segfault in catch validator (#6032)	Philip Blair	2023-10-23	2	-0/+51
\| \| \| \| \|	The problem was if you construct a try expression which references a nonexistent tag in one of its catch blocks, the validation code successfully identified the null pointer but then proceeded to try to read from it.
*	[analysis][NFC] Move the stack lattice to analysis/lattices (#6030)	Thomas Lively	2023-10-20	1	-1/+1
\| \| \|	Also shorten various names in the implementation to improve readability.
*	RemoveUnusedModuleElements: Make exports skip trampolines (#6026)	Alon Zakai	2023-10-19	13	-105/+253
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we export a function that just calls another function, we can export that one instead. Then the one in the middle may be unused, function foo() { return bar(); } export foo; // can be an export of bar This saves a few bytes in rare cases, but probably more important is that it saves the trampoline, so if this is on a hot path, we save a call. Context: emscripten-core/emscripten#20478 (comment) In general this is not needed as inlining helps us out by inlining foo() into the caller (since foo is tiny, that always ends up happening). But exports are a case the inliner cannot handle, so we do it here.
*	SimplifyGlobals: Fold single-use globals to their use (#6023)	Alon Zakai	2023-10-18	1	-0/+332
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For example, (global $a (struct.new $A)) (global $b (struct.new $B (global.get $A))) => (global $b (struct.new $B (struct.new $A))) and the global $a is now unused. This is valid if $a has no other uses. This saves a little in code size, but should not really help otherwise, as we already look through immutable global.get operations in important optimizations. But the code size may matter if there are many such single- use globals.
*	[Outlining] Filter branches & returns (#6024)	Ashley Nelson	2023-10-18	1	-56/+95
\| \| \|	Adds a function to filter branch and return instructions from being included in potential outlining candidates.
*	Reuse existing function types for blocks (#6022)	Thomas Lively	2023-10-18	12	-66/+161
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Type annotations on multivalue blocks (and loops, ifs, and trys) are type indices that refer to function types in the type section. For these type annotations, the identities of the function types does not matter. As long as the referenced type has the correct parameters and results, it will be valid to use. Previously, when collecting module types, we always used the "default" function type for multivalue control flow, i.e. we used a final function type with no supertypes in a singleton rec group. However, in cases where the program already contains another function type with the expected signature, using the default type is unnecessary and bloats the type section. Update the type collecting code to reuse existing function types for multivalue control flow where possible rather than unconditionally adding the default function type. Similarly, update the binary writer to use the first heap type with the required signature when emitting annotations on multivalue control flow structures. To make this all testable, update the printer to print the type annotations as well, rather than just the result types. Since the parser was not able to parse those newly emitted type annotations, update the parser as well.
*	Fuzzer: Allow non-nullable locals (#6019)	Alon Zakai	2023-10-18	1	-42/+41
\| \| \| \| \| \| \|	Remove the code that avoided such locals. To avoid much new trapping, add logic to set a value to such locals if they have accesses that are not dominated by a set already. Also in makeTrivial only rarely emit a local.get of a non-nullable local (prefer a constant).
*	SimplifyGlobals: Run on function code in missing places (#6020)	Alon Zakai	2023-10-18	3	-4/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The pass was written before we had relevant code in module locations, but now with GC we can have global.gets of more things. The scanner did not run on global code in a way that is not a problem yet, but will be for a later PR I'll open. It will be tested there. That is, right now there is no optimization that is confused by the fact that we did not scan code at the module level, but the next PR will do that. The use modifier did not run on global code either, which was an actual missed optimization opportunity: There are cases where we want to modify a global.get to point to another one, and such a get might be in global code, not just in a function. A test is added for that.
*	[Outlining] Filter Local Set (#6018)	Ashley Nelson	2023-10-18	1	-1/+42
\| \| \| \| \|	Adds a general purpose walker named FilterStringifyWalker, intended to walk control flow and take note of whether any of the expressions satisfy the condition. Also includes an << overload for SuffixTree::RepeatedSubstring to make debugging easier.
*	Remove autogeneration of instrument-memory64.wast (#6021)	Thomas Lively	2023-10-17	1	-2/+2
\| \| \| \| \|	This test file cannot be auto-generated because its output contains a hash of its input, but the comment enabling that auto updater was accidentally left in during development. Remove the comment.
*	[NFC] Some misc comments for clarification (#6017)	Alon Zakai	2023-10-17	1	-2/+4
\|
*	Added type to fix template deduction failure (#6014)	Ashley Nelson	2023-10-17	1	-2/+1
\| \| \|	Fixes #6003
*	Add getGeneralSuperType() that includes basic supers, and use in fuzzer (#6005)	Alon Zakai	2023-10-17	3	-42/+119
\| \| \| \|	With this, the fuzzer can replace e.g. an eq expression with a specific struct type, because now it is away that struct types have eq as their ancestor.