Commit messages
Similar to #5885, this was uncovered by #5881 and #5882. Here we need to refinalize
when we replace a local.get with a null, since the null's type is more refined.
This has been a bug for a while, but it became noticeable after #5881 and #5882,
which do more work during refinalization.
We previously improved the nullability and heap type of the ref.cast target type
in RefCast::finalize(), based on what we knew about its input type. Simplify the
code and make this improvement more powerful by using the greatest lower bound
of the original cast target type and the input type.
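To illustrate the idea, here is a toy sketch (not Binaryen's actual code) of a
greatest lower bound over reference types modeled as a heap type from a small
chain plus a nullability flag:

  #include <cassert>

  // Toy model: a chain of heap types (any :> eq :> struct), where a larger
  // enum value means a more refined type, plus a nullability flag. In a
  // chain the greatest lower bound always exists and is easy to compute;
  // Binaryen's real type lattice is much richer than this.
  enum HeapKind { Any = 0, Eq = 1, Struct = 2 };

  struct RefType {
    HeapKind heap;
    bool nullable;
  };

  // GLB: the more refined heap type, and nullable only if both sides are.
  RefType glb(RefType a, RefType b) {
    return RefType{a.heap > b.heap ? a.heap : b.heap,
                   a.nullable && b.nullable};
  }

  int main() {
    // A cast to (ref null struct) whose input is known to be (ref eq)
    // can be improved to a cast to (ref struct): non-nullable and refined.
    RefType target{Struct, true};
    RefType input{Eq, false};
    RefType improved = glb(target, input);
    assert(improved.heap == Struct && !improved.nullable);
  }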
The WasmGC spec will require that the target cast type of br_on_cast and
br_on_cast_fail be a subtype of the input type, but so far Binaryen has not
enforced this constraint, so it could produce invalid modules when optimizations
refined the input to a br_on_cast* such that it was no longer a supertype of the
cast target type.

Fix this problem by setting the cast target type to be the greatest lower bound
of the original cast target type and the current input type in
`BrOn::finalize()`. This maintains the invariant that the cast target type
should be a subtype of the input type, and it does not change cast behavior:
any value that could make the original cast succeed at runtime necessarily
inhabits both the original cast target type and the input type, so it also must
inhabit their greatest lower bound and will make the updated cast succeed as
well.
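That argument can be checked exhaustively over a toy chain of heap types
(illustrative only; nullability is omitted for brevity):

  #include <cassert>
  #include <initializer_list>

  // Toy chain of heap types (any :> eq :> struct); a deeper value is a
  // more refined type. In a chain, GLB is simply the deeper of the two.
  enum HeapKind { Any = 0, Eq = 1, Struct = 2 };

  bool subtypeOf(HeapKind a, HeapKind b) { return a >= b; }

  HeapKind glb(HeapKind a, HeapKind b) { return a > b ? a : b; }

  int main() {
    for (HeapKind target : {Any, Eq, Struct}) {
      for (HeapKind input : {Any, Eq, Struct}) {
        HeapKind updated = glb(target, input);
        // The updated target is a subtype of the input, as the spec requires.
        assert(subtypeOf(updated, input));
        // A runtime value inhabiting both the input and the original target
        // also inhabits their GLB, so the cast's success set is unchanged.
        for (HeapKind value : {Any, Eq, Struct}) {
          bool before = subtypeOf(value, target) && subtypeOf(value, input);
          bool after = subtypeOf(value, updated) && subtypeOf(value, input);
          assert(before == after);
        }
      }
    }
  }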
Simplify the optimization of ref.cast and ref.test in OptimizeInstructions by
moving the loop that examines fallthrough values one at a time out to a shared
function in properties.h. Also simplify ref.cast optimization by analyzing the
cast result in just one place.
In addition to simplifying the code, also make the cast optimizations more
powerful by analyzing the nullability and heap type of the cast value
independently, resulting in a potentially more precise analysis of the cast
behavior. Also improve optimization power by considering fallthrough values when
optimizing the SuccessOnlyIfNonNull case.
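As a rough sketch of what "examining fallthrough values one at a time" means
(names and structure here are illustrative, not the actual properties.h API),
a fallthrough walk peels wrappers that pass a value through unchanged,
yielding each intermediate value for analysis:

  #include <functional>
  #include <vector>

  // Toy expression tree standing in for Binaryen IR (illustrative only).
  struct Expr {
    enum Kind { Const, LocalTee, BlockEnd } kind;
    Expr* child = nullptr; // the value that falls through, if any
  };

  // Visit an expression and then every value that falls through to it:
  // a local.tee falls through to its operand, a block to its final value,
  // and so on. Each step may refine what we know about the cast input.
  void forEachFallthrough(Expr* curr, const std::function<void(Expr*)>& note) {
    while (curr) {
      note(curr);
      curr = (curr->kind == Expr::Const) ? nullptr : curr->child;
    }
  }

  int main() {
    Expr c{Expr::Const};
    Expr tee{Expr::LocalTee, &c};
    Expr block{Expr::BlockEnd, &tee};
    std::vector<Expr*> seen;
    forEachFallthrough(&block, [&](Expr* e) { seen.push_back(e); });
    return seen.size() == 3 ? 0 : 1; // saw block, tee, const
  }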
We shouldn't need to refinalize in the general case, but the fuzzer found a
corner case where we do need to; see the explanation and testcase. Basically,
Heap2Local replaces struct fields with locals, and the locals should have the
same types as the fields, but if a field was somehow less refined for some
reason, then the locals could actually be more refined. (And a field could be
less refined if we read it from a value whose type was under-refined due to a
tee or such.)
Allow them to be used for more than just the new text parser.
Depending on whether the preprocessor macros USE_STANDARD_GC_ENCODINGS and
USE_LEGACY_GC_ENCODINGS are defined, use the standard or legacy encodings for GC
types and instructions. Default to the latter for now. When WasmGC reaches phase
4, we will switch the default to the standard encodings instead. After users
have had time to update, we will remove the legacy encodings entirely.

Tested manually by building with USE_STANDARD_GC_ENCODINGS locally and seeing
that all the tests pass except for some that depend on the specific binary
encoding.
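A minimal sketch of how such a compile-time switch might be wired up
(illustrative only; the opcode values below are hypothetical, and the real
macro handling lives in Binaryen's binary-format code):

  #include <cstdio>

  // Illustrative: choose one opcode table at compile time, defaulting to
  // the legacy encodings when the standard ones are not requested.
  #if defined(USE_STANDARD_GC_ENCODINGS) && defined(USE_LEGACY_GC_ENCODINGS)
  #error "Define at most one of USE_STANDARD_GC_ENCODINGS / USE_LEGACY_GC_ENCODINGS"
  #endif

  #ifdef USE_STANDARD_GC_ENCODINGS
  constexpr unsigned RefCastOpcode = 0x16; // hypothetical standard value
  #else
  constexpr unsigned RefCastOpcode = 0x41; // hypothetical legacy value
  #endif

  int main() { std::printf("ref.cast opcode: 0x%x\n", RefCastOpcode); }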
Remove old, experimental instructions and type encodings that will not be
shipped as part of WasmGC. Updating the encodings and text format to match the
final spec is left as future work.
Br and BrOn can consider the code before and after them connected if the code
after might be reached (which is the case if the Br has a condition, which a
BrOn in effect always has).

The wasm2js changes may look a little odd, as some of them have this:

  i64toi32_i32$1 = i64toi32_i32$2;
  i64toi32_i32$1 = i64toi32_i32$2;

I looked into that, and the reason is that those outputs are not optimized;
even in unoptimized wasm2js we run simplify-locals once (to try to reduce the
downsides of flatten). As a result, this PR makes a difference there, and that
difference can lead to such odd duplicated code after other operations.
However, there are no changes to optimized wasm2js outputs, so there is no
actual problem.

Followup to #5860.
Followup to #5860, this does the same for (part of) OptimizeCasts.

As there, this is valid because it's ok if we branch away. This part of the
pass picks a different local to get when it knows two locals hold the same
value but one has a more refined type. It is ok to add a tee earlier even if it
isn't used later.
Followup to #5860, this does the same for LocalCSE.
As there, this is valid because it's ok if we branch away. This pass adds a local.tee of
a reused value and then gets it later, and it's ok to add a tee even if we branch away
and do not use it.
Followup to #5860, this does the same for SimplifyGlobals as for SimplifyLocals.
As there, this is valid because it's ok if we branch away. This part of the pass
applies a global value to a global.get based on a dominating global.set, so any
dominance is good enough for us.
SimplifyLocals (#5860)

This addresses most of the minor regression from the correctness fix in #5857.
That PR makes us consider calls as branching, but in some cases it is ok to
ignore that branching (see the comment in the code here), which this PR allows
as an option.

This undoes one test change from that PR, showing that it undoes the regression
for SimplifyLocals. More tests are added to cover this specifically as well.
Calls were simply not handled there, so we could think we were still in the same
basic block when we were not, affecting various passes (though somehow this went
unnoticed until the TNHOracle (#5850) ran on some particular Java code).

One existing test was affected, and two new tests are added: one for TNHOracle,
where I detected this, and one in OptimizeCasts, which is perhaps a simpler way
to see the problem.

All the cases but the TNH one, however, do not need this fix for correctness,
since they don't actually care whether a call would throw. As a TODO, we should
find a way to undo this minor regression. The regression only affects builds
with EH enabled, though, so most users should be unaffected even in the interim.
This change introduces StackLattice, a lattice to model stack-related
behavior. It is templated on a separate lattice whose elements model
some property of individual values on the stack. The StackLattice
allows users to access the top of the stack, push abstract values, and
pop them. Comparisons and least upper bound operations are done on a
value-by-value basis, starting from the top of the stack and moving
toward the bottom, because that allows stacks from different scopes to
be joined easily.

An application of StackLattice is to model the wasm value stack. The goal
is to organize lattice elements representing individual stack values in a
natural way which mirrors the wasm value stack. Transfer functions operate
on each stack value individually. The stack lattice is an intermediate
structure which is not intended to be directly operated on; rather, it
simulates the push and pop behavior of instructions.
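A toy sketch of the shape of such a structure (illustrative only, not
Binaryen's StackLattice; the element lattice here is a simple "known constant
or unknown" domain over ints, with -1 standing for unknown):

  #include <algorithm>
  #include <cassert>
  #include <vector>

  struct ConstLattice {
    // -1 encodes "unknown"; otherwise the element is a known constant.
    static int join(int a, int b) { return a == b ? a : -1; }
  };

  template <typename L>
  struct StackLattice {
    std::vector<int> stack; // bottom is index 0, top is back()

    void push(int v) { stack.push_back(v); }
    int pop() {
      int v = stack.back();
      stack.pop_back();
      return v;
    }

    // Join value by value from the top of the stack downward, so stacks
    // of different depths (from different scopes) can still be combined.
    void join(const StackLattice& other) {
      size_t n = std::min(stack.size(), other.stack.size());
      for (size_t i = 0; i < n; ++i) {
        int& mine = stack[stack.size() - 1 - i];
        mine = L::join(mine, other.stack[other.stack.size() - 1 - i]);
      }
    }
  };

  int main() {
    StackLattice<ConstLattice> a, b;
    a.push(1); a.push(2);
    b.push(2);
    a.join(b); // tops join: 2 and 2 agree; the deeper value 1 is untouched
    assert(a.pop() == 2 && a.pop() == 1);
  }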
Adds an integration test that identifies the substrings of a stringified wasm module using the suffix_tree.
None of that code is speed-sensitive, or at least it does not need to be
inlined to be fast. Move it to a .cpp file for faster compile times.

This caused a cascade of necessary header fixes (i.e., after removing unneeded
header inclusions from module-utils.h, files that improperly depended on them
stopped compiling and needed an added include).
In order of appearance in this diff:
* Use heapType rather than otherHeapType when either will do, for brevity.
* We checked isNull after checking for isLiteral (line 173), but every null is a literal.
* Calling setNoneOrNull simplifies some repetitive code.
* We checked isLiteral yet again lower down.
TypeMapper is a utility used to globally rewrite types, mapping some eliminated
source types into destination types they should be replaced with. This was
previously done by first rewriting all the types in the IR according to the
given mapping, then rewriting the type definitions and updating all the types in
the IR again. Not only was doing the rewriting twice inefficient, it also
introduced a subtle bug where the set of private types eligible to be rewritten
could be inconsistent because updating types in the IR could change the types of
control flow structures. The fuzzer found a case where this inconsistency caused
the type rebuilding to fail.
Fix the bug by first building the new types with the mapping applied and only
then rewriting the IR a single time.
Also add a `TypeBuilder::dump` utility for use in debugging.
Fixes #5845.
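A schematic of the fixed ordering (an illustrative stand-in, not the real
TypeMapper code; `Type`, `TypeMap`, and the helper below are made up for this
sketch): build the new types with the mapping already applied, then rewrite
the IR once.

  #include <unordered_map>
  #include <vector>

  struct Type { int id; };
  using TypeMap = std::unordered_map<int, int>; // eliminated src -> dest

  // Step 1: construct the new type list with the mapping applied up front.
  std::vector<Type> buildTypesWithMapping(const std::vector<Type>& olds,
                                          const TypeMap& map) {
    std::vector<Type> news;
    news.reserve(olds.size());
    for (auto t : olds) {
      auto it = map.find(t.id);
      news.push_back(it != map.end() ? Type{it->second} : t);
    }
    return news;
  }

  int main() {
    std::vector<Type> olds{{0}, {1}, {2}};
    TypeMap map{{1, 2}}; // type 1 is eliminated in favor of type 2
    auto news = buildTypesWithMapping(olds, map);
    // Step 2 (not shown): a single IR rewrite swaps old types for new ones,
    // so the set of types seen while rebuilding can never get out of sync.
    return news[1].id == 2 ? 0 : 1;
  }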
This adds a TrapsNeverHappen oracle that is used inside the main PossibleContents
oracle of GUFA. The idea is that when traps never happen we can reason
"backwards" from information to things that must be true before it:

  temp = x.field;
  x.cast_to<Y>(); // Y is a subtype of x's type X

Here we cast x to a subtype. If we assume traps never happen, then the cast must
succeed, and that means we can assume we had a Y on the previous line, where
perhaps that information lets us infer the value of x.field.

This PR focuses on calls, which are the more interesting situation to optimize,
because other passes already do some work inside functions. Specifically, we
look for things that would trap in the called function or the caller; for
example, if the called function always casts a param to some type, we can assume
the caller passes such a type in. And if we have a call_ref, then any target
that would trap cannot be called (at least in a closed world).

This has some benefits, in particular when combined with --gufa-cast-all, since
that casts more things, which lets us apply the inferences made here. I see 3.3%
fewer call_ref instructions on a Kotlin testcase, for example. This helps more
at -Os, when we inline less.
Stop printing `ref.as_i31`, `br_on_func`, etc. because they have been removed
from the spec and are no longer supported by V8. #5614 already made this change
for the binary format. Like that PR, leave reading unmodified in case someone is
still using these instructions (even though they are useless). They will be
fully removed in a future PR as we finalize things ahead of standardizing
WasmGC.
Changes the static asserts that check a lattice type to require a non-static
compare function instead of a static one. Also changes the compare functions of
existing lattices to be non-static.
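An illustrative sketch of what such a check can look like (names and the
comparison enum are stand-ins, not the real framework's interface): a
member-function pointer only matches a non-static member, so the assert
enforces the new requirement.

  #include <type_traits>

  enum class LatticeComparison { NO_RELATION, EQUAL, LESS, GREATER };

  struct IntLattice {
    using Element = int;
    // Non-static: a lattice instance may need state (e.g. a size bound).
    LatticeComparison compare(const Element& a, const Element& b) const {
      if (a == b) return LatticeComparison::EQUAL;
      return a < b ? LatticeComparison::LESS : LatticeComparison::GREATER;
    }
  };

  // &IntLattice::compare only exists as a pointer-to-member if compare is
  // a non-static member function, and invoking it needs a lattice object.
  static_assert(
    std::is_invocable_r_v<LatticeComparison,
                          decltype(&IntLattice::compare),
                          const IntLattice&, const int&, const int&>,
    "lattices must provide a non-static compare()");

  int main() { return 0; }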
GUFA refines existing casts, but does not add new casts, for fear of increasing
code size and adding more cast operations at runtime. This PR adds a version
that does add all those casts, and it looks like code size at least improves
rather than regresses, at least on J2Wasm and Kotlin. That is, this pass adds a
lot more casts, but subsequent optimizations benefit enough to shrink overall
code size.

However, this may still not be worthwhile, as even if code size decreases we may
end up doing more casts at runtime, and those casts might be hard to remove,
e.g.:

  (call $foo
    (x)  ;; inferred to be non-null
  )
  (func $foo (param (ref null $A)) ...

    =>

  (call $foo
    (ref.cast $A (x))  ;; add a cast here
  )
  (func $foo (param (ref $A)) ...  ;; a later pass refines the param here

That new cast cannot be removed after we refine the function parameter. If the
function never benefits from the fact that the input is non-null, then the cast
is wasted work (e.g. if the function only compares the input to another value).

To use this new pass, try --gufa-cast-all rather than --gufa. As with normal
GUFA, running the full optimizer afterwards is important, and it is even more
important here in order to get rid of as many of the new casts as possible.
Followup to #5840
A pass that just operates on locals, for example, does not care about branches
out of the function. That means that when we see a call, then even if EH is
enabled we don't need to create a new basic block right after it (unless the
call is inside a try-catch; then it might branch to the catch, of course).

This makes CFG-using passes 9% faster compared to before this PR. This plus
#5827 offsets the slowdown from #5823 and overall gives an improvement compared
to before.
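A toy sketch of the decision being described (illustrative, not CFGWalker's
actual code): after a call, start a new basic block only when the call could
divert control flow somewhere the current pass cares about.

  #include <cstdio>

  struct Context {
    bool ehEnabled;       // the module uses exception handling
    bool insideTryCatch;  // the call is inside a try with a catch
    bool passIsIntraproc; // pass only cares about in-function control flow
  };

  bool callEndsBasicBlock(const Context& c) {
    if (!c.ehEnabled) {
      return false; // calls cannot throw: no control-flow edge to model
    }
    if (c.passIsIntraproc && !c.insideTryCatch) {
      // A throw leaves the function entirely; an intraprocedural pass can
      // treat that like any other exit and keep the block going.
      return false;
    }
    return true; // the call may branch to a catch: end the block here
  }

  int main() {
    std::printf("%d\n", callEndsBasicBlock({true, false, true})); // 0
    std::printf("%d\n", callEndsBasicBlock({true, true, true}));  // 1
  }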
This change adds a fuzzer which checks the following properties of abstract
interpretation static analyses:

- Transfer function monotonicity
- Lattice element reflexivity
- Lattice element transitivity
- Lattice element anti-symmetry

This is done by randomly generating a module and using its functions as
transfer function inputs, along with randomly generated lattice elements
(states). The lattice element properties are also fuzzed from the randomly
generated states.
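A self-contained sketch of these property checks over a tiny lattice (the set
{0..7} ordered by bitwise subset, with a simple monotone transfer function;
illustrative only, not the fuzzer's real harness):

  #include <cassert>
  #include <random>

  // Partial order: a <= b iff a's bits are a subset of b's bits.
  bool leq(unsigned a, unsigned b) { return (a & b) == a; }

  // A monotone transfer function: sets bit 1 and keeps existing bits.
  unsigned transfer(unsigned x) { return x | 1u; }

  int main() {
    std::mt19937 rng(42);
    std::uniform_int_distribution<unsigned> dist(0, 7);
    for (int i = 0; i < 1000; ++i) {
      unsigned a = dist(rng), b = dist(rng), c = dist(rng);
      assert(leq(a, a));                                    // reflexivity
      if (leq(a, b) && leq(b, c)) assert(leq(a, c));        // transitivity
      if (leq(a, b) && leq(b, a)) assert(a == b);           // anti-symmetry
      if (leq(a, b)) assert(leq(transfer(a), transfer(b))); // monotonicity
    }
  }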
See the example in the code and test for a situation that requires this for
validation. To fix validation we add a cast. That cast should practically
always be removed by later optimizations, and the fact that it took the fuzzer
this long to even find such a situation adds confidence that this won't be
adding overhead (and in this particular situation, the optimizer will
definitely remove the cast).
Before this PR, if a call had no path to a catch in the same function, then we
skipped creating a new basic block right after it. As a result, we could have a
call in the middle of a basic block. If EH is enabled, that means we might
transfer control flow out of the function from the middle of a block. But it is
better to have the property that any transfer of control flow - to another
basic block, or out of the function - can only happen at the end of a basic
block.

This causes some overhead, but a subsequent PR (#5838) will remove that as a
followup. This PR also adds a little code to pass the module and check whether
EH is enabled, avoiding the overhead if not, which at least avoids regressing
the non-EH case until that followup lands.
This change applies to the Reaching Definitions Analysis.
Start functions can have locals, which we previously ignored, as we just
concatenated the bodies together. This change makes us copy the second start
function and call it, keeping the two separate (the optimizer can then inline,
if that makes sense).

Fixes #5835
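A hypothetical sketch of the merging strategy (the types and helper names
below are made up for illustration; wasm-merge uses Binaryen's own utilities):
keep the second start intact as its own function, so its locals travel with
its body, and call it from the first start.

  #include <string>
  #include <vector>

  struct Function {
    std::string name;
    std::vector<std::string> locals; // locals must travel with the body
    std::vector<std::string> body;   // opcodes, schematically
  };

  void mergeStarts(Function& first, const Function& second,
                   std::vector<Function>& functions) {
    // Copy the second start as a fresh function rather than splicing its
    // body (and silently dropping its locals) into the first.
    Function copy = second;
    copy.name = second.name + "$merged"; // hypothetical fresh name
    functions.push_back(copy);
    // Then call it at the end of the first start.
    first.body.push_back("call " + copy.name);
  }

  int main() {
    std::vector<Function> funcs;
    Function a{"start_a", {}, {"nop"}};
    Function b{"start_b", {"i32 tmp"}, {"nop"}};
    mergeStarts(a, b, funcs);
    return a.body.size() == 2 && funcs.size() == 1 ? 0 : 1;
  }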
The LLVM suffix tree expects to be provided with a vector of 32-bit unsigned
integers, so this PR makes it easier to integrate our wasm program string with
the suffix tree. Because the range of possible values is reduced from 2^64 to
2^32, a signed integer was added to manage the next separator value and ensure
we're using every possible negative number.
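A toy sketch of that scheme (illustrative only, not the PR's actual code):
64-bit symbols are interned into small 32-bit ids counting up from zero, while
a signed counter hands out a distinct negative value as the separator between
functions, so no repeated substring can span a function boundary.

  #include <cstdint>
  #include <unordered_map>
  #include <vector>

  struct SymbolMapper {
    std::unordered_map<uint64_t, uint32_t> interned;
    int32_t nextSeparator = -1; // counts down through every negative value

    uint32_t intern(uint64_t symbol) {
      // Ids count up from 0, staying clear of the separators in practice.
      auto it = interned.try_emplace(symbol, uint32_t(interned.size())).first;
      return it->second;
    }

    uint32_t separator() {
      // Each separator is unique; reinterpreted into the unsigned alphabet.
      return uint32_t(nextSeparator--);
    }
  };

  int main() {
    SymbolMapper m;
    std::vector<uint32_t> str;
    str.push_back(m.intern(0xDEADBEEFCAFEULL));
    str.push_back(m.separator());               // 0xFFFFFFFF, i.e. -1
    str.push_back(m.intern(0xDEADBEEFCAFEULL)); // same symbol, same id
    return str[0] == str[2] ? 0 : 1;
  }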
This is necessary for WasmGC producers using the C API, so that they can set
the heap type of functions. Otherwise, the heap type is set structurally, using
params and results, as in the old API.

The old API is kept for backwards compatibility and convenience (for the
structural case, which covers basically all code before WasmGC).

Fixes #5826
ControlFlowWalker builds a stack of control flow items and allows finding the
target of a branch using it. But the only use CFGWalker made of that was to map
a block or a loop to its branches. We can just use the name of the block or
loop as the key in that map, and avoid the extra overhead.
This change implements a reaching definitions analysis which is intended to be
equivalent to the information provided by LocalGraph, specifically the Flower
class of LocalGraph.

It also introduces a CRTP utility in visitor-transfer-function.h which
implements the most commonly found visitor-style transfer function
functionality. The MonotoneCFGAnalyzer is also modified to add a phase that
collects results from the final CFG states after the analysis is solved.
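A toy sketch of the CRTP pattern involved (names are illustrative, not the
real visitor-transfer-function.h interface): the base class drives iteration
over a block's instructions and defers per-instruction state updates to the
derived class, with no virtual dispatch.

  #include <cstdio>

  template <typename Derived, typename State>
  struct VisitorTransferFunc {
    // Apply the derived visitor to each instruction in a (schematic) block.
    void transfer(const int* insts, int count, State& state) {
      for (int i = 0; i < count; ++i) {
        static_cast<Derived*>(this)->visitInst(insts[i], state);
      }
    }
  };

  // A trivial "analysis": count the even opcodes in a block, standing in
  // for a real per-instruction state update such as killing definitions.
  struct EvenCounter : VisitorTransferFunc<EvenCounter, int> {
    void visitInst(int inst, int& state) {
      if (inst % 2 == 0) {
        state++;
      }
    }
  };

  int main() {
    int block[] = {1, 2, 3, 4};
    int state = 0;
    EvenCounter counter;
    counter.transfer(block, 4, state);
    std::printf("%d\n", state); // 2
  }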
Fixes emscripten-core/emscripten#17485

This allows emscripten to compile code with MEMORY64 + PTHREADS by using the
proper pointer type in the MemoryPacking pass.
This PR adds LLVM's suffix tree data structure to Binaryen. The suffix tree is
implemented using Ukkonen's algorithm for linear-time suffix tree construction
and is intended for fast substring queries.

Note: all of the .h and .cpp files included are from LLVM. These files were
copied directly instead of imported into our existing LLVM integration (in
third_party/) to avoid bumping the commit hash and to avoid the potential for
complications with upstream changes.
When a module item is imported and directly re-exported by an intermediate
module, we need to perform several name lookups and use its name in the initial
module, rather than the intermediate name, when fusing imports and exports.
This PR is part of the effort to bring outlining to Binaryen.

HashStringifyWalker is a subclass of StringifyWalker (#5772), used to encode a
wasm module as a "string". Our "string" is a vector in which each symbol is a
uint64_t, providing us with a capacity of 2^64 symbols.
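A toy sketch of the encoding idea (illustrative only; the real walker hashes
Binaryen IR, not text): structurally equal fragments receive the same uint64_t
symbol, so repeated code becomes repeated substrings in the resulting "string".

  #include <cstdint>
  #include <string>
  #include <unordered_map>
  #include <vector>

  struct Stringifier {
    std::unordered_map<std::string, uint64_t> symbols;
    std::vector<uint64_t> out;

    void note(const std::string& fragment) {
      // Equal fragments intern to the same symbol.
      auto it = symbols.try_emplace(fragment, uint64_t(symbols.size())).first;
      out.push_back(it->second);
    }
  };

  int main() {
    Stringifier s;
    s.note("(i32.add (local.get 0) (local.get 1))");
    s.note("(i32.const 7)");
    s.note("(i32.add (local.get 0) (local.get 1))"); // repeats symbol 0
    return s.out[0] == s.out[2] ? 0 : 1;
  }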
Updates comments. Moves wasm-traversal.h to transfer function classes.