forks/binaryen.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	Refactor interaction between Pass and PassRunner (#5093)	Thomas Lively	2022-09-30	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously only WalkerPasses had access to the `getPassRunner` and `getPassOptions` methods. Move those methods to `Pass` so all passes can use them. As a result, the `PassRunner` passed to `Pass::run` and `Pass::runOnFunction` is no longer necessary, so remove it. Also update `Pass::create` to return a unique_ptr, which is more efficient than having it return a raw pointer only to have the `PassRunner` wrap that raw pointer in a `unique_ptr`. Delete the unused template `PassRunner::getLast()`, which looks like it was intended to enable retrieving previous analyses and has been in the code base since 2015 but is not implemented anywhere.
*	[NFC] Mark modifiesBinaryenIR=false in more places (#4869)	Alon Zakai	2022-08-03	1	-0/+2
\|
*	MergeSimilarFunctions optimization pass (#4414)	Yuta Saito	2022-03-03	1	-5/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Merge similar functions that only differs constant values (like immediate operand of const and call insts) by parameterization. Performing this pass at post-link time can merge more functions across objects. Inspired by Swift compiler's optimization which is derived from LLVM's one: https://github.com/apple/swift/blob/main/lib/LLVMPasses/LLVMMergeFunctions.cpp https://github.com/llvm/llvm-project/blob/main/llvm/docs/MergeFunctions.rst The basic ideas here are constant value parameterization and direct callee parameterization by indirection. Constant value parameterization is like below: ;; Before (func $big-const-42 (result i32) [[many instr 1]] (i32.const 44) [[many instr 2]] ) (func $big-const-43 (result i32) [[many instr 1]] (i32.const 45) [[many instr 2]] ) ;; After (func $byn$mgfn-shared$big-const-42 (result i32) [[many instr 1]] (local.get $0) ;; parameterized!! [[many instr 2]] ) (func $big-const-42 (result i32) (call $byn$mgfn-shared$big-const-42 (i32.const 42) ) ) (func $big-const-43 (result i32) (call $byn$mgfn-shared$big-const-42 (i32.const 43) ) ) Direct callee parameterization is similar to the constant value parameterization, but it parameterizes callee function i by ref.func instead. Therefore it is enabled only when reference-types and typed-function-references features are enabled. I saw 1 ~ 2 % reduction for SwiftWasm binary and Ruby's wasm port using wasi-sdk, and 3 ~ 4.5% reduction for Unity WebGL binary when -Oz.
*	LocalCSE rewrite (#4079)	Alon Zakai	2021-08-17	1	-17/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Technically this is not a new pass, but it is a rewrite almost from scratch. Local Common Subexpression Elimination looks for repeated patterns, stuff like this: x = (a + b) + c y = a + b => temp = a + b x = temp + c y = temp The old pass worked on flat IR, which is inefficient, and was overly complicated because of that. The new pass uses a new algorithm that I think is pretty simple, see the detailed comment at the top. This keeps the pass enabled only in -O4, like before - right after flattening the IR. That is to make this as minimal a change as possible. Followups will enable the pass in the main pipeline, that is, we will finally be able to run it by default. (Note that to make the pass work well after flatten, an extra simplify-locals is added - the old pass used to do part of simplify-locals internally, which was one source of complexity. Even so, some of the -O4 tests have changes, due to minor factors - they are just minor orderings etc., which can be seen by inspecting the outputs before and after using e.g. --metrics) This plus some followup work leads to large wins on wasm GC output. On j2cl there is a common pattern of repeated struct.gets, so common that this pass removes 85% of all struct.gets, which makes the total binary 15% smaller. However, on LLVM-emitted code the benefit is minor, less than 1%.
*	Preserve Function HeapTypes (#3952)	Thomas Lively	2021-06-30	1	-2/+1
\| \| \| \| \| \| \| \| \|	When using nominal types, func.ref of two functions with identical signatures but different HeapTypes will yield different types. To preserve these semantics, Functions need to track their HeapTypes, not just their Signatures. This PR replaces the Signature field in Function with a HeapType field and adds new utility methods to make it almost as simple to update and query the function HeapType as it was to update and query the Function Signature.
*	Refactor hashing (#3023)	Daniel Wirtz	2020-08-12	1	-12/+13
\| \| \| \| \| \| \| \| \|	* Unifies internal hashing helpers to naturally integrate with std::hash * Removes the previous custom implementation * Computed hashes are now always size_t * Introduces a hash_combine helper * Fixes an overwritten partial hash in Relooper.cpp
*	Update Precompute to handle tuples (#2687)	Thomas Lively	2020-03-10	1	-1/+1
\| \| \| \| \| \|	This involves replacing `Literal::makeZero` with `Literal::makeZeroes` and `Literal::makeSingleZero` and updating `isConstantExpression` to handle constant tuples as well. Also makes `Literals` its own struct and adds convenience methods on it.
*	Fix LocalCSE's usable local selection (#2638)	Heejin Ahn	2020-02-05	1	-19/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we have subtypes, we cannot reuse any local that contains the same expression, because that local's type can be a supertype. For example: ``` (local $0 anyref) (local $1 nullref) ... (local.set $0 (ref.null)) (local.set $1 (ref.null)) ;; cannot be replaced with (local.get $0) ``` This extends `usables` map's key to contain both `HashedExpression` and the local's type, so we can get the right usable local in presence of subtypes.
*	Remove implicit conversion operators from Type (#2577)	Thomas Lively	2020-01-08	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	* Remove implicit conversion operators from Type Now types must be explicitly converted to uint32_t with Type::getID or to ValueType with Type::getVT. This fixes #2572 for switches that use Type::getVT. * getVT => getSingle
*	Remove FunctionType (#2510)	Thomas Lively	2019-12-11	1	-9/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Function signatures were previously redundantly stored on Function objects as well as on FunctionType objects. These two signature representations had to always be kept in sync, which was error-prone and needlessly complex. This PR takes advantage of the new ability of Type to represent multiple value types by consolidating function signatures as a pair of Types (params and results) stored on the Function object. Since there are no longer module-global named function types, significant changes had to be made to the printing and emitting of function types, as well as their parsing and manipulation in various passes. The C and JS APIs and their tests also had to be updated to remove named function types.
*	clang-tidy braces changes (#2075)	Alon Zakai	2019-05-01	1	-1/+2
\| \| \|	Applies the changes in #2065, and temprarily disables the hook since it's too slow to run on a change this large. We should re-enable it in a later commit.
*	Apply format changes from #2048 (#2059)	Alon Zakai	2019-04-26	1	-16/+17
\| \| \|	Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
*	Relooper CFG optimizations (#1759)	Alon Zakai	2018-11-21	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously the relooper would do some optimizations when deciding when to use an if vs a switch, how to group blocks, etc. This PR adds an additional pre-optimization phase with some basic but useful simplify-cfg style passes, * Skip empty blocks when they have just one exit. * Merge exiting branches when they are equivalent. * Canonicalize block contents to make such comparisons more useful. * Turn a trivial one-target switch into a simple branch. This can help in noticeable ways when running the rereloop pass, e.g. on LLVM wasm backend output. Also: * Binaryen C API changes to the relooper, which now gets a Module for its constructor. It needs it for the optimizations, as it may construct new nodes. * Many relooper-fuzzer improvements. * Clean up HashType usage.
*	Add missing include guard (#1713)	Michał Janiszewski	2018-10-29	1	-0/+1
\|
*	Stack IR (#1623)	Alon Zakai	2018-07-30	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a new IR, "Stack IR". This represents wasm at a very low level, as a simple stream of instructions, basically the same as wasm's binary format. This is unlike Binaryen IR which is structured and in a tree format. This gives some small wins on binary sizes, less than 1% in most cases, usually 0.25-0.50% or so. That's not much by itself, but looking forward this prepares us for multi-value, which we really need an IR like this to be able to optimize well. Also, it's possible there is more we can do already - currently there are just a few stack IR optimizations implemented, DCE local2stack - check if a set_local/get_local pair can be removed, which keeps the set's value on the stack, which if the stars align it can be popped instead of the get. Block removal - remove any blocks with no branches, as they are valid in wasm binary format. Implementation-wise, the IR is defined in wasm-stack.h. A new StackInst is defined, representing a single instruction. Most are simple reflections of Binaryen IR (an add, a load, etc.), and just pointers to them. Control flow constructs are expanded into multiple instructions, like a block turns into a block begin and end, and we may also emit extra unreachables to handle the fact Binaryen IR has unreachable blocks/ifs/loops but wasm does not. Overall, all the Binaryen IR differences with wasm vanish on the way to stack IR. Where this IR lives: Each Function now has a unique_ptr to stack IR, that is, a function may have stack IR alongside the main IR. If the stack IR is present, we write it out during binary writing; if not, we do the same binaryen IR => wasm binary process as before (this PR should not affect speed there). This design lets us use normal Passes on stack IR, in particular this PR defines 3 passes: Generate stack IR Optimize stack IR (might be worth splitting out into separate passes eventually) Print stack IR for debugging purposes Having these as normal passes is convenient as then they can run in parallel across functions and all the other conveniences of our current Pass system. However, a downside of keeping the second IR as an option on Functions, and using normal Passes to operate on it, means that we may get out of sync: if you generate stack IR, then modify binaryen IR, then the stack IR may no longer be valid (for example, maybe you removed locals or modified instructions in place etc.). To avoid that, Passes now define if they modify Binaryen IR or not; if they do, we throw away the stack IR. Miscellaneous notes: Just writing Stack IR, then writing to binary - no optimizations - is 20% slower than going directly to binary, which is one reason why we still support direct writing. This does lead to some "fun" C++ template code to make that convenient: there is a single StackWriter class, templated over the "mode", which is either Binaryen2Binary (direct writing), Binaryen2Stack, or Stack2Binary. This avoids a lot of boilerplate as the 3 modes share a lot of code in overlapping ways. Stack IR does not support source maps / debug info. We just don't use that IR if debug info is present. A tiny text format comment (if emitting non-minified text) indicates stack IR is present, if it is ((; has Stack IR ;)). This may help with debugging, just in case people forget. There is also a pass to print out the stack IR for debug purposes, as mentioned above. The sieve binaryen.js test was actually not validating all along - these new opts broke it in a more noticeable manner. Fixed. Added extra checks in pass-debug mode, to verify that if stack IR should have been thrown out, it was. This should help avoid any confusion with the IR being invalid. Added a comment about the possible future of stack IR as the main IR, depending on optimization results, following some discussion earlier today.
*	duplicate-function-elimination improvements (#1590)	Alon Zakai	2018-06-07	1	-0/+47
\| \| \| \| \| \| \|	On a codebase with 370K functions, 160K were in fact duplicate (!)... and it took many many passes to figure that out, over 2 minutes in fact (!), as A and B may be identical only after we see that the functions C1, C2 that they call are identical (so there can be long "chains" here). To avoid this, limit how many passes we do. In -O1, just do one pass - that gets most duplicates. In -O2, do 10 passes - that gets almost all of it on this codebase. And in -O3 (or -Os/-Oz) do as many passes as necessary (i.e., the old behavior). This at least lets iteration builds (-O1) be nice and fast. This PR also refactors the hashing code used in that pass, moving it to nicer header files for clearer readability. Also some other minor cleanups in hashing code that helped debug this.
*	notation change: AST => IR (#1245)	Alon Zakai	2017-10-24	1	-0/+59
	The IR is indeed a tree, but not an "abstract syntax tree" since there is no language for which it is the syntax (except in the most trivial and meaningless sense).