| Commit message | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a new signature-pruning pass that prunes parameters from
signature types where those parameters are never used in any function
that has that type. This is similar to DeadArgumentElimination but works
on a set of functions, and it can handle indirect calls.
Also move a little code from SignatureRefining into a shared place to
avoid duplication of logic to update signature types.
This pattern happens in j2wasm code, for example if all method functions
for some virtual method just return a constant and do not use the this
pointer.
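A sketch of the kind of case this handles (hypothetical type and function
names; the exact output shape may differ):
;; Before: no function with type $method ever reads its parameter
(type $method (func (param (ref $object)) (result i32)))
(func $kind-A (type $method) (param $this (ref $object)) (result i32)
  (i32.const 1)
)
(func $kind-B (type $method) (param $this (ref $object)) (result i32)
  (i32.const 2)
)
;; After pruning: the parameter is removed from the type, from every function
;; sharing it, and from every call site, including indirect calls
(type $method (func (result i32)))
(func $kind-A (type $method) (result i32)
  (i32.const 1)
)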
|
|
|
| |
See https://github.com/WebAssembly/extended-const
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Merge similar functions that differ only in constant values (like the immediate
operands of const and call instructions) by parameterization.
Performing this pass at post-link time can merge more functions across
objects. Inspired by Swift compiler's optimization which is derived from
LLVM's one:
https://github.com/apple/swift/blob/main/lib/LLVMPasses/LLVMMergeFunctions.cpp
https://github.com/llvm/llvm-project/blob/main/llvm/docs/MergeFunctions.rst
The basic ideas here are constant value parameterization and direct callee
parameterization by indirection.
Constant value parameterization is like below:
;; Before
(func $big-const-42 (result i32)
[[many instr 1]]
(i32.const 42)
[[many instr 2]]
)
(func $big-const-43 (result i32)
[[many instr 1]]
(i32.const 43)
[[many instr 2]]
)
;; After
(func $byn$mgfn-shared$big-const-42 (param $0 i32) (result i32)
[[many instr 1]]
(local.get $0) ;; parameterized!!
[[many instr 2]]
)
(func $big-const-42 (result i32)
(call $byn$mgfn-shared$big-const-42
(i32.const 42)
)
)
(func $big-const-43 (result i32)
(call $byn$mgfn-shared$big-const-42
(i32.const 43)
)
)
Direct callee parameterization is similar to constant value parameterization,
but it parameterizes the callee function by ref.func instead. Therefore it is enabled
only when the reference-types and typed-function-references features are enabled.
I saw a 1-2% size reduction for the SwiftWasm binary and Ruby's wasm port built
with wasi-sdk, and a 3-4.5% reduction for a Unity WebGL binary at -Oz.
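A sketch of direct callee parameterization (hypothetical names; call_ref syntax
varies across versions):
;; Before
(func $call-A (call $A))
(func $call-B (call $B))
;; After: the callee becomes a typed function reference parameter
(type $fn (func))
(func $byn$mgfn-shared$call-A (param $callee (ref $fn))
  (call_ref $fn (local.get $callee))
)
(func $call-A (call $byn$mgfn-shared$call-A (ref.func $A)))
(func $call-B (call $byn$mgfn-shared$call-A (ref.func $B)))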
|
|
|
|
|
| |
Introduce static consts with PassOptions Defaults.
Add assertion to verify that the default options are the Os options.
Also update the text in relevant tests.
|
|
|
|
|
|
|
| |
Add an option for running the asyncify transformation on the primary module
emitted by wasm-split. The idea is that the placeholder functions should be able
to unwind the stack while the secondary module is asynchronously loaded, then
once the placeholder functions have been patched out by the secondary module the
stack should be rewound and end up in the correct secondary function.
|
|
|
|
|
|
|
|
|
|
| |
Add support for isorecursive types to wasm-fuzz-types by generating recursion
groups and ensuring that child types are only selected from candidates up
through the end of the current group. For non-isorecursive systems, treat all
the types as belonging to a single group so that their behavior is unchanged.
Also fix two small bugs found by the fuzzer: LUB calculation was taking the
wrong path for isorecursive types and isorecursive validation was not handling
basic heap types properly.
|
| |
|
| |
|
| |
|
|
|
|
| |
After emscripten-core/emscripten#15905 lands Emscripten will no longer use it,
and nothing else needs it AFAIK.
|
|
|
|
|
| |
Eventually this will enable the isorecursive hybrid type system described in
https://github.com/WebAssembly/gc/pull/243, but for now it just throws a fatal
error if used.
|
|
|
|
|
|
| |
This is useful for the case where we might want to finalize
without extracting metadata.
See: https://github.com/emscripten-core/emscripten/pull/15918
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By default wasm-ctor-eval removes exports that it manages to completely
eval (if it just partially evals then the export remains, but points to a function
with partially-evalled contents). However, in some cases we do want to keep
the export around even so, for example during fuzzing (as the fuzzer wants
to call the same exports before and after wasm-ctor-eval runs), if there is
an ABI we need to preserve (like if we manage to eval all of main()), or if
the function returns a value (which we don't support yet, but this PR
prepares for that).
Specifically, there is now a new option:
--kept-exports foo,bar
That is a list of exports to keep around.
Note that when we keep an export around after evalling the ctor, we
make the export point to a new function. That new function just
contains a nop, so that nothing happens when it is called. But the
original function is kept around as it may have other callers, which we
do not want to modify.
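For example (a hypothetical sketch; the name of the new function is
illustrative):
;; Before: wasm-ctor-eval fully evals $foo
(export "foo" (func $foo))
;; After, with --kept-exports foo: the export now points at a new function
;; that does nothing, while the original $foo remains for other callers
(export "foo" (func $foo_kept))
(func $foo_kept
  (nop)
)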
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is meant to address one of the main limitations of wasm-ctor-eval in
Emscripten at the moment: libc++ global ctors read env vars, which means they
call an import, which stops us from evalling; see
emscripten-core/emscripten#15403 (comment)
To handle that, this adds an option to ignore external input. When it is set, we can
assume that no env vars will be read, nothing is read from stdin, there are no
arguments to main(), etc. Perhaps these could each be separate options, but I think
keeping it simple for now might be good enough.
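For example, a ctor like this sketch (hypothetical, using a WASI import) could
not be evalled before because of the import call; the idea is that with the
option set, the env var read can be treated as providing no data, so evalling
can proceed:
(import "wasi_snapshot_preview1" "environ_sizes_get"
  (func $environ_sizes_get (param i32 i32) (result i32)))
(func $ctor
  (drop (call $environ_sizes_get (i32.const 0) (i32.const 4)))
  ;; ..rest of the ctor..
)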
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The general shape of the --help output is now:
========================
wasm-foo
Does the foo operation
========================
wasm-foo opts:
--------------
--foo-bar ..
Tool opts:
----------
..
The options are now in categories, with the more specific ones - most likely to be
wanted by the user - first. I think this makes the list a lot less confusing.
In particular, in wasm-opt all the opt passes are now in their own category.
Also add a script to make it easy to update the help tests.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is fairly short and simple after the recent refactorings. This basically
just finds all uses of each signature/function type, and then sees if it
receives more specific types as params. It then rewrites the types if so.
This just handles arguments so far, and not return types.
This differs from DeadArgumentElimination's refineArguments() in that
DAE modifies each function by itself, changing the type of the
function as needed. That is only valid if the type is not observable, so
if the function is called indirectly then DAE ignores it. This pass will
work on the types themselves, so it considers all functions sharing a
type as a whole, and when it upgrades that type it ends up affecting them
all.
This finds optimization opportunities on 4% of the total signature
types in j2wasm. Those lead to some benefits in later opts, but the
effect is not huge.
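A sketch of the idea (hypothetical types; $B is a subtype of $A):
(type $sig (func (param (ref null $A))))
(func $foo (type $sig) (param $x (ref null $A))
  ;; ..
)
;; every call, direct or indirect, sends a (ref $B)
(call $foo (local.get $someB))
;; After refining, the shared type - and every function using it - takes (ref $B)
(type $sig (func (param (ref $B))))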
|
|
|
|
|
|
|
|
| |
Fairly simple, this uses the existing infrastructure to find opportunities
to refine the type of a global variable. This is a common pattern in j2wasm,
for example, where a global begins as a null of $java.lang.Object (the
least specific type) but in practice is always assigned an object of
some more specific type.
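A sketch of that j2wasm pattern (hypothetical names):
;; Before: declared with the least specific type
(global $instance (ref null $java.lang.Object) (ref.null $java.lang.Object))
;; ..but the only writes are of a more specific type..
(global.set $instance (struct.new $Foo))
;; After refining
(global $instance (ref null $Foo) (ref.null $Foo))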
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
This specializes the fields of structs based on the types written to them. That is,
if a field is of type A but in practice we always write some subtype B to it
then we can change the type of the field to that.
On j2wasm this manages to improve at least one field in 2% of types. Not a
large amount, but this does lead to further benefits in later opts (e.g. about a third
of the improvements are to turn a field non-nullable).
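A sketch (hypothetical types; $B is a subtype of $A):
;; Before: the field is declared as $A, but every write sends a $B
(type $T (struct (field (mut (ref null $A)))))
(struct.new $T (local.get $someB))
;; After specialization
(type $T (struct (field (mut (ref null $B)))))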
|
|
|
|
|
|
|
|
|
| |
Just as the --nominal flag forces all types to be parsed as nominal, the
--structural flag forces all types to be parsed as equirecursive. This is the
current default behavior, but a future PR will change the default to parse types
as either structural or nominal according to their syntax or encoding. This new
flag will then be necessary to get the current behavior.
Also take this opportunity to deduplicate more flags in the help tests.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new pass to perform global type optimization. So far this just
does one thing: it finds fields that have no struct.set and turns them
immutable (where possible - sub- and supertypes must agree).
To do that, this adds a GlobalTypeRewriter utility which rewrites
all the heap types in the module, allowing changes while doing so.
In this PR, the change is to flip the mutable field. Otherwise, the
utility handles all the boilerplate of creating temp heap types using
a TypeBuilder, and it handles replacing the types in every place
they are used in the module.
This is not enabled by default yet as I don't see enough of a benefit
on j2cl. This PR is basically the simplest thing to do in the space of
global type optimization, and the simplest way I can think of to
fully test the GlobalTypeRewriter (which can't be done as a unit
test, really, since we want to emit a full module and validate it etc.).
This PR builds the foundation for more complicated things like
removing unused fields, subtyping fields, and more.
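A sketch of the one optimization done so far (hypothetical type):
;; Before: the field is mutable, but there is no struct.set to it anywhere
(type $T (struct (field (mut i32))))
(struct.new $T (i32.const 1))
(struct.get $T 0 (local.get $ref))
;; After (assuming sub- and supertypes agree), the field becomes immutable
(type $T (struct (field i32)))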
|
|
|
|
|
| |
Locally I saw a 10% speedup on j2cl but reports of regressions have
arrived, so let's disable it for now pending investigation. The option added
here should make it easy to experiment.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the set of functions to keep was initially empty, then the profile
added new functions to keep, then the --keep-funcs functions were added, then
the --split-funcs functions were removed. This method of composing these
different options was arbitrary and not necessarily intuitive, and it prevented
reasonable workflows from working. For example, providing only a --split-funcs
list would result in all functions being split out no matter which functions
were listed.
To make the behavior of these options, and --split-funcs in particular, more
intuitive, disallow mixing them, and when --split-funcs is used, split out only
the listed functions.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An "intrinsic" is modeled as a call to an import. We could also add new
IR things for them, but that would take more work and lead to less
clear errors in other tools if they try to read a binary using such a
nonstandard extension.
A first intrinsic is added here: call.without.effects. This is basically the same
as call_ref except that the optimizer is free to assume the call has no
side effects. Consequently, if the result is not used then the call can be optimized
out (whereas normally, even if the result is not used, side effects could keep it around).
Likewise, the lack of side effects allows more reordering and other
things.
A lowering pass for intrinsics is provided. Rather than automatically
lower them to normal wasm at the end of optimizations, the user must
call that pass explicitly. A typical workflow might be
-O --intrinsic-lowering -O
That optimizes with the intrinsic present - perhaps removing calls
thanks to it - then lowers it into normal wasm - it turns into a call_ref -
and then optimizes further, which could turn the call_ref into a
direct call, potentially inline it, etc.
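A sketch of how the intrinsic might be used (the import shown here is
illustrative; treat the exact module/name and signature as an assumption):
;; The last parameter is the function reference that will actually be called
(import "binaryen-intrinsics" "call.without.effects"
  (func $cwe (param i32 funcref) (result i32)))
;; The result is dropped and the call is assumed to have no side effects,
;; so the optimizer may remove it entirely; otherwise --intrinsic-lowering
;; turns it into a call_ref of $helper with the i32 argument
(drop
  (call $cwe (i32.const 1) (ref.func $helper)))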
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To avoid requiring a static memory allocation, wasm-split's instrumentation
defaults to recording profile data in Wasm globals. This causes problems for
multithreaded applications because the globals are thread-local, but it is not
always feasible to arrange for a separate profile to be dumped on each thread.
To simplify the profiling of such multithreaded applications, add a new
instrumentation mode that stores the profiling data in shared memory instead of
in globals. This allows a single profile to be written that correctly reflects
the called functions on all threads.
This new mode is not on by default because it requires users to ensure that the
program will not trample the in-memory profiling data. The data is stored
beginning at address zero and occupies one byte per declared function in the
instrumented module. Emscripten can be told to leave this memory free using the
GLOBAL_BASE option.
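Conceptually (an illustrative sketch only, not the exact code the
instrumentation emits), a function with declared index 42 would mark its byte
in shared memory when called:
(i32.atomic.store8
  (i32.const 42) ;; one byte per declared function, starting at address 0
  (i32.const 1))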
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some functions run only once with this pattern:
function foo() {
if (foo$ran) return;
foo$ran = 1;
...
}
If that global is not ever set to 0, then the function's payload (after the
initial if and return) will never execute more than once. That means we
can optimize away dominated calls:
foo();
foo(); // we can remove this
To do this, we find which globals are "once", which means they can
fit in that pattern, as they are never set to 0. If a function looks like the
above pattern, and its global is "once", then the function is "once" as
well, and we can perform this optimization.
This removes over 8% of static calls in j2cl.
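The same pattern expressed on a wasm global (a sketch):
(global $foo$ran (mut i32) (i32.const 0))
(func $foo
  (if (global.get $foo$ran)
    (then (return)))
  (global.set $foo$ran (i32.const 1))
  ;; ..payload that can now run at most once..
)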
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The goal of this mode is to remove obviously-unneeded code like
(drop
(i32.load
(local.get $x)))
In general we can't remove it, as the load might trap - we'd be removing
a side effect. This is fairly rare in general, but actually becomes quite
annoying with wasm GC code where such patterns are more common,
and we really need to remove them.
Historically the IgnoreImplicitTraps option was meant to help here. However,
in practice it did not quite work well enough for most production code, as
mentioned e.g. in #3934 . TrapsNeverHappen mode is an attempt to fix that,
based on feedback from @askeksa in that issue, and also I believe this
implements an idea that @fitzgen mentioned a while ago (sorry, I can't
remember where exactly...). So I'm hopeful this will be generally useful
and not just for GC.
The idea in TrapsNeverHappen mode is that traps are assumed to not
actually happen at runtime. That is, if there is a trap in the code, it will
not be reached, or if it is reached then it will not trap. For example, an
(unreachable) would be assumed to never be reached, which means
that the optimizer can remove it and any code that executes right before
it:
(if
(..condition..)
(block
(..code that can be removed, if it does not branch out..)
(..code that can be removed, if it does not branch out..)
(..code that can be removed, if it does not branch out..)
(unreachable)))
And something like a load from memory is assumed to not trap, etc.,
which in particular would let us remove that dropped load from earlier.
This mode should be usable in production builds with assertions
disabled, if traps are seen as failing assertions. That might not be true
of all release builds (maybe some use traps for other purposes), but
hopefully it is true of some. That is, if traps are like assertions, then enabling
this new mode would be like disabling assertions in release builds
and living with the fact that if an assertion would have been hit then
that is "undefined behavior" and the optimizer might have removed
the trap or done something weird.
TrapsNeverHappen (TNH) is different from IgnoreImplicitTraps (IIT).
The old IIT mode would just ignore traps when computing effects.
That is a simple model, but a problem happens with a trap behind
a condition, like this:
if (x != 0) foo(1 / x);
We avoid trapping on integer division by zero here only because of the
guarding if. Under IIT, we'd compute no side effects for 1 / x, and then
we might end up moving it around, depending on other code in
the area, potentially even out of the if - which would make it execute
unconditionally, and that would be incorrect.
TNH avoids that problem because it does not simply ignore traps.
Instead, there is a new hasUnremovableSideEffects() method
that passes must explicitly opt in to. That checks if there are no side
effects, or if there are, if we can remove them - and we know we can
remove a trap if we are running under TrapsNeverHappen mode,
as the trap won't happen by assumption. A pass must only use that
method where it is safe, that is, where it would either remove the
side effect (in which case, no problem) or, if not, at least not
move it around (avoiding the above problem with IIT).
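The guarded-division hazard above, in wasm form (a sketch): under IIT the
division is treated as effect-free and could be hoisted out of the if, while
under TNH a pass may remove it where safe but must not move it:
(if (i32.ne (local.get $x) (i32.const 0))
  (then
    (call $foo
      (i32.div_u (i32.const 1) (local.get $x)))))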
This PR does not implement all optimizations possible with
TNH, just a small initial set of things to get started. It is already
useful on wasm GC code, including being as good as IIT on removing
unnecessary casts in some cases, see the test suite updates here.
Also, a significant part of the 18% speedup measured in
#4052 (comment)
is due to my testing with this enabled, as otherwise the devirtualization
there leaves a lot of unneeded code.
|
|
|
|
|
| |
Use ToolOptions there, which adds --nominal support.
We must also pass --nominal to the sub-commands we run.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A field in a struct is constant if we can see that in the entire program we
only ever write the same constant value to it. For example, imagine a
vtable type that we construct with the same funcrefs each time; then (if
there are no struct.sets, or if there are and they write the same value), we
can replace a get with that constant value, since it cannot be anything
else:
(struct.new $T (i32.const 10) (rtt))
..no other conflicting values..
(struct.get $T 0) => (i32.const 10)
If the value is a function reference, then this may allow other passes
to turn what was a call_ref into a direct call and perhaps also get
inlined, effectively a form of devirtualization.
This only works in nominal typing, as we need to know the supertype
of each type. (It could work in theory in structural, but we'd need to do
hard work to find all the possible supertypes, and it would also
become far less effective.)
This deletes a trivial test for running -O on GC content. We have
many more tests for GC at this point, so that test is not needed, and
this PR also optimizes the code into something trivial and
uninteresting anyhow.
|
|
|
|
|
| |
Add a new OptimizeForJS pass which contains rewriting rules specific to JavaScript.
LLVM usually lowers x != 0 && (x & (x - 1)) == 0 (isPowerOf2) to popcnt(x) == 1, which is fine for wasm and other targets but is quite expensive for JavaScript. In this PR we lower the popcnt pattern back to the isPowerOf2 pattern.
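In wasm terms the rewrite is roughly (a sketch):
;; Before
(i32.eq (i32.popcnt (local.get $x)) (i32.const 1))
;; After
(i32.and
  (i32.ne (local.get $x) (i32.const 0))
  (i32.eqz (i32.and (local.get $x) (i32.sub (local.get $x) (i32.const 1)))))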
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a local is say anyref, but all values assigned to it are something
more specific like funcref, then we can make the type of the local
more specific. In principle that might allow further optimizations, as
the local.gets of that local will have a more specific type that their
users can see, like this:
(import .. (func $get-funcref (result funcref)))
(func $foo (result i32)
(local $x anyref)
(local.set $x (call $get-funcref))
(ref.is_func (local.get $x))
)
=>
(func $foo (result i32)
(local $x funcref) ;; updated to a subtype of the original
(local.set $x (call $get-funcref))
(ref.is_func (local.get $x)) ;; this can now be optimized to "1"
)
A possible downside is that using more specific types may not end
up allowing optimizations but may end up increasing the size of
the binary (say, replacing lots of anyref with various specific
types that compress more poorly; also, for recursive types the LUB
may be a unique type appearing nowhere else in the wasm). We
should investigate the code size factors more later.
|
|
Add lit tests for the help messages of all tools, factoring out common options
into shared tests. This is slightly brittle because the text wrapping depends on
the length of the longest option, but that brittleness should be worth the
benefit of being able to see the actual help text in the tests.
|