forks/binaryen.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	Remove closed world validation checks (#7019)	Alon Zakai	2024-10-18	1	-8/+0
\| \| \| \| \| \| \| \| \| \| \|	These were added to avoid common problems with closed world mode, but in practice they are causing more harm than good, forcing users to work around them. In the meantime (until #6965), remove this validation to unblock current toolchain makers. Fix GlobalTypeOptimization and AbstractTypeRefining on issues that this uncovers: without this validation, it is possible to run them on more wasm files than before, hence these were not previously detected. They are bundled in this PR because their tests cannot validate before this PR.
*	[NFC] Standardize Super:: over super:: (#6920)	Alon Zakai	2024-09-10	1	-1/+1
\| \| \| \|	As the name of a class, uppercase seems better here.
*	Allow different arguments for multiple instances of a pass (#6687)	Christian Speckner	2024-07-15	1	-3/+22
\| \| \| \| \| \| \| \| \| \| \| \|	Each pass instance can now store an argument for it, which can be different. This may be a breaking change for the corner case of running a pass multiple times and setting the pass's argument multiple times as well (before, the last pass argument affected them all; now, it affects the last instance only). This only affects arguments with the name of a pass; others remain global, as before (and multiple passes can read them, in fact). See the CHANGELOG for details. Fixes #6646
*	[StackIR] Run StackIR during binary writing and not as a pass (#6568)	Alon Zakai	2024-05-09	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously we had passes --generate-stack-ir, --optimize-stack-ir, --print-stack-ir that could be run like any other passes. After generating StackIR it was stashed on the function and invalidated if we modified BinaryenIR. If it wasn't invalidated then it was used during binary writing. This PR switches things so that we optionally generate, optimize, and print StackIR only during binary writing. It also removes all traces of StackIR from wasm.h - after this, StackIR is a feature of binary writing (and printing) logic only. This is almost NFC, but there are some minor noticeable differences: 1. We no longer print has StackIR in the text format when we see it is there. It will not be there during normal printing, as it is only present during binary writing. (but --print-stack-ir still works as before; as mentioned above it runs during writing). 2. --generate/optimize/print-stack-ir change from being passes to being flags that control that behavior instead. As passes, their order on the commandline mattered, while now it does not, and they only "globally" affect things during writing. 3. The C API changes slightly, as there is no need to pass it an option "optimize" to the StackIR APIs. Whether we optimize is handled by --optimize-stack-ir which is set like other optimization flags on the PassOptions object, so we don't need the old option to those C APIs. The main benefit here is simplifying the code, so we don't need to think about StackIR in more places than just binary writing. That may also allow future improvements to our usage of StackIR.
*	Fix global effect computation with -O flags (#6211)	Alon Zakai	2024-01-09	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We tested --generate-global-effects --vacuum and such, but not --generate-global-effects -O3 or the other -O flags. Unfortunately, our targeted testing missed a bug because of that. Specifically, we have special logic for -O flags to make sure the passes they expand into run with the proper opt and shrink levels, but that logic happened to also interfere with global effect computation. It would also interfere with allowing GUFA info or other things to be stored on the side, which we've proposed. This PR fixes that + future issues. The fix is to just allow a pass runner to execute more than once. We thought to avoid that and assert against it to keep the model "hermetic" (you create a pass runner, you run the passes, and you throw it out), which feels nice in a way, but it led to the bug here, and I'm not sure it would prevent any other ones really. It is also more code. It is simpler to allow a runner to execute more than once, and add a method to clear it. With that, the logic for -O3 execution is both simpler and does not interfere with anything but the opt and shrink level flags: we create a single runner, give it the proper options, and then keep using that runner + those options as we go, normally.
*	Inlining: Inline trivial calls (#6143)	Alon Zakai	2023-12-05	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A trivial call is something like a function that just calls another immediately, function foo(x, y) { return bar(y, 15); } We can inline those and expect to benefit in most cases, though we might increase code size slightly. Hence it makes sense to inline such cases, even though in general we are careful and do not inline functions with calls in them; a "trampoline" like that likely has most of the work in the call itself, which we can avoid by inlining. Suggested based on findings in Java.
*	Add an "unsubtyping" optimization (#5982)	Thomas Lively	2023-10-10	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a new pass that analyzes the module to find the minimal subtyping relation that is necessary to maintain the validity and semantics of the program and rewrites the types to use this minimal relation. Besides eliminating references to otherwise-unused intermediate types, this optimization should unlock significant additional optimizing power in other type optimizations that are constrained by having to maintain supertype validity, since after this new optimization there are fewer and more general supertypes. The analysis works by visiting each expression and module element to collect the subtypings that are required to maintain its validity, then, using that as a starting point, iteratively adding new subtypings required by type definitions and casts until reaching a fixed point.
*	Automatically discard global effects in the rare passes that add effects (#5999)	Alon Zakai	2023-10-06	1	-0/+7
\| \| \| \| \|	All logging/instrumentation passes need to do this, to avoid us using stale global effects that are too low (too high is not optimal either, but at least it cannot cause bugs).
*	Fix opt/shrink levels when running the optimizer multiple times, Part 2 (#5787)	Alon Zakai	2023-06-27	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \|	This is a followup to #5333 . That fixed the selection of which passes to run, but forgot to also fix the global state of the current optimize/shrink levels. This PR fixes that. As a result, running -O3 -Oz will now work as expected: the first -O3 will run the right passes (as #5333 fixed) and while running them, the global optimize/shrinkLevels will be -O3 (and not -Oz), which this PR fixes. A specific result of this is that -O3 -Oz used to inline less, since the invocation of inlining during -O3 thought we were optimizing for size. The new test verifies that we do fully inline in the first -O3 now.
*	Add a mechanism to skip a pass by name (#5448)	Alon Zakai	2023-01-24	1	-0/+5
\| \| \| \| \| \| \| \|	For example, -O3 --skip-pass=vacuum will run -O3 normally but it will not run the vacuum pass at all (which normally runs more than once in -O3).
*	wasm2js: Avoid emitting non-JS code during opt (#5378)	Will Cohen	2023-01-04	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \|	As noted in #4806, trying to optimize past level 0 can result in passes emitting non-JS code, which is then unable to be converted during final output. This commit creates a new targetJS option in PassOptions, which can be checked inside each pass where non-JS code might be emitted. This commit initially adds that logic to OptimizeInstructions, where this issue was first noticed.
*	Do not optimize public types (#5347)	Thomas Lively	2022-12-16	1	-16/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Do not optimize or modify public heap types in any way. Public heap types include the types of imported or exported functions, tables, globals, etc. This is important to maintain the public interface of a module and ensure it can still link interact as intended with the outside world. Also add validation error if we find any nontrivial public types that are not the types of imported or exported functions. This error is meant to help the user ensure that type optimizations are not silently inhibited. In the future, we may want to add options to silence this error or downgrade it to a warning. This commit only updates the type updating machinery to avoid updating public types. It does not update any optimization passes accordingly. Since we avoid modifying public signature types already, this is not expected to break anything, but in the future once we have function subtyping or if we make the error optional, we may have to update some of our optimization passes.
*	Properly use pass options in nested pass runners (up to -O1) (#5351)	Alon Zakai	2022-12-15	1	-4/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes a TODO. There is a runtime cost to this in higher opt levels, as passing through -O3 makes nested optimization work take longer. But it can lead to better results. For now, this PR moves us from 0 before to a maximum of 1, as a compromise. 1 does not regress compile times, but there may be further benefits to allowing 2 and 3 in the future. Also fix a fuzzer bug that becomes uncovered by tihs PR: Now that we actually optimize in simplify-globals, we need to handle the case of the optimizer there seeing a call with the effects of writing to a global (we had an assert on that never happening, but with function effects that can happen, and so a GlobalSet is not the only thing that can set a global). Aside from the opt and shrink levels this passes through all other options, like trapsNeverHappen.
*	[NFC] Add some notes to pass.h on closed world (#5345)	Alon Zakai	2022-12-14	1	-0/+22
\|
*	Optimize Asyncify to not flatten/optimize unnecessarily (#5293)	Alexander Guryanov	2022-12-06	1	-2/+5
\| \| \| \| \| \| \| \| \|	Add a way to proxy passes and the addition of passes in pass runners. With that we can make Asyncify only modify functions it actually needs to. On a project that Asyncify only needs to modify a few functions on, this can save a huge amount of time as it avoids flattening+optimizing the majority of the module. Fixes #4822
*	[Wasm GC] Implement closed-world flag (#5303)	Alon Zakai	2022-11-30	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	With this change we default to an open world, that is, we do the safe thing by default: we no longer assume a closed world. Users that want a closed world must pass --closed-world. Atm we just do not run passes that assume a closed world. (We might later refine them to find which types don't escape and only optimize those.) The RemoveUnusedModuleElements is an exception in that the closed-world flag influences one part of its operation, but not the rest. Fixes #5292
*	Add `hasArgument` helper to pass options. NFC (#5278)	Sam Clegg	2022-11-17	1	-3/+5
\|
*	Switch from `typedef` to `using` in C++ code. NFC (#5258)	Sam Clegg	2022-11-15	1	-2/+2
\| \| \| \|	This is more modern and (IMHO) easier to read than that old C typedef syntax.
*	Refactor interaction between Pass and PassRunner (#5093)	Thomas Lively	2022-09-30	1	-27/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously only WalkerPasses had access to the `getPassRunner` and `getPassOptions` methods. Move those methods to `Pass` so all passes can use them. As a result, the `PassRunner` passed to `Pass::run` and `Pass::runOnFunction` is no longer necessary, so remove it. Also update `Pass::create` to return a unique_ptr, which is more efficient than having it return a raw pointer only to have the `PassRunner` wrap that raw pointer in a `unique_ptr`. Delete the unused template `PassRunner::getLast()`, which looks like it was intended to enable retrieving previous analyses and has been in the code base since 2015 but is not implemented anywhere.
*	[NFC] Simplify traversal code for setting the module (#5082)	Alon Zakai	2022-09-26	1	-4/+1
\| \| \| \|	Make walkModuleCode set the module automatically, like walkModule already does. Also remove some unneeded module settings when calling those methods.
*	Effects: Clarify trap effect meaning, and consider infinite loops to trap ↵	Alon Zakai	2022-09-16	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	due to timeout (#5039) I think this simplifies the logic behind what we consider to trap. Before we had kind of a hack in visitLoop that now has a more clear reasoning behind it: we consider as trapping things that trap in all VMs all the time, or will eventually. So a single allocation doesn't trap, but an unbounded amount can, and an infinite loop is considered to trap as well (a timeout in a VM will be hit eventually, somehow). This means we cannot optimize way a trivial infinite loop with no effects in it, while (1) {} But we can optimize it out in trapsNeverHappen mode. In any event, such a loop is not a realistic situation; an infinite loop with some other effect in it, like a call to an import, will not be optimized out, of course. Also clarify some other things regarding traps and trapsNeverHappen following recent discussions in https://github.com/emscripten-core/emscripten/issues/17732 Specifically, TNH will never be allowed to remove calls to imports.
*	Allow optimizing with global function effects (#5040)	Alon Zakai	2022-09-16	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a map of function name => the effects of that function to the PassOptions structure. That lets us compute those effects once and then use them in multiple passes afterwards. For example, that lets us optimize away a call to a function that has no effects: (drop (call $nothing)) [..] (func $nothing ;; .. lots of stuff but no effects, only a returned value .. ) Vacuum will remove that dropped call if we tell it that the called function has no effects. Note that a nice result of adding this to the PassOptions struct is that all passes will use the extra info automatically. This is not enabled by default as the benefits seem rather minor, though it does help in a small but noticeable way on J2Wasm code, where we use call.without.effects and have situations like this: (func $foo (call $bar) ) (func $bar (call.without.effects ..) ) The call to bar looks like it has effects, normally, but with global effect info we know it actually doesn't. To use this, one would do --generate-global-effects [.. some passes that use the effects ..] --discard-global-effects Discarding is not necessary, but if there is a pass later that adds effects, then not discarding could lead to bugs, since we'd think there are fewer effects than there are. (However, normal optimization passes never add effects, only remove them.) It's also possible to call this multiple times: --generate-global-effects -O3 --generate-global-effects -O3 That computes affects after the first -O3, and may find fewer effects than earlier. This doesn't compute the full transitive closure of the effects across functions. That is, when computing a function's effects, we don't look into its own calls. The simple case so far is enough to handle the call.without.effects example from before (though it may take multiple optimization cycles).
*	[Wasm GC] Support non-nullable locals in the "1a" form (#4959)	Alon Zakai	2022-08-31	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	An overview of this is in the README in the diff here (conveniently, it is near the top of the diff). Basically, we fix up nn locals after each pass, by default. This keeps things easy to reason about - what validates is what is valid wasm - but there are some minor nuances as mentioned there, in particular, we ignore nameless blocks (which are commonly added by various passes; ignoring them means we can keep more locals non-nullable). The key addition here is LocalStructuralDominance which checks which local indexes have the "structural dominance" property of 1a, that is, that each get has a set in its block or an outer block that precedes it. I optimized that function quite a lot to reduce the overhead of running that logic after each pass. The overhead is something like 2% on J2Wasm and 0% on Dart (0%, because in this mode we shrink code size, so there is less work actually, and it balances out). Since we run fixups after each pass, this PR removes logic to manually call the fixup code from various places we used to call it (like eh-utils and various passes). Various passes are now marked as requiresNonNullableLocalFixups => false. That lets us skip running the fixups after them, which we normally do automatically. This helps avoid overhead. Most passes still need the fixups, though - any pass that adds a local, or a named block, or moves code around, likely does. This removes a hack in SimplifyLocals that is no longer needed. Before we worked to avoid moving a set into a try, as it might not validate. Now, we just do it and let fixups happen automatically if they need to: in the common code they probably don't, so the extra complexity seems not worth it. Also removes a hack from StackIR. That hack tried to avoid roundtrip adding a nondefaultable local. But we have the logic to fix that up now, and opts will likely keep it non-nullable as well. Various tests end up updated here because now a local can be non-nullable - previous fixups are no longer needed. Note that this doesn't remove the gc-nn-locals feature. That has been useful for testing, and may still be useful in the future - it basically just allows nn locals in all positions (that can't read the null default value at the entry). We can consider removing it separately. Fixes #4824
*	Validator: Validate globally by default (#4906)	Alon Zakai	2022-08-16	1	-1/+1
\| \| \| \| \| \|	I'm not sure why this defaulted to non-global. Perhaps because of limitations in the asm.js days. A better default is to validate globally, and this also applies in pass-debug mode (since that just uses the default there), so this will catch more problems there.
*	Clarify in tools help message that -O == -Os. (#4516)	t4lz	2022-02-16	1	-3/+6
\| \| \| \| \|	Introduce static consts with PassOptions Defaults. Add assertion to verify that the default options are the Os options. Also update the text in relevant tests.
*	Add fixup function for nested pops in catch (#4348)	Heejin Ahn	2021-11-22	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds `EHUtils::handleBlockNestedPops`, which can be called at the end of passes that has a possibility to put `pop`s inside `block`s. This method assumes there exists a `pop` in a first-descendant line, even though it can be nested within a block. This allows a `pop` to be nested within a `block` or a `try`, but not a `loop`, since that means the `pop` can run multile times. In case of `if`, `pop` can exist only in its condition; if a `pop` is in its true or false body, that's not in the first-descendant line. This can be useful when optimization passes create blocks to do transformations. Wrapping expressions wiith a block does not change semantics most of the time, but if pops happen to be inside a block generated by those passes, they can result in invalid binaries. To test this, this adds `passes/test_passes.cpp`, which is intended to contain multiple test passes that test a single (or more) utility functions separately. Without this kind of pass, it is hard to test various cases in which nested `pop`s can be generated in existing passes. This PR also adds `PassRegistry::registerTestPass`, which registers a pass that's intended only for internal testing and does not show up in `wasm-opt --help`. Fixes #4237.
*	Add runOnModuleCode helper. NFC (#4234)	Alon Zakai	2021-10-11	1	-1/+7
\| \| \| \| \| \| \| \| \|	This method is in parallel to runOnFunction above it. It sets the runner and then does the walk, like that method. Also set runner to nullptr by default. I noticed ubsan was warning on things here, which this should avoid, but otherwise I'm not aware of an actual bug, so this should be NFC. But it does provide a safer API that should avoid future bugs.
*	Disable partial inlining by default and add a flag for it. (#4191)	Alon Zakai	2021-09-27	1	-0/+5
\| \| \| \| \|	Locally I saw a 10% speedup on j2cl but reports of regressions have arrived, so let's disable it for now pending investigation. The option added here should make it easy to experiment.
*	Deprecate IgnoreImplicitTraps (#4087)	Alon Zakai	2021-08-17	1	-0/+1
\|
*	TrapsNeverHappen mode (#4059)	Alon Zakai	2021-08-17	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The goal of this mode is to remove obviously-unneeded code like (drop (i32.load (local.get $x))) In general we can't remove it, as the load might trap - we'd be removing a side effect. This is fairly rare in general, but actually becomes quite annoying with wasm GC code where such patterns are more common, and we really need to remove them. Historically the IgnoreImplicitTraps option was meant to help here. However, in practice it did not quite work well enough for most production code, as mentioned e.g. in #3934 . TrapsNeverHappen mode is an attempt to fix that, based on feedback from @askeksa in that issue, and also I believe this implements an idea that @fitzgen mentioned a while ago (sorry, I can't remember where exactly...). So I'm hopeful this will be generally useful and not just for GC. The idea in TrapsNeverHappen mode is that traps are assumed to not actually happen at runtime. That is, if there is a trap in the code, it will not be reached, or if it is reached then it will not trap. For example, an (unreachable) would be assumed to never be reached, which means that the optimizer can remove it and any code that executes right before it: (if (..condition..) (block (..code that can be removed, if it does not branch out..) (..code that can be removed, if it does not branch out..) (..code that can be removed, if it does not branch out..) (unreachable))) And something like a load from memory is assumed to not trap, etc., which in particular would let us remove that dropped load from earlier. This mode should be usable in production builds with assertions disabled, if traps are seen as failing assertions. That might not be true of all release builds (maybe some use traps for other purposes), but hopefully in some. That is, if traps are like assertions, then enabling this new mode would be like disabling assertions in release builds and living with the fact that if an assertion would have been hit then that is "undefined behavior" and the optimizer might have removed the trap or done something weird. TrapsNeverHappen (TNH) is different from IgnoreImplicitTraps (IIT). The old IIT mode would just ignore traps when computing effects. That is a simple model, but a problem happens with a trap behind a condition, like this: if (x != 0) foo(1 / x); We won't trap on integer division by zero here only because of the guarding if. In IIT, we'd compute no side effects on 1 / x, and then we might end up moving it around, depending on other code in the area, and potentially out of the if - which would make it happen unconditionally, which would break. TNH avoids that problem because it does not simply ignore traps. Instead, there is a new hasUnremovableSideEffects() method that must be opted-in by passes. That checks if there are no side effects, or if there are, if we can remove them - and we know we can remove a trap if we are running under TrapsNeverHappen mode, as the trap won't happen by assumption. A pass must only use that method where it is safe, that is, where it would either remove the side effect (in which case, no problem), or if not, that it at least does not move it around (avoiding the above problem with IIT). This PR does not implement all optimizations possible with TNH, just a small initial set of things to get started. It is already useful on wasm GC code, including being as good as IIT on removing unnecessary casts in some cases, see the test suite updates here. Also, a significant part of the 18% speedup measured in #4052 (comment) is due to my testing with this enabled, as otherwise the devirtualization there leaves a lot of unneeded code.
*	Do not attempt to preserve DWARF if a previous pass removes it (#3887)	Alon Zakai	2021-05-17	1	-2/+14
\| \| \| \| \| \| \| \| \| \| \|	If we run a pass that removes DWARF followed by one that could destroy it, then there is no possible problem - there is nothing left to destroy. We can run the later pass with no issues (and no warnings). Also add an assertion on running a pass runner only once. That has always been the assumption, and now that we track whether the added passes remove debug info, we need to check it. Fixes emscripten-core/emscripten#14161
*	Inlining: Always inline single-use functions (#3730)	Alon Zakai	2021-03-29	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This implements emscripten-core/emscripten#13744 Inlining functions with a single use allows us to remove the function afterward. That looks highly beneficial, shrinking every single benchmark in emscripten's benchmark suite, by an average of 2% on the macrobenchmarks and 3.5% on all of them. Speed also improves, although mostly on the microbenchmarks so that might be less realistic. There may be a slight downside to startup time due to emitting larger functions, but given the baseline compilers in VMs these days it seems worth it, as the delay would be just to get to the upper tier. On the benchmark suite the risk seems low. See more details in the PR above.
*	Poppify pass (#3541)	Thomas Lively	2021-02-09	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adds a Poppify ("--poppify") pass for converting normal Binaryen IR to Poppy IR. Like the existing construction of Stacky IR, Poppify depends on the BinaryenIRWriter to drive the emitting of instructions in correct stack machine order. As instructions are "emitted," Poppify replaces their children with pops and collects them in a list. At the end of each scope, Poppify creates a block containing all the collected instructions for that scope and injects that block into the enclosing scope. All tuple globals and instructions dealing with tuples are also expanded to remove all tuples from the program. The validator currently fails to validate many valid Poppy IR patterns produced in the tests, but fixing that is left as follow-on work to keep this PR focused on the Poppify pass itself. For now the tests simply skip validation.
*	Warn when running a pass not compatible with DWARF (#3506)	Alon Zakai	2021-01-26	1	-5/+15
\| \| \| \| \| \| \| \| \| \| \| \|	Previously the addDefault* methods would avoid adding opt passes that we know are incompatible with DWARF. However, that didn't handle the case of passes that are added in other ways. For example, when running Asyncify, emcc will run --flatten before, and that pass is not compatible with DWARF. This PR lets us warn on that by annotating the passes themselves. Then we use those annotation to either not run a pass at all (matching the previous behavior) or to show a warning when necessary. Fixes emscripten-core/emscripten#13288 . That is, concretely after this PR running asyncify + DWARF will show a warning to the user.
*	Remove unused Pass::prepareToRun (#3386)	Thomas Lively	2020-11-18	1	-4/+0
\|
*	Inlining: Slight reordering of options (#3308)	Alon Zakai	2020-11-04	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This simplifies the three size-related inlining flags, so that their meanings are clearer. Also improve the comments and put them in a consistent order in both files. This should not make any difference in general, except that it is now possible to have oneCallerInlineMaxSize > flexibleInlineMaxSize, and we will inline a function with one caller if flexibleInlineMaxSize < FUNCTION_SIZE <= oneCallerInlineMaxSize which we would not before. As the defaults of the flags didn't fit that case, this should not change anything for anyone not passing in those specific inlining flags. For people that pass in those flags, this PR makes more things possible. Resolves the FIXME in that code, and is a refactoring before some more inlining work I have planned.
*	Rename unmodifiedImportedMemory => zeroFilledMemory (#3285)	Alon Zakai	2020-10-27	1	-4/+4
\|
*	Add unmodifiedImportedMemory pass option (#3246)	Alon Zakai	2020-10-16	1	-0/+9
\| \| \| \| \| \| \| \| \|	When set, we can assume an imported memory was not modified before us. That lets us assume it is all zeros and so we can optimize out zeros from memory segments. This does not actually do anything with the flag, just adds it. This is to avoid a rolling problem. Next emscripten can emit it without erroring, and then we can start to use it.
*	Log nested pass names in BINARYEN_PASS_DEBUG=2 (#3214)	Alon Zakai	2020-10-15	1	-3/+5
\| \| \| \|	We can't validate or print out the wasm in that case, but at least logging the names as they run can help debug some situations.
*	Add --fast-math mode (#3155)	Alon Zakai	2020-09-30	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \|	Similar to clang and gcc, --fast-math makes us ignore corner cases of floating-point math like NaN changes and (not done yet) lack of associativity and so forth. In the future we may want to have separate fast math flags for each specific thing, like gcc and clang do. This undoes some changes (#2958 and #3096) where we assumed it was ok to not change NaN bits, but @binji corrected us. We can only do such things in fast math mode. This puts those optimizations behind that flag, adds tests for it, and restores the interpreter to the simpler code from before with no special cases.
*	Improve inlining "heavyweight" (#3085)	Max Graey	2020-09-04	1	-3/+3
\| \| \| \| \| \| \| \| \|	Split that mode into an option to check for loops (which indicate a function is "heavy") and a constant check for having calls. The case of calls is different as we would need more logic to avoid infinite recursion if we are willing to inling functions with calls. Practically, this renames allowHeavyweight to allowFunctionsWithLoops.
*	Add allowHeavyweight inlining option (#3032)	Max Graey	2020-08-26	1	-0/+3
\| \| \| \| \|	As discussed in #2921, this allows inlining of functions not identified as "lightweight" (that include a loop, for example).
*	Move stack-check into its own pass (#2994)	Sam Clegg	2020-07-27	1	-1/+0
\| \| \| \| \|	This new pass takes an optional stack-check-handler argument which is the name of the function to call on stack overflow. If no argument is passed then it just traps.
*	Add string parameter to WASM_UNREACHABLE (#2499)	Sam Clegg	2019-12-05	1	-3/+5
\| \| \| \| \|	This works more like llvm's unreachable handler in that is preserves information even in release builds.
*	Remove FunctionType from Event (#2466)	Thomas Lively	2019-11-25	1	-1/+1
\| \| \| \| \| \| \| \| \|	This is the start of a larger refactoring to remove FunctionType entirely and store types and signatures directly on the entities that use them. This PR updates BrOnExn and Events to remove their use of FunctionType and makes the BinaryWriter traverse the module and collect types rather than using the global FunctionType list. While we are collecting types, we also sort them by frequency as an optimization. Remaining uses of FunctionType in Function, CallIndirect, and parsing will be removed in a future PR.
*	Support --pass-arg in ToolOptions. (#2429)	Alon Zakai	2019-11-11	1	-0/+1
\| \| \| \| \| \|	This will allow us to pass pass args to wasm-emscripten-finalize, which runs legalize-js-interface internally, which recently added an optional argument.
*	Simpify PassRunner.add() and automatically parallelize parallel functions ↵	Alon Zakai	2019-07-19	1	-8/+19
\| \| \| \| \| \| \| \| \|	(#2242) Main change here is in pass.h, everything else is changes to work with the new API. The add("name") remains as before, while the weird variadic add(..) which constructed the pass now just gets a std::unique_ptr of a pass. This also makes the memory management internally fully automatic. And it makes it trivial to parallelize WalkerPass::run on parallel passes. As a benefit, this allows removing a lot of code since in many cases there is no need to create a new pass runner, and running a pass can be just a single line.
*	Allows multiple arguments to be passed to PassRunner::add<T>() (#2208)	Ryoga	2019-07-09	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	struct FooPass : public wasm::Pass { FooPass(int a, int b); }; PassRunner runner {module}; runner.add<FooPass>(1, 2); // To allow this This change avoids unnecessary copying and allows us to pass the reference without reference_wrapper. struct BarPass : public wasm::Pass { BarPass(std::ostream& s); }; runner.add<BarPass>(std::cout); // Error (cout is uncopyable) runner.add<BarPass>(std::ref(std::cout)); // OK ↓ runner.add<BarPass>(std::cout); // OK (passed by reference) runner.add<BarPass>(std::ref(std::cout)); // OK
*	Bysyncify: async transform for wasm (#2172)	Alon Zakai	2019-06-15	1	-0/+7
\| \| \| \| \| \| \| \| \|	This adds a new pass, Bysyncify, which transforms code to allow unwind and rewinding the call stack and local state. This allows things like coroutines, turning synchronous code asynchronous, etc. The new pass file itself has a large comment on top with docs. So far the tests here seem to show this works, but this hasn't been tested heavily yet. My next step is to hook this up to emscripten as a replacement for asyncify/emterpreter, see emscripten-core/emscripten#8561 Note that this is completely usable by itself, so it could be useful for any language that needs coroutines etc., and not just ones using LLVM and/or emscripten. See docs on the ABI in the pass source.
*	Inlining: exposed inlining thresholds as command-line parameters. (#2125)	Wouter van Oortmerssen	2019-05-23	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Inlining: exposed inlining thresholds as command-line parameters. This will allow easier experimentation with optimal settings. Also tweaked the default logic slightly to always inline single caller functions up to a certain size. The command-line arguments were tested to have the desired effect for example by the Makefile change in this commit: https://github.com/aardappel/lobster/commit/39ae393e27ff363ab095bbb26c90d6fe17570104 which in turn relies on: https://github.com/emscripten-core/emscripten/pull/8635 * Grouped inlining options & reverted defaults. Now uses same defaults for inlining as before for the sake of not having to redo a lot of tests. Added FIXME to indicate that the current inlining logic needs fixing. * Fixed default values now pulled from code. * clang-format