path: root/src/passes/pass.cpp
Commit message | Author | Age | Files | Lines
...
* Redundant Set Elimination pass (#1344) | Alon Zakai | 2018-01-05 | 1 | -0/+4
| This optimizes #1343. It looks for stores of a value that is already present in the local, which in particular can remove the initial set to 0 of loops starting at zero, since all locals are initialized to that already.
| This helps in real-world code, but is not super-common since coalescing means we tend to have assigned something else to it anyhow before we need it to be zero, so this mainly helps in small functions (and running this before coalescing would extend live ranges in potentially bad ways).
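The idea can be sketched on straight-line code. This is only an illustration under loud assumptions: the tuple instruction format and the function name `remove_redundant_sets` are hypothetical, values are constants only, and function parameters (which are not zero-initialized) are ignored. The actual pass works on Binaryen IR and handles control flow.

```python
def remove_redundant_sets(instrs, num_locals):
    """Drop sets that store a constant the local already holds.

    Toy model (not Binaryen's API): straight-line code only, and every
    local starts at 0, mirroring how wasm zero-initializes locals.
    """
    known = {i: 0 for i in range(num_locals)}  # local index -> current constant
    out = []
    for ins in instrs:
        if ins[0] == "set":
            _, local, value = ins
            if known.get(local) == value:
                continue  # value already present, so the set is redundant
            known[local] = value
        out.append(ins)
    return out


code = [("set", 0, 0), ("get", 0), ("set", 0, 5), ("set", 0, 5), ("set", 1, 7)]
optimized = remove_redundant_sets(code, num_locals=2)
```

In this example the initial `set` of local 0 to zero and the repeated `set` to 5 are both dropped.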
* SpillPointers pass (#1339) | Alon Zakai | 2017-12-30 | 1 | -0/+1
| This is an experiment to help with Boehm-style GC. It will spill things that could be pointers to the C stack, so that they can be seen by conservative garbage collection.
| The spills add code size and runtime overhead, but actually less than I thought: 10% slower (smaller than the difference between VMs), 15% gzip size larger. We can do even better with more optimizations for this, like a dead store elimination pass.
| This PR does the following:
|  * Add the new pass.
|  * Create an abi/ dir, with info about the pointer size and stack manipulation utilities.
|  * Separates out the liveness analysis from CoalesceLocals, so that other passes can use it (like SpillPointers).
|  * Refactor out the SortedVector class from the liveness analysis to a separate file (just seems nicer that way).
* merge-locals pass (#1334) | Alon Zakai | 2017-12-17 | 1 | -0/+5
| This optimizes the situation described in #1331. Namely, when x is copied into y, then on subsequent gets of x we could use y instead, and vice versa, as their value is equal. Specifically, this seems to get rid of the definite overlap in the live ranges of x and y, as removing it allows coalesce-locals to merge them. The pass therefore does nothing if the live range of y ends there anyhow.
| The danger here is that we may extend the live range so that it causes more conflicts with other things, so this is a heuristic, but I've tested it on every codebase I can find and it always produces a net win; on one I even saw a 0.4% reduction of code size, which surprised me.
| This is a fairly slow pass, because it uses LocalGraph which isn't much optimized. This PR includes a minor optimization for it, but we should rewrite it. Meanwhile this is just enabled in -O3 and -Oz.
| This PR also includes some fuzzing improvements, to better test stuff like this.
* Restrict validation output to just validation errors in the API (#1253) | Daniel Wirtz | 2017-11-01 | 1 | -0/+2
| Do not print the entire and possibly very large module when validation fails. Leave printing to tools using the validator, instead of always doing it in the validator where it can't be overridden.
* Add Features enum to IR (#1250) | Derek Schuff | 2017-10-27 | 1 | -2/+2
| This enum describes which wasm features the IR is expected to include. The validator should reject operations which require excluded features, and passes should avoid producing IR which requires excluded features. This makes it easier to catch possible errors in Binaryen producers (e.g. emscripten).
| Asm2wasm has a flag to enable or disable atomics. Other tools currently just accept all features (as, dis and opt are just for inspecting or modifying existing modules, so it would be annoying to have to use flags with those tools and I expect the risk of accidentally introducing atomics to be low).
* Flattening rewrite (#1201) | Alon Zakai | 2017-10-03 | 1 | -1/+1
| Rename flatten-control-flow to flatten, which now flattens everything, not just control flow, so e.g.
|
|   (i32.add
|     (call $x)
|     (call $y)
|   )
|   ==>
|   (block
|     (set_local $temp_x (call $x))
|     (set_local $temp_y (call $y))
|     (i32.add
|       (get_local $temp_x)
|       (get_local $temp_y)
|     )
|   )
|
| This uses more locals than before, but is much simpler and avoids a bunch of corner cases and fuzz bugs the old one hit. We can optimize later if necessary.
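A minimal sketch of this kind of flattening, assuming a toy representation where expressions are nested tuples such as `("i32.add", ("call", "$x"), ("call", "$y"))`. The function names and the `$temp_N` naming are hypothetical, not Binaryen's API, and the sketch hoists every nested child without trying to be clever about which ones need it.

```python
import itertools


def flatten_children(expr, sets, counter):
    """Return expr with each nested child replaced by a get of a fresh temp.

    The set_local for each hoisted child is appended to `sets` in
    evaluation order (left to right, innermost first).
    """
    if not isinstance(expr, tuple):
        return expr
    op, *operands = expr
    new_operands = []
    for child in operands:
        if isinstance(child, tuple):
            inner = flatten_children(child, sets, counter)
            temp = f"$temp_{next(counter)}"
            sets.append(("set_local", temp, inner))
            new_operands.append(("get_local", temp))
        else:
            new_operands.append(child)  # leaf such as a function name
    return (op, *new_operands)


def flatten(expr):
    """Wrap the flattened expression and its temp sets in a block."""
    sets = []
    root = flatten_children(expr, sets, itertools.count())
    return ("block", *sets, root)


result = flatten(("i32.add", ("call", "$x"), ("call", "$y")))
```

Here `result` is a block holding two `set_local`s followed by an `i32.add` of the two temps, which matches the shape of the example in the commit message.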
* Refactor validator API to use enums (#1209) | Alon Zakai | 2017-10-03 | 1 | -2/+6
|  * refactor validator API to use enums
* Share trap mode between asm2wasm and s2wasm (#1168) | jgravelle-google | 2017-10-02 | 1 | -0/+2
|  * Extract Asm2WasmBuilder::TrapMode to shared FloatTrapMode
|  * Extract makeTrappingI32Binary
|  * Extract makeTrappingI64Binary
|  * Extract asm2wasm test script into scripts/test/asm2wasm.py
|    This matches s2wasm.py, and makes iterating on asm2wasm slightly faster.
|  * Simplify callsites with an arg struct
|  * Combine func adding across i32 and i64
|  * Support f32-to-int in asm2wasm
|  * Add BinaryenTrapMode pass, run pass from s2wasm
|  * BinaryenTrapMode pass takes trap context as a parameter
|  * Pass fully supports non-trapping binary ops
|  * Defer adding functions until after iteration (hackily)
|  * Update asm2wasm to work with deferred function adding, rebuild tests
|  * Extract makeTrappingFloatToInt32
|  * Extract makeTrappingFloatToInt64
|  * Add unary conversions to trap pass
|  * Add functions in the pass itself
|  * Set s2wasm trap mode with command-line arguments
|  * Print BINARYEN_PASS_DEBUG state when testing
|  * Get asm2wasm using the BinaryenTrapMode pass instead of handling it inline
|  * Also handle f32 to int in asm2wasm
|  * Make BinaryenTrapMode only need a FloatTrapMode from the caller
|  * Just pass the current binary Expression directly
|  * Combine makeTrappingI32Binary with makeTrappingI64Binary
|  * Pass Unary expr to makeTrappingFloatToInt32
|  * Unify makeTrappingFloatToInt32 & 64
|  * Move makeTrapping* functions inside BinaryenTrapMode, make addedFunctions non-static
|  * Remove FloatTrapContext
|  * Minor cleanups
|  * Extract some smaller subfunctions
|  * Emit name switch/casing, rename is32Bit to isI64 for consistency
|  * Rename BinaryenTrapMode to FloatTrap, make trap mode a nested enum
|  * Add some comments explaining why FloatTrap is non-parallel
|  * Rename addedFunctions to generatedFunctions for precision
|  * Rename move and split float-clamp.h to passes/FloatTrap.(h|cpp)
|  * Use builder instead of allocator
|  * Instantiate trap handling passes via the pass manager
|  * Move passes/FloatTrap.h to ast/trapping.h
|  * Add helper function to add trap-handling passes
|  * Add trap mode pass tests
|  * Rename FloatTrap.cpp to TrapMode.cpp
|  * Add s2wasm trap mode tests. Force float->int conversion to be signed
|  * Add trapping_sint_div_s test to unit.asm.js
|  * Fix flake8 issues with test scripts
|  * Update pass description comment
|  * Extract building functions methods
|  * Make generate functions into top-level functions
|  * Add GeneratedTrappingFunctions class to manage function/import additions
|  * Move ensure/makeTrapping functions outside class scope
|  * Use GeneratedTrappingFunctions to add immediately in asm2wasm mode
|  * Remove trapping_sint_div_s test
|    We only added it to test that trapping divisions would get constant-folded at the correct time. Now that we're not changing the timing of trapping modes, the test is unneeded (and problematic).
|  * Review feedback, add validator/*.wasm to .gitignore
|  * Add support for unsigned float-to-int conversion
|  * Use opcode directly instead of bools
|  * Update s2wasm clamp test for unsigned ftoi
* precompute-propagate pass (#1179) | Alon Zakai | 2017-09-12 | 1 | -1/+7
| Implements #1172: this adds a variant of precompute, "precompute-propagate", which also does constant propagation. Precompute by itself just runs the interpreter on each expression and sees if it is in fact a constant; precompute-propagate also looks at the graph of connections between get and set locals, and propagates those constant values.
| This helps with cases as noticed in #1168 - while in most cases LLVM will do this already, it's important when inlining, e.g. inlining of the clamping math functions. This new pass is run when inlining, and otherwise only in -O3/-Oz, as it does increase compilation time noticeably if run on everything (and for almost no benefit if LLVM has run).
| Most of the code here is just refactoring out from the ssa pass the get/set graph computation, so it can now be used by both the ssa pass and precompute-propagate.
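The difference between plain precomputing and propagation can be illustrated with a toy model. Everything here is an assumption made for illustration: the `(local, expr)` instruction format, the `OPS` table, and the function names are hypothetical; the code is straight-line only, and Python integers are used without 32-bit wrapping. The real pass uses the get/set graph described above.

```python
import operator

OPS = {"add": operator.add, "mul": operator.mul}


def evaluate(expr, constants):
    """Return the constant value of expr, or None if it is not known."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "get":
        return constants.get(expr[1])  # propagate a known constant, if any
    if kind in OPS:
        left = evaluate(expr[1], constants)
        right = evaluate(expr[2], constants)
        if left is None or right is None:
            return None
        return OPS[kind](left, right)
    return None  # calls and anything else are treated as opaque


def precompute_propagate(instrs):
    """Fold each 'local = expr' to a constant when its inputs are known."""
    constants = {}
    out = []
    for local, expr in instrs:
        value = evaluate(expr, constants)
        if value is None:
            constants.pop(local, None)  # local no longer holds a known constant
            out.append((local, expr))
        else:
            constants[local] = value
            out.append((local, ("const", value)))
    return out


program = [
    ("a", ("const", 2)),
    ("b", ("add", ("get", "a"), ("const", 3))),
    ("c", ("call", "f")),
    ("d", ("mul", ("get", "b"), ("get", "c"))),
]
folded = precompute_propagate(program)
```

In this example `b` folds to a constant only because the value of `a` is propagated through the `get`; `d` stays as it is because `c` comes from a call.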
* Const hoisting (#1176) | Alon Zakai | 2017-09-12 | 1 | -0/+1
| A pass that hoists repeating constants to a local, and replaces their uses with a get of that local. This can reduce binary size, but can also *increase* gzip size, so it's mostly for experimentation and not used by default.
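As a rough sketch of the transformation, assuming nested-tuple expressions and a simple use-count threshold. The name `hoist_constants`, the `min_uses` parameter, and the `$const_N` local names are hypothetical; the real pass presumably decides what to hoist based on encoded size rather than a plain count.

```python
from collections import Counter


def hoist_constants(exprs, min_uses=2):
    """Move constants used at least `min_uses` times into locals."""
    counts = Counter()

    def count(e):
        if isinstance(e, tuple):
            if e[0] == "const":
                counts[e[1]] += 1
            else:
                for child in e[1:]:
                    count(child)

    for e in exprs:
        count(e)

    # Assign one local per repeated constant, in first-seen order.
    hoisted = {}
    for value, uses in counts.items():
        if uses >= min_uses:
            hoisted[value] = f"$const_{len(hoisted)}"

    def rewrite(e):
        if not isinstance(e, tuple):
            return e
        if e[0] == "const":
            return ("get_local", hoisted[e[1]]) if e[1] in hoisted else e
        return (e[0], *(rewrite(child) for child in e[1:]))

    prologue = [("set_local", name, ("const", value))
                for value, name in hoisted.items()]
    return prologue + [rewrite(e) for e in exprs]


body = [
    ("call", "$f", ("const", 1000000)),
    ("i32.add", ("const", 1000000), ("const", 1)),
]
hoisted_body = hoist_constants(body)
```

The repeated constant gets a single `set_local` at the top and both uses become `get_local`s, while the constant used once is left alone.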
* i64 to i32 lowering for wasm2asm (#1134) | Thomas Lively | 2017-09-01 | 1 | -0/+1
|
* Safe heap pass (#1145) | Alon Zakai | 2017-08-28 | 1 | -0/+1
| Adds --safe-heap which instruments the code to check heap loads and stores for validity (null pointer derefs, within range of valid sbrk memory, and alignment). Used in SAFE_HEAP in emscripten.
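The three checks can be pictured as a wrapper around each load. This is a hedged illustration only: `checked_load`, `HeapError`, and the `heap_end` parameter (standing in for the current sbrk limit) are hypothetical names, and the real instrumentation is emitted as wasm code rather than written like this.

```python
class HeapError(Exception):
    """Raised when a toy heap access fails one of the safety checks."""


def checked_load(memory, address, size, align, heap_end):
    """Load `size` little-endian bytes after null, range and alignment checks."""
    if address == 0:
        raise HeapError("null pointer dereference")
    if address + size > heap_end:
        raise HeapError("access beyond valid memory")
    if address % align != 0:
        raise HeapError("unaligned access")
    return int.from_bytes(memory[address:address + size], "little")


memory = bytearray(64)
memory[8:12] = (42).to_bytes(4, "little")
value = checked_load(memory, 8, size=4, align=4, heap_end=64)
```

A store would be guarded in the same way before writing to memory.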
* Inline many (#1125) | Alon Zakai | 2017-08-22 | 1 | -2/+2
|  * Improve inlining pass to inline single-use functions that are fairly small, which makes it useful for removing unnecessary global constructors from clang.
|  * Add an inlining-optimizing pass that also optimizes where it inlined, as new opportunities arise. Enable it by default in -O2+.
|  * In addition, in -O3+ also inline small functions with multiple uses. This helps a lot with things like safe-int-divide functions (where each int divide is replaced by a safe divide that won't trap). Inlining gets rid of around half of the overhead there.
* Improve and enable inlining pass (#966) | Alon Zakai | 2017-08-07 | 1 | -5/+11
|  * improve inlining pass to inline single-use functions that are fairly small, which makes it useful for removing unnecessary global constructors from clang. add an inlining-optimizing pass that also optimizes where it inlined, as new opportunities arise. enable it by default in -O2+
|  * fix a bug where we didn't run all passes properly - refactor addDefaultGlobalOptimizationPasses() into a pre and post version. we can only run the post version in incremental optimizing builds (functions appear one by one, we optimize them first, and do global stuff when all are done), but can run both when doing a full optimize
|  * copy in inlining, allowing multiple inlinings of the same function in the future
* Code folding (#1076) | Alon Zakai | 2017-06-28 | 1 | -2/+5
| Adds a pass that folds code, i.e. merges it when possible. See details in comment in the pass implementation cpp.
| This is enabled by default in -Os and -Oz. Seems risky to enable anywhere else, as it does add branches - likely predictable ones so maybe no slowdown, but still some risk.
| Code size numbers:
|
|   wasm-backend:                   196331
|   + binaryen -Os (before):        182598
|   + binaryen -Os (with folding):  181943
|   asm2wasm -Os (before):          172463
|   asm2wasm -Os (with folding):    168774
|
| So this reduces wasm-backend output by an additional 0.5% beyond what it could before. The gain is small mainly because the wasm backend already has code folding, whereas on asm2wasm output, where we didn't have folding before, this saves over 2%. The 0.5% improvement on the wasm backend's output might be because this can fold more types of code than LLVM can (it can fold nested control flow, in particular).
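One simple instance of folding is merging an identical tail shared by the two arms of an if. The sketch below assumes instruction lists of plain strings and a hypothetical function name `fold_if_tails`; the commit points to the pass source for what the real pass does, which covers more cases than this.

```python
def fold_if_tails(then_body, else_body):
    """Split off the longest identical tail shared by both if arms.

    Returns (new_then, new_else, shared_tail); the shared tail can be
    emitted once after the if instead of once per arm.
    """
    then_body, else_body = list(then_body), list(else_body)  # keep inputs intact
    shared = []
    while then_body and else_body and then_body[-1] == else_body[-1]:
        shared.append(then_body.pop())
        else_body.pop()
    shared.reverse()
    return then_body, else_body, shared


then_arm = ["call $a", "call $log", "call $cleanup"]
else_arm = ["call $b", "call $log", "call $cleanup"]
new_then, new_else, tail = fold_if_tails(then_arm, else_arm)
```

After folding, each arm keeps only its distinct prefix and the two shared calls appear once.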
* Untee pass (#1053) | Alon Zakai | 2017-06-14 | 1 | -1/+2
|
* SSA pass (#1049) | Alon Zakai | 2017-06-13 | 1 | -0/+2
|  * Add SSA pass which ensures a single assign for each local, except for merged locals where we ensure exactly a single assign from one of the paths leading to that use
|  * Also add InstrumentLocals pass, useful for debugging locals (similar to InstrumentMemory but for locals)
|  * Fix a PickLoadSigns bug with tees not being ignored, which was not noticed until now because we ran it on flatter output by default, but the ssa pass uncovered the bug
* Validate finalization (#1014) | Alon Zakai | 2017-05-18 | 1 | -6/+7
|  * validate that types are properly finalized, when in pass-debug mode (BINARYEN_PASS_DEBUG env var): check after each pass is run that the type of each node is equal to the proper type (when finalizing it, i.e., fully recomputing the type).
|  * fix many fuzz bugs found by that.
|  * in particular, fix dce bugs with type changes not being fully updated during code removal. add a new TypeUpdater helper class that lets a pass update types efficiently, by the helper tracking deps between blocks and branches etc., and updating/propagating type changes only as necessary.
* Re-reloop pass (#1009) | Alon Zakai | 2017-05-16 | 1 | -11/+4
| This adds a pass that converts to a CFG, runs the relooper, and re-generates wasm from that. This depends on flatten-control-flow being run before.
| The main goal here is to help code generators other than asm2wasm (which already receives relooped code from fastcomp).
* merge blocks before and after remove-unused-brs | Alon Zakai (kripken) | 2017-05-10 | 1 | -1/+2
|
* Flatten control flow pass (#999) | Alon Zakai | 2017-05-10 | 1 | -0/+1
| This pass flattens out control flow in order to achieve 2 properties:
|  * Control flow structures (block, loop, if) and control flow operations (br, br_if, br_table, return, unreachable) may only be block children, a loop body, or an if-true or if-false. (I.e., they cannot be nested inside an i32.add, a drop, a call, an if-condition, etc.)
|  * Disallow block, loop, and if return values, i.e., do not use control flow to pass around values.
| As a result, expressions cannot contain control flow, and overall control flow is simpler, more structured, and more "flat".
| This should make things like re-relooping wasm code much easier, as they can run after the cfg is flattened.
* Add pass to instrument loads / stores. (#959) | Michael Bebenita | 2017-04-29 | 1 | -0/+1
|  * Add pass to instrument loads / stores
|  * Simplify instrumentation.
|  * Document.
* Preserve debug info through the optimizer (#981) | Alon Zakai | 2017-04-28 | 1 | -1/+21
|  * add debugInfo option to passes, and use it to keep debug info alive through optimizations when we need it
|  * add fib testcase for debug info
|  * when preserving debug info, do not move code around call-imports, so debug info intrinsics remain stationary
|  * improve wasm-module-building handling of the single-threaded case: don't create workers, which is more efficient and also nicer for debugging
|  * process debug info in a more precise way, reordering it from being after the node (as it was a comment in JS) to before the node
|  * remove unreachable hack for debug info, which is no longer needed since we reorder them, and make sure to finalize blocks in which we reorder
* add a pass to log execution traces via instrumenting the code (#950) | Alon Zakai | 2017-03-16 | 1 | -0/+1
|
* Local CSE (#930) | Alon Zakai | 2017-03-08 | 1 | -0/+5
| Simple local common subexpression elimination. Useful mostly to reduce code size (as VMs do GVN etc.). Enabled by default in -Oz.
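A minimal sketch of local CSE inside one basic block, assuming a toy three-address form where each instruction is `(dest, (op, operand, ...))` and every op is pure. The function name `local_cse` and the `copy` pseudo-op are hypothetical; the real pass works on Binaryen's expression tree.

```python
def local_cse(instrs):
    """Reuse an earlier result when the same pure expression is recomputed."""
    available = {}  # expr -> local that currently holds its value
    out = []
    for dest, expr in instrs:
        if expr in available:
            out.append((dest, ("copy", available[expr])))
        else:
            out.append((dest, expr))
        # Writing dest invalidates expressions that read it or are held in it.
        available = {e: holder for e, holder in available.items()
                     if holder != dest and dest not in e[1:]}
        # Record the expression unless it reads the local it just overwrote.
        if expr not in available and dest not in expr[1:]:
            available[expr] = dest
    return out


block = [
    ("t0", ("add", "a", "b")),
    ("t1", ("add", "a", "b")),   # same value as t0
    ("a", ("mul", "t0", "t1")),  # overwrites a, so add(a, b) is stale
    ("t2", ("add", "a", "b")),   # must be recomputed
]
deduplicated = local_cse(block)
```

Only the second `add` is replaced by a copy; the third is kept because `a` changed in between.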
* fix BINARYEN_PASS_DEBUG option (#908) | Alon Zakai | 2017-02-23 | 1 | -5/+9
|  * fix BINARYEN_PASS_DEBUG option
|  * Add isNested property to passRunner
* finish PickLoadSigns pass | Alon Zakai | 2017-02-16 | 1 | -0/+3
|
* refactor sign/zero extension code into nice headers, and prepare PickLoadSigns pass | Alon Zakai | 2017-02-16 | 1 | -0/+1
|
* add a RemoveUnusedModuleElements pass, and make LegalizeJSInterface create TempRet0 if needed (otherwise we might remove it before we use it) | Alon Zakai | 2016-12-07 | 1 | -3/+3
|
* improve local simplification: simplify without if/block structure values before coalesce, so that coalesce can remove all copies, then do another pass of full simplification after it | Alon Zakai | 2016-11-06 | 1 | -1/+2
|
* add variants of simplify-locals with and without teeing and structural opts | Alon Zakai | 2016-11-05 | 1 | -0/+3
|
* remove-unused-brs after coalesce-locals | Alon Zakai | 2016-11-04 | 1 | -1/+2
|
* add a pass to optimize memory segments, and pack memory in asm2wasm | Alon Zakai | 2016-11-01 | 1 | -0/+3
|
* add an inlining pass (#814) | Alon Zakai | 2016-10-29 | 1 | -0/+1
|
* Code pushing (#807) | Alon Zakai | 2016-10-26 | 1 | -0/+4
| Push code forward, potentially letting it not execute
* Adds a pass to print call graphs in .dot (graphviz) format. (#794) | Michael Bebenita | 2016-10-20 | 1 | -0/+1
|
* Pass options (#788) | Alon Zakai | 2016-10-18 | 1 | -4/+4
|  * add PassOptions structure, and use it for new -Os param to wasm-opt
* Use steady_clock to measure code execution time (#776) | Loo Rong Jie | 2016-10-17 | 1 | -3/+2
|
* reorder locals after simplify-locals, to remove unused locals before coalesce-locals, making it much faster (#783) | Alon Zakai | 2016-10-16 | 1 | -0/+1
|
* run remove-unused-functions by default | Alon Zakai | 2016-10-14 | 1 | -0/+2
|
* reuse code in add*Passes | Alon Zakai | 2016-10-14 | 1 | -15/+1
|
* put heavy pass debugging operations behind BINARYEN_PASS_DEBUG (#755) | Alon Zakai | 2016-10-11 | 1 | -2/+9
|
* passRunner debug and validation improvements (#726) | Alon Zakai | 2016-10-02 | 1 | -6/+17
|
* asm2wasm i64 support (#723) | Alon Zakai | 2016-09-30 | 1 | -0/+1
|  * support i64 intrinsics from fastcomp, adding --wasm-only flag
|  * refactor callImport logic in asm2wasm to avoid recomputing wasm types again
|  * legalize illegal i64 params in exports and imports
|  * do safe i64 binary ops depending on precision
|  * fix addVar, only assert on names if we are using a name
* add ExtractFunction pass | Alon Zakai | 2016-09-13 | 1 | -0/+1
|
* thread relooper jumps | Alon Zakai | 2016-09-12 | 1 | -0/+1
|
* refactor pass hooks, creating a proper way to run code before a pass is run | Alon Zakai | 2016-09-12 | 1 | -0/+5
|
* validate in debug mode in passRunner | Alon Zakai | 2016-09-11 | 1 | -5/+14
|
* autodrop must be run before we optimize in asm2wasm, as otherwise its input is not yet valid | Alon Zakai | 2016-09-07 | 1 | -0/+3
| then after finalizeCalls, we must autodrop again to drop things that finalizeCalls changed
* remove lower-if-else, as it's no longer needed | Alon Zakai | 2016-09-07 | 1 | -1/+0
|