summaryrefslogtreecommitdiff
path: root/src/pass.h
Commit message (Collapse)AuthorAgeFilesLines
* Add string parameter to WASM_UNREACHABLE (#2499)Sam Clegg2019-12-051-3/+5
| | | | | This works more like llvm's unreachable handler in that is preserves information even in release builds.
* Remove FunctionType from Event (#2466)Thomas Lively2019-11-251-1/+1
| | | | | | | | | This is the start of a larger refactoring to remove FunctionType entirely and store types and signatures directly on the entities that use them. This PR updates BrOnExn and Events to remove their use of FunctionType and makes the BinaryWriter traverse the module and collect types rather than using the global FunctionType list. While we are collecting types, we also sort them by frequency as an optimization. Remaining uses of FunctionType in Function, CallIndirect, and parsing will be removed in a future PR.
* Support --pass-arg in ToolOptions. (#2429)Alon Zakai2019-11-111-0/+1
| | | | | | This will allow us to pass pass args to wasm-emscripten-finalize, which runs legalize-js-interface internally, which recently added an optional argument.
* Simpify PassRunner.add() and automatically parallelize parallel functions ↵Alon Zakai2019-07-191-8/+19
| | | | | | | | | (#2242) Main change here is in pass.h, everything else is changes to work with the new API. The add("name") remains as before, while the weird variadic add(..) which constructed the pass now just gets a std::unique_ptr of a pass. This also makes the memory management internally fully automatic. And it makes it trivial to parallelize WalkerPass::run on parallel passes. As a benefit, this allows removing a lot of code since in many cases there is no need to create a new pass runner, and running a pass can be just a single line.
* Allows multiple arguments to be passed to PassRunner::add<T>() (#2208)Ryoga2019-07-091-3/+3
| | | | | | | | | | | | | | | | | | | | | | struct FooPass : public wasm::Pass { FooPass(int a, int b); }; PassRunner runner {module}; runner.add<FooPass>(1, 2); // To allow this This change avoids unnecessary copying and allows us to pass the reference without reference_wrapper. struct BarPass : public wasm::Pass { BarPass(std::ostream& s); }; runner.add<BarPass>(std::cout); // Error (cout is uncopyable) runner.add<BarPass>(std::ref(std::cout)); // OK ↓ runner.add<BarPass>(std::cout); // OK (passed by reference) runner.add<BarPass>(std::ref(std::cout)); // OK
* Bysyncify: async transform for wasm (#2172)Alon Zakai2019-06-151-0/+7
| | | | | | | | | This adds a new pass, Bysyncify, which transforms code to allow unwind and rewinding the call stack and local state. This allows things like coroutines, turning synchronous code asynchronous, etc. The new pass file itself has a large comment on top with docs. So far the tests here seem to show this works, but this hasn't been tested heavily yet. My next step is to hook this up to emscripten as a replacement for asyncify/emterpreter, see emscripten-core/emscripten#8561 Note that this is completely usable by itself, so it could be useful for any language that needs coroutines etc., and not just ones using LLVM and/or emscripten. See docs on the ABI in the pass source.
* Inlining: exposed inlining thresholds as command-line parameters. (#2125)Wouter van Oortmerssen2019-05-231-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | * Inlining: exposed inlining thresholds as command-line parameters. This will allow easier experimentation with optimal settings. Also tweaked the default logic slightly to always inline single caller functions up to a certain size. The command-line arguments were tested to have the desired effect for example by the Makefile change in this commit: https://github.com/aardappel/lobster/commit/39ae393e27ff363ab095bbb26c90d6fe17570104 which in turn relies on: https://github.com/emscripten-core/emscripten/pull/8635 * Grouped inlining options & reverted defaults. Now uses same defaults for inlining as before for the sake of not having to redo a lot of tests. Added FIXME to indicate that the current inlining logic needs fixing. * Fixed default values now pulled from code. * clang-format
* clang-tidy braces changes (#2075)Alon Zakai2019-05-011-1/+2
| | | Applies the changes in #2065, and temprarily disables the hook since it's too slow to run on a change this large. We should re-enable it in a later commit.
* Apply format changes from #2048 (#2059)Alon Zakai2019-04-261-58/+48
| | | Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
* Move features from passOptions to Module (#2001)Thomas Lively2019-04-121-5/+0
| | | | | This allows us to emit a (potentially modified) target features section and conditionally emit other sections such as the DataCount section based on the presence of features.
* Add a mechanism to pass arguments to passes (#1941)Alon Zakai2019-04-031-0/+9
| | | | | | | | | This allows wasm-opt --pass-arg=KEY:VALUE where KEY and VALUE are strings. It is then added to passOptions.arguments, where passes can read it. This is used in ExtractFunction instead of an env var.
* Consistently optimize small added constants into load/store offsets (#1924)Alon Zakai2019-03-011-8/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | See #1919 - we did not do this consistently before. This adds a lowMemoryUnused option to PassOptions. It can be passed on the commandline with --low-memory-unused. If enabled, we run the new optimize-added-constants pass, which does the real work here, replacing older code in post-emscripten. Aside from running at the proper time (unlike the old pass, see #1919), this also has a -propagate mode, which can do stuff like this: y = x + 10 [..] load(y) [..] load(y) => y = x + 10 [..] load(x, offset=10) [..] load(x, offset=10) That is, it can propagate such offsets to the loads/stores. This pattern is common in big interpreter loops, where the pointers are offsets into a big struct of state. The pass does this propagation by using a new feature of LocalGraph, which can verify which locals are in SSA mode. Binaryen IR is not SSA (intentionally, since it's a later IR), but if a local only has a single set for all gets, that means that local is in such a state, and can be optimized. The tricky thing is that all locals are initialized to zero, so there are at minimum two sets. But if we verify that the real set dominates all the gets, then the zero initialization cannot reach them, and we are safe. This PR also makes safe-heap aware of lowMemoryUnused. If so, we check for not just an access of 0, but the range 0-1023. This makes zlib 5% faster, with either the wasm backend or asm2wasm. It also makes it 0.5% smaller. Also helps sqlite (1.5% faster) and lua (1% faster)
* Code style improvements (#1868)Alon Zakai2019-01-151-3/+3
| | | | * Use modern T p = v; notation to initialize class fields * Use modern X() = default; notation for empty class constructors
* Feature options (#1797)Thomas Lively2018-12-031-1/+1
| | | | Add feature flags and struct interface. Default feature set has all feature enabled.
* Use `= default` rather than {} for empty destructors (#1794)Sam Clegg2018-12-031-1/+1
|
* standardize on 'template<' over 'template <' (i.e., remove a space) (#1782)Alon Zakai2018-11-291-1/+1
|
* Run legalize-js-interface during wasm-emscripten-finalize (#1653)Sam Clegg2018-08-291-3/+6
| | | | This ensures that 64-bit values are correctly handled on the JS boundary.
* Stack IR (#1623)Alon Zakai2018-07-301-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a new IR, "Stack IR". This represents wasm at a very low level, as a simple stream of instructions, basically the same as wasm's binary format. This is unlike Binaryen IR which is structured and in a tree format. This gives some small wins on binary sizes, less than 1% in most cases, usually 0.25-0.50% or so. That's not much by itself, but looking forward this prepares us for multi-value, which we really need an IR like this to be able to optimize well. Also, it's possible there is more we can do already - currently there are just a few stack IR optimizations implemented, DCE local2stack - check if a set_local/get_local pair can be removed, which keeps the set's value on the stack, which if the stars align it can be popped instead of the get. Block removal - remove any blocks with no branches, as they are valid in wasm binary format. Implementation-wise, the IR is defined in wasm-stack.h. A new StackInst is defined, representing a single instruction. Most are simple reflections of Binaryen IR (an add, a load, etc.), and just pointers to them. Control flow constructs are expanded into multiple instructions, like a block turns into a block begin and end, and we may also emit extra unreachables to handle the fact Binaryen IR has unreachable blocks/ifs/loops but wasm does not. Overall, all the Binaryen IR differences with wasm vanish on the way to stack IR. Where this IR lives: Each Function now has a unique_ptr to stack IR, that is, a function may have stack IR alongside the main IR. If the stack IR is present, we write it out during binary writing; if not, we do the same binaryen IR => wasm binary process as before (this PR should not affect speed there). This design lets us use normal Passes on stack IR, in particular this PR defines 3 passes: Generate stack IR Optimize stack IR (might be worth splitting out into separate passes eventually) Print stack IR for debugging purposes Having these as normal passes is convenient as then they can run in parallel across functions and all the other conveniences of our current Pass system. However, a downside of keeping the second IR as an option on Functions, and using normal Passes to operate on it, means that we may get out of sync: if you generate stack IR, then modify binaryen IR, then the stack IR may no longer be valid (for example, maybe you removed locals or modified instructions in place etc.). To avoid that, Passes now define if they modify Binaryen IR or not; if they do, we throw away the stack IR. Miscellaneous notes: Just writing Stack IR, then writing to binary - no optimizations - is 20% slower than going directly to binary, which is one reason why we still support direct writing. This does lead to some "fun" C++ template code to make that convenient: there is a single StackWriter class, templated over the "mode", which is either Binaryen2Binary (direct writing), Binaryen2Stack, or Stack2Binary. This avoids a lot of boilerplate as the 3 modes share a lot of code in overlapping ways. Stack IR does not support source maps / debug info. We just don't use that IR if debug info is present. A tiny text format comment (if emitting non-minified text) indicates stack IR is present, if it is ((; has Stack IR ;)). This may help with debugging, just in case people forget. There is also a pass to print out the stack IR for debug purposes, as mentioned above. The sieve binaryen.js test was actually not validating all along - these new opts broke it in a more noticeable manner. Fixed. Added extra checks in pass-debug mode, to verify that if stack IR should have been thrown out, it was. This should help avoid any confusion with the IR being invalid. Added a comment about the possible future of stack IR as the main IR, depending on optimization results, following some discussion earlier today.
* Clarify what function-parallel passes can do, and fix an asm2wasm bug (#1627)Alon Zakai2018-07-231-0/+5
| | | | | The problem this fixes is that we made precompute look at globals in #1622, while asm2wasm was creating globals while adding functions and optimizing them - which could race. This was caught by threadSanitizer (with low frequency, so we missed it on the initial landing). The underlying issue is that function-parallel passes should be able to read global state, just not modify it, and not read other functions' contents (which is why the Call node has a name, not a pointer to a function). This PR clarifies that in the docs, and fixes asm2wasm by not handling function bodies in parallel to creating globals.
* Clean up printing code (#1548)Alon Zakai2018-05-151-16/+0
| | | | | * make the iostream overrides receive a reference, not a pointer (i.e., like e.g. LLVM IR printing works, and avoiding overriding printing of pointer addresses which is sort of odd) * move more code out of headers, especially unrelated headers.
* add a --no-validation option to the commandline tools. disabling validation ↵Alon Zakai2018-04-091-0/+1
| | | | makes loading large wasm files more than twice as fast (#1496)
* Refactor optimization defaults (#1366)Alon Zakai2018-01-171-0/+12
| | | | | Followup to #1357. This moves the optimization settings into pass.h, and uses it from there in the various places. This also splits up huge lines from the tracing code, which put all block children (whose number can be arbitrarily large) on one line. This seems to have caused random errors on the bots, I suspect from overflowing a buffer. Anyhow, it's much more clear to split the lines at a reasonable length.
* runFunction => runOnFunction (we run on the function, not run the function) ↵Alon Zakai2018-01-101-3/+3
| | | | (#1356)
* Add Features enum to IR (#1250)Derek Schuff2017-10-271-0/+4
| | | | | | | | | | | | This enum describes which wasm features the IR is expected to include. The validator should reject operations which require excluded features, and passes should avoid producing IR which requires excluded features. This makes it easier to catch possible errors in Binaryen producers (e.g. emscripten). Asm2wasm has a flag to enable or disable atomics. Other tools currently just accept all features (as, dis and opt are just for inspecting or modifying existing modules, so it would be annoying to have to use flags with those tools and I expect the risk of accidentally introducing atomics to be low).
* Avoid returning a PassRunner just for OptimizationOptions (#1234)Alon Zakai2017-10-201-0/+4
| | | | | | | | * avoid returning a PassRunner just for OptimizationOptions, it would need a more careful design with a copy constructor. instead, just simplify the API to do the thing we need, which is run the passes * disallow copy constructor * delete copy operator too
* Add a superclass typedef to WalkerPass to simplify overrides (#1211)jgravelle-google2017-10-041-0/+3
|
* Improve and enable inlining pass (#966)Alon Zakai2017-08-071-2/+8
| | | | | | | | * improve inlining pass to inline single-use functions that are fairly small, which makes it useful for removing unnecessary global constructors from clang. add an inlining-optimizing pass that also optimizes where it inlined, as new opportunities arise. enable that it by default in O2+ * fix a bug where we didn't run all passes properly - refactor addDefaultGlobalOptimizationPasses() into a pre and post version. we can only run the post version in incremental optimizing builds (functions appear one by one, we optimize them first, and do global stuff when all are done), but can run both when doing a full optimize * copy in inlining, allowing multiple inlinings of the same function in the future
* Code folding (#1076)Alon Zakai2017-06-281-23/+0
| | | | | | | | | | | | | | | | Adds a pass that folds code, i.e. merges it when possible. See details in comment in the pass implementation cpp. This is enabled by default in -Os and -Oz. Seems risky to enable anywhere else, as it does add branches - likely predictable ones so maybe no slowdown, but still some risk. Code size numbers: wasm-backend: 196331 + binaryen -Os (before): 182598 + binaryen -Os (with folding): 181943 asm2wasm -Os (before): 172463 asm2wasm -Os (with folding): 168774 So this reduces wasm-backend output by an additional 0.5% than it could before. Mainly this is because the wasm backend already has code folding, whereas on asm2wasm output, where we didn't have folding before, this saves over 2%. The 0.5% improvement on the wasm backend's output might be because this can fold more types of code than LLVM can (it can fold nested control flow, in particular).
* Validate finalization (#1014)Alon Zakai2017-05-181-0/+10
| | | | | | | * validate that types are properly finalized, when in pass-debug mode (BINARYEN_PASS_DEBUG env var): check after each pass is run that the type of each node is equal to the proper type (when finalizing it, i.e., fully recomputing the type). * fix many fuzz bugs found by that. * in particular, fix dce bugs with type changes not being fully updated during code removal. add a new TypeUpdater helper class that lets a pass update types efficiently, by the helper tracking deps between blocks and branches etc., and updating/propagating type changes only as necessary.
* Re-reloop pass (#1009)Alon Zakai2017-05-161-3/+6
| | | | | This adds a pass that converts to a CFG, runs the relooper, and re-generates wasm from that. This depends on flatten-control-flow being run before. The main goal here is to help code generators other than asm2wasm (which already receives relooped code from fastcomp).
* Preserve debug info through the optimizer (#981)Alon Zakai2017-04-281-0/+1
| | | | | | | | | | | | | | * add debugInfo option to passes, and use it to keep debug info alive through optimizations when we need it * add fib testcase for debug info * when preserving debug info, do not move code around call-imports, so debug info intrinsics remain stationary * improve wasm-module-building handling of the single-threaded case: don't create workers, which is more efficient and also nicer for debugging * process debug info in a more precise way, reordering it from being after the node (as it was a comment in JS) to before the node * remove unreachable hack for debug info, which is no longer needed since we reorder them, and make sure to finalize blocks in which we reorder
* fix BINARYEN_PASS_DEBUG option (#908)Alon Zakai2017-02-231-0/+10
| | | | | * fix BINARYEN_PASS_DEBUG option * Add isNested property to passRunner
* Default Walker subclasses to using Visitor<SubType> (#921)jgravelle-google2017-02-231-1/+1
| | | | Most module walkers use PostWalker<T, Visitor<T>>, let that pattern be expressed as simply PostWalker<T>
* Improve handling of implicit traps (#898)Alon Zakai2017-02-061-0/+4
| | | | | | | | * add --ignore-implicit-traps option, and by default do not ignore them, to properly preserve semantics * implicit traps can be reordered, but are side effects and should not be removed * add testing for --ignore-implicit-traps
* Add -O0,-O1,etc. options (#790)Alon Zakai2016-10-191-6/+0
| | | | And use them in wasm-opt and asm2wasm consistently and uniformly.
* Pass options (#788)Alon Zakai2016-10-181-5/+32
| | | | * add PassOptions structure, and use it for new -Os param to wasm-opt
* passRunner debug and validation improvements (#726)Alon Zakai2016-10-021-2/+8
|
* refactor pass hooks, creating a proper way to run code before a pass is runAlon Zakai2016-09-121-9/+10
|
* Add initialization functions for passes to avoid missing pass registration ↵Jukka Jylänki2016-06-211-12/+4
| | | | due to linker dead code elimination. Fixes #577.
* move function parallelism to pass and pass runner, which allows more ↵Alon Zakai2016-06-031-2/+24
| | | | efficient parallel execution (#564)
* add an option to run passes on individual functions, and to get default ↵Alon Zakai2016-06-021-0/+23
| | | | optimization passes suitable for that, or not. refactor visitFunction/Module for this.
* refactor walk logic into walk* and doWalk* methods, for a more regular API ↵Alon Zakai2016-05-301-1/+1
| | | | that is clearer where it should be overridden (#551)
* better error on missing passesAlon Zakai2016-05-051-1/+2
|
* simplify PassRunner API, get a module directlyAlon Zakai2016-05-051-2/+3
|
* create a UnifiedExpressionVisitor for passes that want a single visitor ↵Alon Zakai2016-04-181-1/+1
| | | | function, to avoid confusion with having both visit* and visitExpression in a single pass (#357)
* Add a debug mode to PassRunner, which logs out times (#344)Alon Zakai2016-04-141-0/+6
| | | | | | * add a debug mode to PassRunner, which logs out times * address comments
* refactor default optimization passes to a central locationAlon Zakai2016-04-111-0/+4
|
* De-recurse traversals (#333)Alon Zakai2016-04-111-1/+1
| | | | | | | | | | | | * refactor core walking to not recurse * add a simplify-locals test * reuse parent's non-branchey scan logic in SimpleExecutionWalker, reduce code duplication * update wasm.js * rename things following comments
* refactor wasm traversal code into separate fileAlon Zakai2016-04-061-0/+1
|
* Remove MinifiedPrinter from the header file.Michael2016-02-231-9/+0
|