diff options
author | Alon Zakai <alonzakai@gmail.com> | 2018-07-30 17:36:12 -0700 |
---|---|---|
committer | GitHub <noreply@github.com> | 2018-07-30 17:36:12 -0700 |
commit | 353e2c4fcb7662b8c73f5673fd0fc21bcb7287c6 (patch) | |
tree | eda3800dcd9751f29492fc5732f4af41c0e473c8 /src/passes/pass.cpp | |
parent | 0483d81ad9a665d404f2e2182e718ccbf0592be7 (diff) | |
download | binaryen-353e2c4fcb7662b8c73f5673fd0fc21bcb7287c6.tar.gz binaryen-353e2c4fcb7662b8c73f5673fd0fc21bcb7287c6.tar.bz2 binaryen-353e2c4fcb7662b8c73f5673fd0fc21bcb7287c6.zip |
Stack IR (#1623)
This adds a new IR, "Stack IR". This represents wasm at a very low level, as a simple stream of instructions, basically the same as wasm's binary format. This is unlike Binaryen IR which is structured and in a tree format.
This gives some small wins on binary sizes, less than 1% in most cases, usually 0.25-0.50% or so. That's not much by itself, but looking forward this prepares us for multi-value, which we really need an IR like this to be able to optimize well. Also, it's possible there is more we can do already - currently there are just a few stack IR optimizations implemented,
DCE
local2stack - check if a set_local/get_local pair can be removed, which keeps the set's value on the stack, which if the stars align it can be popped instead of the get.
Block removal - remove any blocks with no branches, as they are valid in wasm binary format.
Implementation-wise, the IR is defined in wasm-stack.h. A new StackInst is defined, representing a single instruction. Most are simple reflections of Binaryen IR (an add, a load, etc.), and just pointers to them. Control flow constructs are expanded into multiple instructions, like a block turns into a block begin and end, and we may also emit extra unreachables to handle the fact Binaryen IR has unreachable blocks/ifs/loops but wasm does not. Overall, all the Binaryen IR differences with wasm vanish on the way to stack IR.
Where this IR lives: Each Function now has a unique_ptr to stack IR, that is, a function may have stack IR alongside the main IR. If the stack IR is present, we write it out during binary writing; if not, we do the same binaryen IR => wasm binary process as before (this PR should not affect speed there). This design lets us use normal Passes on stack IR, in particular this PR defines 3 passes:
Generate stack IR
Optimize stack IR (might be worth splitting out into separate passes eventually)
Print stack IR for debugging purposes
Having these as normal passes is convenient as then they can run in parallel across functions and all the other conveniences of our current Pass system. However, a downside of keeping the second IR as an option on Functions, and using normal Passes to operate on it, means that we may get out of sync: if you generate stack IR, then modify binaryen IR, then the stack IR may no longer be valid (for example, maybe you removed locals or modified instructions in place etc.). To avoid that, Passes now define if they modify Binaryen IR or not; if they do, we throw away the stack IR.
Miscellaneous notes:
Just writing Stack IR, then writing to binary - no optimizations - is 20% slower than going directly to binary, which is one reason why we still support direct writing. This does lead to some "fun" C++ template code to make that convenient: there is a single StackWriter class, templated over the "mode", which is either Binaryen2Binary (direct writing), Binaryen2Stack, or Stack2Binary. This avoids a lot of boilerplate as the 3 modes share a lot of code in overlapping ways.
Stack IR does not support source maps / debug info. We just don't use that IR if debug info is present.
A tiny text format comment (if emitting non-minified text) indicates stack IR is present, if it is ((; has Stack IR ;)). This may help with debugging, just in case people forget. There is also a pass to print out the stack IR for debug purposes, as mentioned above.
The sieve binaryen.js test was actually not validating all along - these new opts broke it in a more noticeable manner. Fixed.
Added extra checks in pass-debug mode, to verify that if stack IR should have been thrown out, it was. This should help avoid any confusion with the IR being invalid.
Added a comment about the possible future of stack IR as the main IR, depending on optimization results, following some discussion earlier today.
Diffstat (limited to 'src/passes/pass.cpp')
-rw-r--r-- | src/passes/pass.cpp | 138 |
1 files changed, 136 insertions, 2 deletions
diff --git a/src/passes/pass.cpp b/src/passes/pass.cpp index c0354524d..97151f847 100644 --- a/src/passes/pass.cpp +++ b/src/passes/pass.cpp @@ -22,6 +22,7 @@ #include <pass.h> #include <wasm-validator.h> #include <wasm-io.h> +#include "ir/hashed.h" namespace wasm { @@ -76,6 +77,7 @@ void PassRegistry::registerPasses() { registerPass("flatten", "flattens out code, removing nesting", createFlattenPass); registerPass("fpcast-emu", "emulates function pointer casts, allowing incorrect indirect calls to (sometimes) work", createFuncCastEmulationPass); registerPass("func-metrics", "reports function metrics", createFunctionMetricsPass); + registerPass("generate-stack-ir", "generate Stack IR", createGenerateStackIRPass); registerPass("inlining", "inline functions (you probably want inlining-optimizing)", createInliningPass); registerPass("inlining-optimizing", "inline functions and optimizes where we inlined", createInliningOptimizingPass); registerPass("legalize-js-interface", "legalizes i64 types on the import/export boundary", createLegalizeJSInterfacePass); @@ -90,6 +92,7 @@ void PassRegistry::registerPasses() { registerPass("metrics", "reports metrics", createMetricsPass); registerPass("nm", "name list", createNameListPass); registerPass("optimize-instructions", "optimizes instruction combinations", createOptimizeInstructionsPass); + registerPass("optimize-stack-ir", "optimize Stack IR", createOptimizeStackIRPass); registerPass("pick-load-signs", "pick load signs based on their uses", createPickLoadSignsPass); registerPass("post-emscripten", "miscellaneous optimizations for Emscripten-generated code", createPostEmscriptenPass); registerPass("precompute", "computes compile-time evaluatable expressions", createPrecomputePass); @@ -98,6 +101,7 @@ void PassRegistry::registerPasses() { registerPass("print-minified", "print in minified s-expression format", createMinifiedPrinterPass); registerPass("print-full", "print in full s-expression format", createFullPrinterPass); registerPass("print-call-graph", "print call graph", createPrintCallGraphPass); + registerPass("print-stack-ir", "print out Stack IR (useful for internal debugging)", createPrintStackIRPass); registerPass("relooper-jump-threading", "thread relooper jumps (fastcomp output only)", createRelooperJumpThreadingPass); registerPass("remove-non-js-ops", "removes operations incompatible with js", createRemoveNonJSOpsPass); registerPass("remove-imports", "removes imports and replaces them with nops", createRemoveImportsPass); @@ -201,6 +205,12 @@ void PassRunner::addDefaultGlobalOptimizationPostPasses() { add("duplicate-function-elimination"); // optimizations show more functions as duplicate add("remove-unused-module-elements"); add("memory-packing"); + // perform Stack IR optimizations here, at the very end of the + // optimization pipeline + if (options.optimizeLevel >= 2 || options.shrinkLevel >= 1) { + add("generate-stack-ir"); + add("optimize-stack-ir"); + } } static void dumpWast(Name name, Module* wasm) { @@ -252,7 +262,7 @@ void PassRunner::run() { runPassOnFunction(pass, func.get()); } } else { - pass->run(this, wasm); + runPass(pass); } auto after = std::chrono::steady_clock::now(); std::chrono::duration<double> diff = after - before; @@ -320,7 +330,7 @@ void PassRunner::run() { stack.push_back(pass); } else { flush(); - pass->run(this, wasm); + runPass(pass); } } flush(); @@ -347,11 +357,135 @@ void PassRunner::doAdd(Pass* pass) { pass->prepareToRun(this, wasm); } +// Checks that the state is valid before and after a +// pass runs on a function. We run these extra checks when +// pass-debug mode is enabled. +struct AfterEffectFunctionChecker { + Function* func; + Name name; + + // Check Stack IR state: if the main IR changes, there should be no + // stack IR, as the stack IR would be wrong. + bool beganWithStackIR; + HashType originalFunctionHash; + + // In the creator we can scan the state of the module and function before the + // pass runs. + AfterEffectFunctionChecker(Function* func) : func(func), name(func->name) { + beganWithStackIR = func->stackIR != nullptr; + if (beganWithStackIR) { + originalFunctionHash = FunctionHasher::hashFunction(func); + } + } + + // This is called after the pass is run, at which time we can check things. + void check() { + assert(func->name == name); // no global module changes should have occurred + if (beganWithStackIR && func->stackIR) { + auto after = FunctionHasher::hashFunction(func); + if (after != originalFunctionHash) { + Fatal() << "[PassRunner] PASS_DEBUG check failed: had Stack IR before and after the pass ran, and the pass modified the main IR, which invalidates Stack IR - pass should have been marked 'modifiesBinaryenIR'"; + } + } + } +}; + +// Runs checks on the entire module, in a non-function-parallel pass. +// In particular, in such a pass functions may be removed or renamed, track that. +struct AfterEffectModuleChecker { + Module* module; + + std::vector<AfterEffectFunctionChecker> checkers; + + bool beganWithAnyStackIR; + + AfterEffectModuleChecker(Module* module) : module(module) { + for (auto& func : module->functions) { + checkers.emplace_back(func.get()); + } + beganWithAnyStackIR = hasAnyStackIR(); + } + + void check() { + if (beganWithAnyStackIR && hasAnyStackIR()) { + // If anything changed to the functions, that's not good. + if (checkers.size() != module->functions.size()) { + error(); + } + for (Index i = 0; i < checkers.size(); i++) { + // Did a pointer change? (a deallocated function could cause that) + if (module->functions[i].get() != checkers[i].func || + module->functions[i]->body != checkers[i].func->body) { + error(); + } + // Did a name change? + if (module->functions[i]->name != checkers[i].name) { + error(); + } + } + // Global function state appears to not have been changed: the same + // functions are there. Look into their contents. + for (auto& checker : checkers) { + checker.check(); + } + } + } + + void error() { + Fatal() << "[PassRunner] PASS_DEBUG check failed: had Stack IR before and after the pass ran, and the pass modified global function state - pass should have been marked 'modifiesBinaryenIR'"; + } + + bool hasAnyStackIR() { + for (auto& func : module->functions) { + if (func->stackIR) { + return true; + } + } + return false; + } +}; + +void PassRunner::runPass(Pass* pass) { + std::unique_ptr<AfterEffectModuleChecker> checker; + if (getPassDebug()) { + checker = std::unique_ptr<AfterEffectModuleChecker>( + new AfterEffectModuleChecker(wasm)); + } + pass->run(this, wasm); + handleAfterEffects(pass); + if (getPassDebug()) { + checker->check(); + } +} + void PassRunner::runPassOnFunction(Pass* pass, Function* func) { assert(pass->isFunctionParallel()); // function-parallel passes get a new instance per function auto instance = std::unique_ptr<Pass>(pass->create()); + std::unique_ptr<AfterEffectFunctionChecker> checker; + if (getPassDebug()) { + checker = std::unique_ptr<AfterEffectFunctionChecker>( + new AfterEffectFunctionChecker(func)); + } instance->runOnFunction(this, wasm, func); + handleAfterEffects(pass, func); + if (getPassDebug()) { + checker->check(); + } +} + +void PassRunner::handleAfterEffects(Pass* pass, Function* func) { + if (pass->modifiesBinaryenIR()) { + // If Binaryen IR is modified, Stack IR must be cleared - it would + // be out of sync in a potentially dangerous way. + if (func) { + func->stackIR.reset(nullptr); + } else { + for (auto& func : wasm->functions) { + func->stackIR.reset(nullptr); + } + } + } } int PassRunner::getPassDebug() { |