diff options
author | Thomas Lively <7121787+tlively@users.noreply.github.com> | 2020-11-12 12:26:32 -0800 |
---|---|---|
committer | GitHub <noreply@github.com> | 2020-11-12 12:26:32 -0800 |
commit | 9f8bbcaa986258815a89af7472f27df98a428e33 (patch) | |
tree | afb75d6e0a8d6349429fe323757ad5d8cd5c5ecf /test/example/module-splitting.cpp | |
parent | 7ab4e6e23d0d3dc34ff5a9b07edb65f09b1f5783 (diff) | |
download | binaryen-9f8bbcaa986258815a89af7472f27df98a428e33.tar.gz binaryen-9f8bbcaa986258815a89af7472f27df98a428e33.tar.bz2 binaryen-9f8bbcaa986258815a89af7472f27df98a428e33.zip |
Module splitting (#3317)
Adds the capability to programatically split a module into a primary and
secondary module such that the primary module can be compiled and run before the
secondary module has been instantiated. All calls to secondary functions (i.e.
functions that have been split out into the secondary module) in the primary
module are rewritten to be indirect calls through the table. Initially, the
table slots of all secondary functions contain references to imported
placeholder functions. When the secondary module is instantiated, it will
automatically patch the table to insert references to the original functions.
The process of module splitting involves these steps:
1. Create the new secondary module.
2. Export globals, events, tables, and memories from the primary module and
import them in the secondary module.
3. Move the deferred functions from the primary to the secondary module.
4. For any secondary function exported from the primary module, export in
its place a trampoline function that makes an indirect call to its
placeholder function (and eventually to the original secondary function),
allocating a new table slot for the placeholder if necessary.
5. Rewrite direct calls from primary functions to secondary functions to be
indirect calls to their placeholder functions (and eventually to their
original secondary functions), allocating new table slots for the
placeholders if necessary.
6. For each primary function directly called from a secondary function, export
the primary function if it is not already exported and import it into the
secondary module.
7. Replace all references to secondary functions in the primary module's table
segments with references to imported placeholder functions.
8. Create new active table segments in the secondary module that will replace
all the placeholder function references in the table with references to
their corresponding secondary functions upon instantiation.
Functions can be used or referenced three ways in a WebAssembly module: they can
be exported, called, or placed in a table. The above procedure introduces a
layer of indirection to each of those mechanisms that removes all references to
secondary functions from the primary module but restores the original program's
semantics once the secondary module is instantiated. As more mechanisms that
reference functions are added in the future, such as ref.func instructions, they
will have to be modified to use a similar layer of indirection.
The code as currently written makes a few assumptions about the module that is
being split:
1. It assumes that mutable-globals is allowed. This could be worked around by
introducing wrapper functions for globals and rewriting secondary code that
accesses them, but now that mutable-globals is shipped on all browsers,
hopefully that extra complexity won't be necessary.
2. It assumes that all table segment offsets are constants. This simplifies the
generation of segments to actively patch in the secondary functions without
overwriting any other table slots. This assumption could be relaxed by 1)
having secondary segments re-write primary function slots as well, 2)
allowing addition in segment offsets, or 3) synthesizing a start function to
modify the table instead of using segments.
3. It assumes that each function appears in the table at most once. This isn't
necessarily true in general or even for LLVM output after function
deduplication. Relaxing this assumption would just require slightly more
complex code, so it is a good candidate for a follow up PR.
Future Binaryen work for this feature includes providing a command line tool
exposing this functionality as well as C API, JS API, and fuzzer support. We
will also want to provide a simple instrumentation pass for finding dead or
late-executing functions that would be good candidates for splitting out. It
would also be good to integrate that instrumentation with future function
outlining work so that dead or exceptional basic blocks could be split out into
a separate module.
Diffstat (limited to 'test/example/module-splitting.cpp')
-rw-r--r-- | test/example/module-splitting.cpp | 289 |
1 files changed, 289 insertions, 0 deletions
diff --git a/test/example/module-splitting.cpp b/test/example/module-splitting.cpp new file mode 100644 index 000000000..50fb0f4c5 --- /dev/null +++ b/test/example/module-splitting.cpp @@ -0,0 +1,289 @@ +#include <cassert> +#include <iostream> + +#include "ir/module-splitting.h" +#include "ir/stack-utils.h" +#include "wasm-features.h" +#include "wasm-printing.h" +#include "wasm-s-parser.h" +#include "wasm-validator.h" +#include "wasm.h" + +using namespace wasm; + +std::unique_ptr<Module> parse(char* module) { + auto wasm = std::make_unique<Module>(); + wasm->features = FeatureSet::All; + try { + SExpressionParser parser(module); + Element& root = *parser.root; + SExpressionWasmBuilder builder(*wasm, *root[0], IRProfile::Normal); + } catch (ParseException& p) { + p.dump(std::cerr); + Fatal() << "error in parsing wasm text"; + } + return wasm; +} + +void do_test(const std::set<Name>& keptFuncs, std::string&& module) { + WasmValidator validator; + bool valid; + + auto primary = parse(&module.front()); + valid = validator.validate(*primary); + assert(valid && "before invalid!"); + + std::cout << "Before:\n"; + WasmPrinter::printModule(primary.get()); + + std::cout << "Keeping: "; + if (keptFuncs.size()) { + auto it = keptFuncs.begin(); + std::cout << *it++; + while (it != keptFuncs.end()) { + std::cout << ", " << *it++; + } + } else { + std::cout << "<none>"; + } + std::cout << "\n"; + + ModuleSplitting::Config config; + config.primaryFuncs = keptFuncs; + config.newExportPrefix = "%"; + auto secondary = splitFunctions(*primary, config); + + std::cout << "After:\n"; + WasmPrinter::printModule(primary.get()); + std::cout << "Secondary:\n"; + WasmPrinter::printModule(secondary.get()); + std::cout << "\n\n"; + + valid = validator.validate(*primary); + assert(valid && "after invalid!"); + valid = validator.validate(*secondary); + assert(valid && "secondary invalid!"); +} + +int main() { + // Trivial module + do_test({}, "(module)"); + + // Global stuff + do_test({}, R"( + (module + (memory $mem (shared 3 42)) + (table $tab 3 42 funcref) + (global $glob (mut i32) (i32.const 7)) + (event $e (attr 0) (param i32)) + ))"); + + // Imported global stuff + do_test({}, R"( + (module + (import "env" "mem" (memory $mem (shared 3 42))) + (import "env" "tab" (table $tab 3 42 funcref)) + (import "env" "glob" (global $glob (mut i32))) + (import "env" "e" (event $e (attr 0) (param i32))) + ))"); + + // Exported global stuff + do_test({}, R"( + (module + (memory $mem (shared 3 42)) + (table $tab 3 42 funcref) + (global $glob (mut i32) (i32.const 7)) + (event $e (attr 0) (param i32)) + (export "mem" (memory $mem)) + (export "tab" (table $tab)) + (export "glob" (global $glob)) + (export "e" (event $e)) + ))"); + + // Non-deferred function + do_test({"foo"}, R"( + (module + (func $foo (param i32) (result i32) + (local.get 0) + ) + ))"); + + // Non-deferred exported function + do_test({"foo"}, R"( + (module + (export "foo" (func $foo)) + (func $foo (param i32) (result i32) + (local.get 0) + ) + ))"); + + // Non-deferred function in table + do_test({"foo"}, R"( + (module + (table $table 1 funcref) + (elem (i32.const 0) $foo) + (func $foo (param i32) (result i32) + (local.get 0) + ) + ))"); + + // Non-deferred imported function + do_test({"foo"}, R"( + (module + (import "env" "foo" (func $foo (param i32) (result i32))) + ))"); + + // Non-deferred exported imported function in table at a weird offset + do_test({"foo"}, R"( + (module + (import "env" "foo" (func $foo (param i32) (result i32))) + (table $table 1000 funcref) + (elem (i32.const 42) $foo) + (export "foo" (func $foo)) + ))"); + + // Deferred function + do_test({}, R"( + (module + (func $foo (param i32) (result i32) + (local.get 0) + ) + ))"); + + // Deferred exported function + do_test({}, R"( + (module + (export "foo" (func $foo)) + (func $foo (param i32) (result i32) + (local.get 0) + ) + ))"); + + // Deferred function in table + do_test({}, R"( + (module + (table $table 1 funcref) + (elem (i32.const 0) $foo) + (func $foo (param i32) (result i32) + (local.get 0) + ) + ))"); + + // Deferred exported function in table at a weird offset + do_test({}, R"( + (module + (table $table 1000 funcref) + (elem (i32.const 42) $foo) + (export "foo" (func $foo)) + (func $foo (param i32) (result i32) + (local.get 0) + ) + ))"); + + // Non-deferred function calling non-deferred function + do_test({"foo", "bar"}, R"( + (module + (func $foo + (call $bar) + ) + (func $bar + (nop) + ) + ))"); + + // Deferred function calling non-deferred function + do_test({"bar"}, R"( + (module + (func $foo + (call $bar) + ) + (func $bar + (nop) + ) + ))"); + + // Non-deferred function calling deferred function + do_test({"foo"}, R"( + (module + (func $foo + (call $bar) + ) + (func $bar + (nop) + ) + ))"); + + // Deferred function calling deferred function + do_test({}, R"( + (module + (func $foo + (call $bar) + ) + (func $bar + (nop) + ) + ))"); + + // Deferred function calling non-deferred function with clashing export name + do_test({"foo"}, R"( + (module + (export "%foo" (func $bar)) + (func $foo + (nop) + ) + (func $bar + (call $foo) + ) + ))"); + + // Mixed table 1 + do_test({"bar", "quux"}, R"( + (module + (table $table 4 funcref) + (elem (i32.const 0) $foo $bar $baz $quux) + (func $foo + (nop) + ) + (func $bar + (nop) + ) + (func $baz + (nop) + ) + (func $quux + (nop) + ) + ))"); + + // Mixed table 2 + do_test({"baz"}, R"( + (module + (table $table 4 funcref) + (elem (i32.const 0) $foo $bar $baz $quux) + (func $foo + (nop) + ) + (func $bar + (nop) + ) + (func $baz + (nop) + ) + (func $quux + (nop) + ) + ))"); + + // Mutual recursion with table growth + do_test({"foo"}, R"( + (module + (table $table 1 1 funcref) + (elem (i32.const 0) $foo) + (func $foo (param i32) (result i32) + (call $bar (i32.const 0)) + ) + (func $bar (param i32) (result i32) + (call $foo (i32.const 1)) + ) + ))"); +} |