summaryrefslogtreecommitdiff
path: root/test/example/module-splitting.cpp
diff options
context:
space:
mode:
authorThomas Lively <7121787+tlively@users.noreply.github.com>2020-11-12 12:26:32 -0800
committerGitHub <noreply@github.com>2020-11-12 12:26:32 -0800
commit9f8bbcaa986258815a89af7472f27df98a428e33 (patch)
treeafb75d6e0a8d6349429fe323757ad5d8cd5c5ecf /test/example/module-splitting.cpp
parent7ab4e6e23d0d3dc34ff5a9b07edb65f09b1f5783 (diff)
downloadbinaryen-9f8bbcaa986258815a89af7472f27df98a428e33.tar.gz
binaryen-9f8bbcaa986258815a89af7472f27df98a428e33.tar.bz2
binaryen-9f8bbcaa986258815a89af7472f27df98a428e33.zip
Module splitting (#3317)
Adds the capability to programatically split a module into a primary and secondary module such that the primary module can be compiled and run before the secondary module has been instantiated. All calls to secondary functions (i.e. functions that have been split out into the secondary module) in the primary module are rewritten to be indirect calls through the table. Initially, the table slots of all secondary functions contain references to imported placeholder functions. When the secondary module is instantiated, it will automatically patch the table to insert references to the original functions. The process of module splitting involves these steps: 1. Create the new secondary module. 2. Export globals, events, tables, and memories from the primary module and import them in the secondary module. 3. Move the deferred functions from the primary to the secondary module. 4. For any secondary function exported from the primary module, export in its place a trampoline function that makes an indirect call to its placeholder function (and eventually to the original secondary function), allocating a new table slot for the placeholder if necessary. 5. Rewrite direct calls from primary functions to secondary functions to be indirect calls to their placeholder functions (and eventually to their original secondary functions), allocating new table slots for the placeholders if necessary. 6. For each primary function directly called from a secondary function, export the primary function if it is not already exported and import it into the secondary module. 7. Replace all references to secondary functions in the primary module's table segments with references to imported placeholder functions. 8. Create new active table segments in the secondary module that will replace all the placeholder function references in the table with references to their corresponding secondary functions upon instantiation. Functions can be used or referenced three ways in a WebAssembly module: they can be exported, called, or placed in a table. The above procedure introduces a layer of indirection to each of those mechanisms that removes all references to secondary functions from the primary module but restores the original program's semantics once the secondary module is instantiated. As more mechanisms that reference functions are added in the future, such as ref.func instructions, they will have to be modified to use a similar layer of indirection. The code as currently written makes a few assumptions about the module that is being split: 1. It assumes that mutable-globals is allowed. This could be worked around by introducing wrapper functions for globals and rewriting secondary code that accesses them, but now that mutable-globals is shipped on all browsers, hopefully that extra complexity won't be necessary. 2. It assumes that all table segment offsets are constants. This simplifies the generation of segments to actively patch in the secondary functions without overwriting any other table slots. This assumption could be relaxed by 1) having secondary segments re-write primary function slots as well, 2) allowing addition in segment offsets, or 3) synthesizing a start function to modify the table instead of using segments. 3. It assumes that each function appears in the table at most once. This isn't necessarily true in general or even for LLVM output after function deduplication. Relaxing this assumption would just require slightly more complex code, so it is a good candidate for a follow up PR. Future Binaryen work for this feature includes providing a command line tool exposing this functionality as well as C API, JS API, and fuzzer support. We will also want to provide a simple instrumentation pass for finding dead or late-executing functions that would be good candidates for splitting out. It would also be good to integrate that instrumentation with future function outlining work so that dead or exceptional basic blocks could be split out into a separate module.
Diffstat (limited to 'test/example/module-splitting.cpp')
-rw-r--r--test/example/module-splitting.cpp289
1 files changed, 289 insertions, 0 deletions
diff --git a/test/example/module-splitting.cpp b/test/example/module-splitting.cpp
new file mode 100644
index 000000000..50fb0f4c5
--- /dev/null
+++ b/test/example/module-splitting.cpp
@@ -0,0 +1,289 @@
+#include <cassert>
+#include <iostream>
+
+#include "ir/module-splitting.h"
+#include "ir/stack-utils.h"
+#include "wasm-features.h"
+#include "wasm-printing.h"
+#include "wasm-s-parser.h"
+#include "wasm-validator.h"
+#include "wasm.h"
+
+using namespace wasm;
+
+std::unique_ptr<Module> parse(char* module) {
+ auto wasm = std::make_unique<Module>();
+ wasm->features = FeatureSet::All;
+ try {
+ SExpressionParser parser(module);
+ Element& root = *parser.root;
+ SExpressionWasmBuilder builder(*wasm, *root[0], IRProfile::Normal);
+ } catch (ParseException& p) {
+ p.dump(std::cerr);
+ Fatal() << "error in parsing wasm text";
+ }
+ return wasm;
+}
+
+void do_test(const std::set<Name>& keptFuncs, std::string&& module) {
+ WasmValidator validator;
+ bool valid;
+
+ auto primary = parse(&module.front());
+ valid = validator.validate(*primary);
+ assert(valid && "before invalid!");
+
+ std::cout << "Before:\n";
+ WasmPrinter::printModule(primary.get());
+
+ std::cout << "Keeping: ";
+ if (keptFuncs.size()) {
+ auto it = keptFuncs.begin();
+ std::cout << *it++;
+ while (it != keptFuncs.end()) {
+ std::cout << ", " << *it++;
+ }
+ } else {
+ std::cout << "<none>";
+ }
+ std::cout << "\n";
+
+ ModuleSplitting::Config config;
+ config.primaryFuncs = keptFuncs;
+ config.newExportPrefix = "%";
+ auto secondary = splitFunctions(*primary, config);
+
+ std::cout << "After:\n";
+ WasmPrinter::printModule(primary.get());
+ std::cout << "Secondary:\n";
+ WasmPrinter::printModule(secondary.get());
+ std::cout << "\n\n";
+
+ valid = validator.validate(*primary);
+ assert(valid && "after invalid!");
+ valid = validator.validate(*secondary);
+ assert(valid && "secondary invalid!");
+}
+
+int main() {
+ // Trivial module
+ do_test({}, "(module)");
+
+ // Global stuff
+ do_test({}, R"(
+ (module
+ (memory $mem (shared 3 42))
+ (table $tab 3 42 funcref)
+ (global $glob (mut i32) (i32.const 7))
+ (event $e (attr 0) (param i32))
+ ))");
+
+ // Imported global stuff
+ do_test({}, R"(
+ (module
+ (import "env" "mem" (memory $mem (shared 3 42)))
+ (import "env" "tab" (table $tab 3 42 funcref))
+ (import "env" "glob" (global $glob (mut i32)))
+ (import "env" "e" (event $e (attr 0) (param i32)))
+ ))");
+
+ // Exported global stuff
+ do_test({}, R"(
+ (module
+ (memory $mem (shared 3 42))
+ (table $tab 3 42 funcref)
+ (global $glob (mut i32) (i32.const 7))
+ (event $e (attr 0) (param i32))
+ (export "mem" (memory $mem))
+ (export "tab" (table $tab))
+ (export "glob" (global $glob))
+ (export "e" (event $e))
+ ))");
+
+ // Non-deferred function
+ do_test({"foo"}, R"(
+ (module
+ (func $foo (param i32) (result i32)
+ (local.get 0)
+ )
+ ))");
+
+ // Non-deferred exported function
+ do_test({"foo"}, R"(
+ (module
+ (export "foo" (func $foo))
+ (func $foo (param i32) (result i32)
+ (local.get 0)
+ )
+ ))");
+
+ // Non-deferred function in table
+ do_test({"foo"}, R"(
+ (module
+ (table $table 1 funcref)
+ (elem (i32.const 0) $foo)
+ (func $foo (param i32) (result i32)
+ (local.get 0)
+ )
+ ))");
+
+ // Non-deferred imported function
+ do_test({"foo"}, R"(
+ (module
+ (import "env" "foo" (func $foo (param i32) (result i32)))
+ ))");
+
+ // Non-deferred exported imported function in table at a weird offset
+ do_test({"foo"}, R"(
+ (module
+ (import "env" "foo" (func $foo (param i32) (result i32)))
+ (table $table 1000 funcref)
+ (elem (i32.const 42) $foo)
+ (export "foo" (func $foo))
+ ))");
+
+ // Deferred function
+ do_test({}, R"(
+ (module
+ (func $foo (param i32) (result i32)
+ (local.get 0)
+ )
+ ))");
+
+ // Deferred exported function
+ do_test({}, R"(
+ (module
+ (export "foo" (func $foo))
+ (func $foo (param i32) (result i32)
+ (local.get 0)
+ )
+ ))");
+
+ // Deferred function in table
+ do_test({}, R"(
+ (module
+ (table $table 1 funcref)
+ (elem (i32.const 0) $foo)
+ (func $foo (param i32) (result i32)
+ (local.get 0)
+ )
+ ))");
+
+ // Deferred exported function in table at a weird offset
+ do_test({}, R"(
+ (module
+ (table $table 1000 funcref)
+ (elem (i32.const 42) $foo)
+ (export "foo" (func $foo))
+ (func $foo (param i32) (result i32)
+ (local.get 0)
+ )
+ ))");
+
+ // Non-deferred function calling non-deferred function
+ do_test({"foo", "bar"}, R"(
+ (module
+ (func $foo
+ (call $bar)
+ )
+ (func $bar
+ (nop)
+ )
+ ))");
+
+ // Deferred function calling non-deferred function
+ do_test({"bar"}, R"(
+ (module
+ (func $foo
+ (call $bar)
+ )
+ (func $bar
+ (nop)
+ )
+ ))");
+
+ // Non-deferred function calling deferred function
+ do_test({"foo"}, R"(
+ (module
+ (func $foo
+ (call $bar)
+ )
+ (func $bar
+ (nop)
+ )
+ ))");
+
+ // Deferred function calling deferred function
+ do_test({}, R"(
+ (module
+ (func $foo
+ (call $bar)
+ )
+ (func $bar
+ (nop)
+ )
+ ))");
+
+ // Deferred function calling non-deferred function with clashing export name
+ do_test({"foo"}, R"(
+ (module
+ (export "%foo" (func $bar))
+ (func $foo
+ (nop)
+ )
+ (func $bar
+ (call $foo)
+ )
+ ))");
+
+ // Mixed table 1
+ do_test({"bar", "quux"}, R"(
+ (module
+ (table $table 4 funcref)
+ (elem (i32.const 0) $foo $bar $baz $quux)
+ (func $foo
+ (nop)
+ )
+ (func $bar
+ (nop)
+ )
+ (func $baz
+ (nop)
+ )
+ (func $quux
+ (nop)
+ )
+ ))");
+
+ // Mixed table 2
+ do_test({"baz"}, R"(
+ (module
+ (table $table 4 funcref)
+ (elem (i32.const 0) $foo $bar $baz $quux)
+ (func $foo
+ (nop)
+ )
+ (func $bar
+ (nop)
+ )
+ (func $baz
+ (nop)
+ )
+ (func $quux
+ (nop)
+ )
+ ))");
+
+ // Mutual recursion with table growth
+ do_test({"foo"}, R"(
+ (module
+ (table $table 1 1 funcref)
+ (elem (i32.const 0) $foo)
+ (func $foo (param i32) (result i32)
+ (call $bar (i32.const 0))
+ )
+ (func $bar (param i32) (result i32)
+ (call $foo (i32.const 1))
+ )
+ ))");
+}