author: Alon Zakai <azakai@google.com> 2023-05-16 11:03:45 -0700
committer: GitHub <noreply@github.com> 2023-05-16 11:03:45 -0700
commit: 972e659bf59740c3ee44129812f95bec143d01a6
tree: f86d70fa692a45e3dfbf951b0d1af06204d4ecf7 /README.md
parent: 44cd751d9feda7c4b4b6c9d6af1e71541b90abac
Reintroduce wasm-merge (#5709)
We used to have a wasm-merge tool but removed it for lack of use cases. Recently, use cases have been showing up in the wasm GC space and elsewhere, as people are using more diverse toolchains together. For example, a project might build some C++ code alongside some wasm GC code. Merging those wasm files together can allow for nice optimizations like inlining and better DCE, so it makes sense to have a tool for merging.

Background:

* Removal: #1969
* Requests:
  * wasm-merge - why it has been deleted #2174
  * Compiling and linking wat files #2276
  * wasm-link? #2767

This PR is a complete rewrite of wasm-merge, not a restoration of the original codebase. The original code was quite messy (my fault), and also, since then we've added multi-memory and multi-table, which makes things a lot simpler.

The linking semantics are as described in the "wasm-link" issue #2767: all we do is merge normal wasm files together and connect imports and exports. That is, we have a graph of modules and their names, and each import to a module name can be resolved to that module. Basically, like a JS bundler would do for JS, or, in other words, we do the same operations as JS code would do to glue wasm modules together at runtime, but at compile time. See the README update in this PR for a concrete example.

There are no plans to do more than that simple bundling, so this should not really overlap with wasm-ld's use cases.

This should be fairly fast as it works in linear time on the total input code. However, it won't be as fast as wasm-ld, of course, as it does build Binaryen IR for each module. An advantage to working on Binaryen IR is that we can easily do some global DCE after merging, and further optimizations are possible later.
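
The linking semantics described above (a graph of named modules, where each import of a module name resolves to that module) can be illustrated with a small sketch. This is a toy model in JavaScript, not Binaryen's implementation: the data layout and the function name `resolveImports` are hypothetical, and leaving non-matching imports in place is an assumption made for illustration.

```js
// Toy model of merge-time import resolution. NOT Binaryen's code: the
// data layout and the name `resolveImports` are hypothetical.
//
// `modules` maps a module name (as given on the command line) to an object
// with `imports` (a list of { module, base } pairs) and `exports` (a Set of
// export names).
function resolveImports(modules) {
  const resolved = [];   // imports that can become direct internal references
  const unresolved = []; // imports that stay imports in this sketch

  for (const [name, mod] of modules) {
    for (const imp of mod.imports) {
      const target = modules.get(imp.module);
      if (target && target.exports.has(imp.base)) {
        // The imported module name matches one of the inputs, and that
        // input exports the requested name.
        resolved.push({ from: name, to: imp.module, base: imp.base });
      } else {
        // Assumption: anything else is left for the embedder to provide.
        unresolved.push({ from: name, module: imp.module, base: imp.base });
      }
    }
  }
  return { resolved, unresolved };
}

// The two modules from the README example, named "first" and "second".
const modules = new Map([
  ["first", {
    imports: [{ module: "second", base: "bar" }],
    exports: new Set(["main"]),
  }],
  ["second", {
    imports: [{ module: "outside", base: "log" }],
    exports: new Set(["bar"]),
  }],
]);

const result = resolveImports(modules);
console.log(result.resolved);   // the "second"."bar" import from "first"
console.log(result.unresolved); // the "outside"."log" import from "second"
```

The sketch visits each import once, which is in line with the linear-time behavior mentioned above. The real tool additionally merges the module contents and renames conflicting items, which this model does not attempt.
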
Diffstat (limited to 'README.md')
-rw-r--r-- README.md 147
1 files changed, 147 insertions, 0 deletions
diff --git a/README.md b/README.md
index a6f2590b1..c2869afaa 100644
--- a/README.md
+++ b/README.md
@@ -223,6 +223,9 @@ This repository contains code that builds the following tools in `bin/`:
performs emscripten-specific passes over it.
* **wasm-ctor-eval**: A tool that can execute functions (or parts of functions)
at compile time.
+ * **wasm-merge**: Merges multiple wasm files into a single file, connecting
+ corresponding imports to exports as it does so. Like a bundler for JS, but
+ for wasm.
* **binaryen.js**: A standalone JavaScript library that exposes Binaryen methods for [creating and optimizing Wasm modules](https://github.com/WebAssembly/binaryen/blob/main/test/binaryen.js/hello-world.js). For builds, see [binaryen.js on npm](https://www.npmjs.com/package/binaryen) (or download it directly from [github](https://raw.githubusercontent.com/AssemblyScript/binaryen.js/master/index.js), [rawgit](https://cdn.rawgit.com/AssemblyScript/binaryen.js/master/index.js), or [unpkg](https://unpkg.com/binaryen@latest/index.js)). Minimal requirements: Node.js v15.8 or Chrome v75 or Firefox v78.
Usage instructions for each are below.
@@ -562,6 +565,150 @@ as mentioned earlier, but there is no limitation on what you can execute here.
Any export from the wasm can be executed, if its contents are suitable. For
example, in Emscripten `wasm-ctor-eval` is even run on `main()` when possible.
+### wasm-merge
+
+`wasm-merge` combines wasm files together. For example, imagine you have a
+project that uses wasm files from multiple toolchains. Then it can be helpful to
+merge them all into a single wasm file before shipping, since in a single wasm
+file the calls between the modules become just normal calls inside a module,
+which allows them to be inlined, dead code eliminated, and so forth, potentially
+improving speed and size.
+
+For example, imagine we have these two wasm files:
+
+```wat
+;; a.wasm
+(module
+ (import "second" "bar" (func $second.bar))
+
+ (export "main" (func $func))
+
+ (func $func
+ (call $second.bar)
+ )
+)
+```
+
+```wat
+;; b.wasm
+(module
+ (import "outside" "log" (func $log (param i32)))
+
+ (export "bar" (func $func))
+
+ (func $func
+ (call $log
+ (i32.const 42)
+ )
+ )
+)
+```
+
+The filenames on your local drive are `a.wasm` and `b.wasm`, but for merging /
+bundling purposes let's say that the first is known as `"first"` and the second
+as `"second"`. That is, we want the first module's import of `"second.bar"` to
+call the function `$func` in the second module. Here is a wasm-merge command for
+that:
+
+```
+wasm-merge a.wasm first b.wasm second -o output.wasm
+```
+
+We give it the first wasm file, then its name, and then the second wasm file
+and then its name. The merged output is this:
+
+```wat
+(module
+ (import "second" "bar" (func $second.bar))
+ (import "outside" "log" (func $log (param i32)))
+
+ (export "main" (func $func))
+ (export "bar" (func $func_2))
+
+ (func $func
+ (call $func_2)
+ )
+
+ (func $func_2
+ (call $log
+ (i32.const 42)
+ )
+ )
+)
+```
+
+`wasm-merge` combined the two files into one, merging their functions, imports,
+etc., all while fixing up name conflicts and connecting corresponding imports to
+exports. In particular, note how `$func` calls `$func_2`, which is exactly what
+we wanted: `$func_2` is the function from the second module (renamed to avoid a
+name collision).
+
+Note that the wasm output in this example could benefit from additional
+optimization. First, the call to `$func_2` can now be easily inlined, so we can
+run `wasm-opt -O3` to do that for us. Also, we may not need all the imports and
+exports, for which we can run
+[wasm-metadce](https://github.com/WebAssembly/binaryen/wiki/Pruning-unneeded-code-in-wasm-files-with-wasm-metadce#example-pruning-exports).
+A good workflow could be to run `wasm-merge`, then `wasm-metadce`, then finish
+with `wasm-opt`.
+
+`wasm-merge` is like a JS bundler, but for wasm files. That is, with the wasm
+files above, imagine that we had this JS code to instantiate and connect them
+at runtime:
+
+```js
+// Compile the first module.
+var first = await (await fetch("a.wasm")).arrayBuffer();
+first = new WebAssembly.Module(first);
+
+// Compile the second module.
+var second = await (await fetch("b.wasm")).arrayBuffer();
+second = new WebAssembly.Module(second);
+
+// Instantiate the second, with a JS import.
+second = new WebAssembly.Instance(second, {
+ outside: {
+ log: (value) => {
+ console.log('value:', value);
+ }
+ }
+});
+
+// Instantiate the first, importing from the second.
+first = new WebAssembly.Instance(first, {
+ second: second.exports
+});
+
+// Call the main function.
+first.exports.main();
+```
+
+What `wasm-merge` does is basically what that JS does: it hooks up imports to
+exports, resolving names using the module names you provided. That is, by
+running `wasm-merge` we are moving the work of connecting the modules from
+runtime to compile time. As a result, after running `wasm-merge` we need a lot
+less JS to get the same result:
+
+```js
+// Compile the single module.
+var merged = await (await fetch("output.wasm")).arrayBuffer();
+merged = new WebAssembly.Module(merged);
+
+// Instantiate it with a JS import.
+merged = new WebAssembly.Instance(merged, {
+ outside: {
+ log: (value) => {
+ console.log('value:', value);
+ }
+ }
+});
+
+// Call the main function.
+merged.exports.main();
+```
+
+We still need to fetch and compile the merged wasm, and to provide it the JS
+import, but the work to connect two wasm modules is not needed any more.
+
## Testing
```