Skip function bodies in wasm-emscripten-finalize when we don't need them (#3689)

After sbc100's work on EM_ASM and EM_JS, they are now parsed from the wasm using exports etc., and so we no longer need to parse function bodies. As a result, if we are not emitting a wasm from wasm-emscripten-finalize, then all we are doing is scanning global structures like imports and exports and emitting metadata about them. And indeed we do not need to emit a wasm in some cases, specifically when not optimizing and when using WASM_BIGINT (to avoid needing to legalize).

We had considered skipping wasm-emscripten-finalize entirely in that situation, and instead parsing the metadata from the wasm in python on the emscripten side. However, sbc100 had the brilliant idea today to just skip function bodies. That is very simple to do (no need to write another parser for wasm, and also look at how simple this PR is), and it will also be faster to run wasm-emscripten-finalize in this mode than to run python. (The only downside is that the bytes of the wasm are loaded even if they aren't parsed; but almost certainly they are in the disk cache anyhow.)

This PR implements that idea: when wasm-emscripten-finalize knows it will not write a wasm output, it notes "skip function bodies". The binary reader then skips the bodies and places unreachables there instead (so that the wasm still validates).

There are no new tests here because this can't be tested: by design it is an unobservable optimization. (If we could notice the bodies have been skipped, we would not have skipped them.) This is also why no changes are needed on the emscripten side to benefit from this speedup. Basically, when binaryen sees it will not need X, it skips parsing of X automatically.

Benchmarking speed, it is as fast as you'd expect: the wasm-emscripten-finalize step is 15x faster on SQLite (1MB of wasm) and almost 50x faster on the biggest wasm I have on my drive (40MB of LLVM). (These numbers are on release builds, without debug info. Debug info makes things slower, so the speedup is lower there, and will need further work.)

Tested manually and also on wasm0 wasm2 other on emscripten.
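
To illustrate the idea of skipping bodies and leaving unreachables in their place, here is a minimal sketch. It is not Binaryen's actual reader: `ToyFunction` and `ToyCodeSectionReader` are hypothetical names invented for this example, the "decoding" is just a byte copy, and only the setter name `setSkipFunctionBodies` mirrors the call added in this commit. The sketch assumes each function body is prefixed by its size as an unsigned LEB128, as in the wasm code section, which is what makes it possible to jump over a body without decoding it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical toy types for illustration; not Binaryen's real classes.
struct ToyFunction {
  std::string body; // "unreachable" when the body was skipped
};

class ToyCodeSectionReader {
public:
  // The caller must keep `input` alive while the reader is in use.
  explicit ToyCodeSectionReader(const std::vector<uint8_t>& input)
    : input(input) {}

  void setSkipFunctionBodies(bool skip) { skipFunctionBodies = skip; }

  // Reads `count` size-prefixed function bodies from the input.
  std::vector<ToyFunction> readFunctions(size_t count) {
    std::vector<ToyFunction> funcs;
    for (size_t i = 0; i < count; i++) {
      uint32_t size = readU32LEB();
      assert(pos + size <= input.size());
      ToyFunction func;
      if (skipFunctionBodies) {
        // Do not decode anything; a placeholder keeps the function
        // well-formed for later consumers.
        func.body = "unreachable";
      } else {
        // Stand-in for real instruction decoding: copy the raw bytes.
        func.body.assign(input.begin() + pos, input.begin() + pos + size);
      }
      // Either way, the size prefix tells us where the next body starts.
      pos += size;
      funcs.push_back(func);
    }
    return funcs;
  }

private:
  // Decodes an unsigned LEB128 value of at most 32 bits.
  uint32_t readU32LEB() {
    uint32_t result = 0;
    int shift = 0;
    while (true) {
      assert(pos < input.size());
      assert(shift < 32);
      uint8_t byte = input[pos++];
      result |= uint32_t(byte & 0x7f) << shift;
      if ((byte & 0x80) == 0) {
        break;
      }
      shift += 7;
    }
    return result;
  }

  const std::vector<uint8_t>& input;
  size_t pos = 0;
  bool skipFunctionBodies = false;
};
```

In this toy, a caller that only needs per-function metadata would call `setSkipFunctionBodies(true)` before `readFunctions`, and would get the same number of functions back, each with a placeholder body.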
author: Alon Zakai <azakai@google.com> 2021-03-17 11:45:22 -0700
committer: GitHub <noreply@github.com> 2021-03-17 11:45:22 -0700
commit: f7b3289ed430bd86c083410539a92e12632c554c (patch)
tree: d59ad0a8c3b506963706ce8ee8185f8efc5e9bee /src/wasm/wasm-io.cpp
parent: fcc6679d0ca148727939397ff1843b84a68a8e61 (diff)
1 files changed, 1 insertions, 0 deletions
diff --git a/src/wasm/wasm-io.cpp b/src/wasm/wasm-io.cpp
index 2fb4af838..f2c5af6db 100644
--- a/src/wasm/wasm-io.cpp
+++ b/src/wasm/wasm-io.cpp
@@ -51,6 +51,7 @@ void ModuleReader::readBinaryData(std::vector<char>& input,
   std::unique_ptr<std::ifstream> sourceMapStream;
   WasmBinaryBuilder parser(wasm, input);
   parser.setDWARF(DWARF);
+  parser.setSkipFunctionBodies(skipFunctionBodies);
   if (sourceMapFilename.size()) {
     sourceMapStream = make_unique<std::ifstream>();
     sourceMapStream->open(sourceMapFilename);