author | Alon Zakai <azakai@google.com> | 2021-03-17 11:45:22 -0700 |
---|---|---|
committer | GitHub <noreply@github.com> | 2021-03-17 11:45:22 -0700 |
commit | f7b3289ed430bd86c083410539a92e12632c554c (patch) | |
tree | d59ad0a8c3b506963706ce8ee8185f8efc5e9bee /src/wasm/wasm-io.cpp | |
parent | fcc6679d0ca148727939397ff1843b84a68a8e61 (diff) | |
Skip function bodies in wasm-emscripten-finalize when we don't need them (#3689)
After sbc100's work on EM_ASM and EM_JS they are now parsed from
the wasm using exports etc. and so we no longer need to parse function bodies.
As a result if we are not emitting a wasm from wasm-emscripten-finalize then all we are
doing is scanning global structures like imports and exports and emitting metadata
about them. And indeed we do not need to emit a wasm in some cases, specifically
when not optimizing and when using WASM_BIGINT (to avoid needing to
legalize).
We had considered skipping wasm-emscripten-finalize entirely in that situation,
and instead parsing the metadata from the wasm in python on the emscripten
side. However sbc100 had the brilliant idea today to just skip function bodies.
That is very simple to do - no need to write another parser for wasm, and also
look at how simple this PR is - and also it will be faster to run
wasm-emscripten-finalize in this mode than to run python. (With the only
downside that the bytes of the wasm are loaded even if they aren't parsed; but
almost certainly they are in the disk cache anyhow.)
This PR implements that idea: when wasm-emscripten-finalize knows it will
not write a wasm output, it notes "skip function bodies". The binary reader then
skips the bodies and places unreachables there instead (so that the wasm still
validates).
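
The skipping step can be sketched roughly as follows. This is a toy illustration, not Binaryen's actual reader: `ToyFunction`, `readU32LEB` and `readCodeSection` are hypothetical names, the input is assumed to be the contents of a code section (a function count followed by size-prefixed bodies), and copying the raw bytes stands in for real parsing. The point is only that each body carries its size, so the reader can jump over it and record a lone `unreachable` (opcode 0x00) instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a parsed function; not Binaryen's IR.
struct ToyFunction {
  std::vector<uint8_t> code; // raw body bytes, or {0x00} (unreachable) if skipped
  bool skipped = false;
};

// Reads an unsigned 32-bit LEB128 value and advances pos past it.
uint32_t readU32LEB(const std::vector<uint8_t>& in, size_t& pos) {
  uint32_t result = 0;
  for (int shift = 0; shift < 35; shift += 7) {
    if (pos >= in.size()) {
      throw std::runtime_error("unexpected end of input");
    }
    uint8_t byte = in[pos++];
    result |= uint32_t(byte & 0x7f) << shift;
    if ((byte & 0x80) == 0) {
      return result;
    }
  }
  throw std::runtime_error("LEB128 value too long");
}

// Walks the contents of a code section: a count, then size-prefixed bodies.
// With skipFunctionBodies set, each body is jumped over using its size and
// replaced by a single unreachable; otherwise its bytes are copied (this
// copy is a placeholder for actually parsing the instructions).
std::vector<ToyFunction> readCodeSection(const std::vector<uint8_t>& in,
                                         bool skipFunctionBodies) {
  size_t pos = 0;
  uint32_t count = readU32LEB(in, pos);
  std::vector<ToyFunction> funcs;
  for (uint32_t i = 0; i < count; i++) {
    uint32_t size = readU32LEB(in, pos);
    if (size > in.size() - pos) {
      throw std::runtime_error("function body extends past end of input");
    }
    ToyFunction func;
    if (skipFunctionBodies) {
      func.code = {0x00}; // unreachable
      func.skipped = true;
    } else {
      func.code.assign(in.begin() + pos, in.begin() + pos + size);
    }
    pos += size; // either way, continue at the next body
    assert(pos <= in.size());
    funcs.push_back(func);
  }
  return funcs;
}
```

In this sketch the function count and everything outside the bodies is read the same way in both modes, which is why a consumer that only looks at global structures cannot tell the difference.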
There are no new tests here because this can't be tested - by design it is an
unobservable optimization. (If we could notice the bodies have been skipped,
we would not have skipped them.) This is also why no changes are needed on
the emscripten side to benefit from this speedup. Basically when binaryen sees
it will not need X, it skips parsing of X automatically.
Benchmarking speed, it is as fast as you'd expect: the wasm-emscripten-finalize
step is 15x faster on SQLite (1MB of wasm) and almost 50x faster on the biggest
wasm I have on my drive (40MB of LLVM). (These numbers are on release
builds, without debug info - debug info makes things slower, so the speedup is
lower there, and will need further work.)
Tested manually and also on the wasm0, wasm2, and other test suites on emscripten.
Diffstat (limited to 'src/wasm/wasm-io.cpp')
-rw-r--r-- | src/wasm/wasm-io.cpp | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/src/wasm/wasm-io.cpp b/src/wasm/wasm-io.cpp
index 2fb4af838..f2c5af6db 100644
--- a/src/wasm/wasm-io.cpp
+++ b/src/wasm/wasm-io.cpp
@@ -51,6 +51,7 @@ void ModuleReader::readBinaryData(std::vector<char>& input,
   std::unique_ptr<std::ifstream> sourceMapStream;
   WasmBinaryBuilder parser(wasm, input);
   parser.setDWARF(DWARF);
+  parser.setSkipFunctionBodies(skipFunctionBodies);
   if (sourceMapFilename.size()) {
     sourceMapStream = make_unique<std::ifstream>();
     sourceMapStream->open(sourceMapFilename);