| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
Implement the basic infrastructure for the full WAT parser with just enough
detail to parse basic modules that contain only imported globals. Parsing
functions correspond to elements of the grammar in the text specification and
are templatized over context types that correspond to each phase of parsing.
Errors are explicitly propagated via `Result<T>` and `MaybeResult<T>` types.
Follow-on PRs will implement additional phases of parsing and parsing for new
elements in the grammar.
|
|
|
|
|
|
| |
Even when other names are stripped, it can be useful for wasm-split to preserve
the module name so that the split modules can be differentiated in stack traces.
Adding this option to wasm-split requires adding similar options to ModuleWriter
and WasmBinaryWriter.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
output (#3698)
When not writing output we don't need debug info, as it is not relevant for
our metadata. This saves loading and interning all the names, which takes
several seconds on massive inputs.
This is possible in principle in other tools, but this does not change anything
in them for now. (We do use names internally in some nontrivial ways without
opting in to it, so that would require further refactoring. Also the other tools
almost always do write an output.)
This is not 100% unobservable. If validation fails then the validation error would
just contain the function index instead of the name from the Names section if
there is one. However finalize does not validate atm so that would only matter
if we change that later.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After sbc100 's work on EM_ASM and EM_JS they are now parsed from
the wasm using exports etc. and so we no longer need to parse function bodies.
As a result if we are not emitting a wasm from wasm-emscripten-finalize then all we are
doing is scanning global structures like imports and exports and emitting metadata
about them. And indeed we do not need to emit a wasm in some cases, specifically
when not optimizing and when using WASM_BIGINT (to avoid needing to
legalize).
We had considering skipping wasm-emscripten-finalize entirely in that situation,
and instead to parse the metadata from the wasm in python on the emscripten
side. However sbc100 had the brilliant idea today to just skip function bodies.
That is very simple to do - no need to write another parser for wasm, and also
look at how simple this PR is - and also it will be faster to run
wasm-emscripten-finalize in this mode than to run python. (With the only
downside that the bytes of the wasm are loaded even if they aren't parsed; but
almost certainly they are in the disk cache anyhow.)
This PR implements that idea: when wasm-emscripten-finalize knows it will
not write a wasm output, it notes "skip function bodies". The binary reader then
skips the bodies and places unreachables there instead (so that the wasm still
validates).
There are no new tests here because this can't be tested - by design it is an
unobservable optimization. (If we could notice the bodies have been skipped,
we would not have skipped them.) This is also why no changes are needed on
the emscripten side to benefit from this speedup. Basically when binaryen sees
it will not need X, it skips parsing of X automatically.
Benchmarking speed, it is as fast as you'd expect: the wasm-emscripten-finalize
step is 15x faster on SQLite (1MB of wasm) and almost 50x faster on the biggest
wasm I have on my drive (40MB of LLVM). (These numbers are on release
builds, without debug info - debug into makes things slower, so the speedup is
lower there, and will need further work.)
Tested manually and also on wasm0 wasm2 other on emscripten.
|
|
|
|
|
| |
Adds an IR profile to each function so the validator can determine
which validation rules to apply and adds a flag to have the wast
parser set the profile to Poppy for testing purposes.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Optionally track the binary format code section offsets,
that is, when loading a binary, remember where each IR
node was read from. This is necessary for DWARF
debug info, as these are the offsets DWARF refers to.
(Note that eventually we may want to do something
else, like first read the DWARF and only then add
debug info annotations into the IR in a more LLVM-like
manner, but this is more straightforward and should be
enough to update debug lines and ranges).
This tracking adds noticeable overhead - every single
IR node adds an entry in a map - so avoid it unless
actually necessary. Specifically, if the user passes in
-g and there are actually DWARF sections in the
binary, and we are not about to remove those sections,
then we need it.
Print binary format code section offsets in text, when
printing with -g. This will help debug and test dwarf
support. It looks like
;; code offset: 0x7
as an annotation right before each node.
Also add support for -g in wasm-opt tests (unlike
a pass, it has just one - as a prefix).
Helps #2400
|
|
|
|
|
|
| |
This means that debugging/tracing can now be enabled and controlled
centrally without managing and passing state around the codebase.
|
|
|
| |
Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
|
|
|
|
| |
This is necessary to write tests that don't require temporary files,
such as in #1948, and is generally useful.
|
|
|
|
|
|
|
|
|
|
| |
* support source map input in wasm-opt, refactoring the loading code into wasm-io
* use wasm-io in wasm-as
* support output source maps in wasm-opt
* add a test for wasm-opt and source maps
|
|
|
|
|
|
|
| |
* wasm-link-metadata: Use `__data_end` symbol.
* Add --global-base param to emscripten-wasm-finalize to compute staticBump properly
* Let ModuleWriter write to a provided Output object
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Skeleton of a beginning of o2wasm, WIP and probably not going to be used
* Get building post-cherry-pick
* ast->ir, remove commented out code, include a debug module print because linking
* Read linking section, print emscripten metadata json
* WasmBinaryWriter emits user sections on Module
* Remove debugging prints, everything that isn't needed to build metadata
* Rename o2wasm to lld-metadata
* lld-metadata support for outputting to file
* Use tables index instead of function index for initializer functions
* Add lld-emscripten tool to add emscripten-runtime functions to wasm modules (built with lld)
* Handle EM_ASM in lld-emscripten
* Add a list of functions to forcibly export (for initializer functions)
* Disable incorrect initializer function reading
* Add error printing when parsing .o files in lld-metadata
* Remove ';; METADATA: ' prefix from lld-metadata, output is now standalone json
* Support em_asm consts that aren't at the start of a segment
* Initial test framework for lld-metadata tool
* Add em_asm test
* Add support for WASM_INIT_FUNCS in the linking section
* Remove reloc section parsing because it's unused
* lld-emscripten can read and write text
* Add test harness for lld-emscripten
* Export all functions for now
* Add missing lld test output
* Add support for reading object files differently
Only difference so far is in importing mutable globals being an object
file representation for symbols, but invalid wasm.
* Update help strings
* Update linking tests for stackAlloc fix
* Rename lld-emscripten,lld-metadata to wasm-emscripten-finalize,wasm-link-metadata
* Add help text to header comments
* auto& instead of auto &
* Extract LinkType to abi/wasm-object.h
* Remove special handling for wasm object file reading, allow mutable globals
* Add braces around default switch case
* Fix flake8 errors
* Handle generating dyncall thunks for imports as well
* Use explicit bool for stackPointerGlobal
* Use glob patterns for lld file iteration
* Use __wasm_call_ctors for all initializer functions
|
|
|
|
|
|
|
|
| |
(#1017)
* Extends wasm-as, wasm-dis and s2wasm to consume debug locations.
* Exports source map from asm2wasm
|
|
|
|
|
| |
Add wasm-ctor-eval, which evaluates functions at compile time - typically static constructor functions - and applies their effects into memory, saving work at startup. If we encounter something we can't evaluate at compile time in our interpreter, stop there.
This is similar to ctor_evaller.py in emscripten (which was for asm.js).
|
|
* Added ModuleReader/Writer classes that support text and binary I/O
* Use them in wasm-opt and asm2wasm
|