| Commit message | Author | Age | Files | Lines |
|
|
|
| |
Also, always output high level metrics even when zero.
|
|
|
|
|
| |
Parse the formats allowed by the spec proposal and emit the i32x4
canonical format.
|
|
|
|
|
|
| |
Refactors features into a new wasm-features.h file and updates the
validator to check that all types are allowed. Currently this is only
relevant for the v128 SIMD type, but new types will be added in the
future. The test for this change is in #1948.
|
|
|
|
| |
This is necessary to write tests that don't require temporary files,
such as in #1948, and is generally useful.
|
|
|
|
|
|
| |
With this we can write stuff like:
const wasm::Expression* p;
const wasm::Binary* q = p->cast<wasm::Binary>();
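A minimal sketch of how a const-qualified `cast<>` overload could be declared so that the snippet above compiles. The `Expression`, `Binary`, and `Const` types here are simplified, hypothetical stand-ins, not Binaryen's actual class definitions.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the IR classes (assumption:
// each node type exposes a static SpecificId, as sketched here).
struct Expression {
  enum Id { BinaryId, ConstId };
  Id _id;
  explicit Expression(Id id) : _id(id) {}

  // True if this node is of the concrete type T.
  template<class T> bool is() const { return _id == T::SpecificId; }

  // Non-const overload: returns a mutable pointer.
  template<class T> T* cast() {
    assert(is<T>());
    return static_cast<T*>(this);
  }

  // Const overload: callable through a const Expression*, and it
  // preserves constness in the returned pointer.
  template<class T> const T* cast() const {
    assert(is<T>());
    return static_cast<const T*>(this);
  }
};

struct Binary : Expression {
  static constexpr Id SpecificId = BinaryId;
  Binary() : Expression(BinaryId) {}
};

struct Const : Expression {
  static constexpr Id SpecificId = ConstId;
  Const() : Expression(ConstId) {}
};
```

With only the non-const overload, calling `p->cast<Binary>()` on a `const Expression* p` would fail to compile; the const overload is what makes it legal.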
|
|
|
| |
Removed semicolons that cause errors when compiling with -pedantic-errors.
|
| |
|
|
|
| |
And run it in wasm-emscripten-finalize. This will prevent the emscripten output from changing when the target features section lands in LLVM.
|
|
|
|
| |
unconditionally erasing it in all --strip passes (#1939)
|
|
|
|
|
|
|
|
|
| |
(#1944)
We expect the stack pointer to be of a certain type. This fixes
a segfault we are seeing when passed a binary which doesn't quite
meet our expectations.
|
|
|
|
|
|
|
| |
This PR adds
void BinaryenConstGetValueV128(BinaryenExpressionRef expr, uint8_t* out);
to the C-API and uses it in Binaryen.getExpressionInfo in the JS-API.
|
|
|
|
| |
We now implement addFunction by creating a wasm module to wrap
that JS function and simply adding it to the table.
|
| |
|
|
|
|
|
|
|
| |
That caused it to miss switch targets, and a code-folding bug.
Fixes #1838
Sadly the fuzzer didn't find this because code folding looks for very particular code patterns that are unlikely to be emitted randomly.
|
|
|
|
| |
Normally --help is considered normal output, not error output. For
example, it's normal to pipe the output of --help to a pager.
|
|
|
|
| |
that one var by reusing a param
|
|
|
|
| |
uses of the original add, as otherwise we may just be adding work (both an offset, and an add). Refactor local-utils.h, and make UnneededSetRemover also check for side effects, so it cleanly removes all traces of unneeded sets.
|
|
|
|
|
| |
Multiple propagations may be possible in some cases, like nested
structs in C.
|
| |
The initial OptimizeAddedConstants pass did not try to handle the case of non-SSA locals. However, that case can happen, and optimizing it too improves code size by almost 1% on some large benchmarks like bullet.
How this works is that if we see
b = a + 10
a = c
load(b)
then we copy the base value at the add,
a' = a
b = a' + 10
a = c
load(a', offset=10)
This no longer has a guarantee of improving code size, since in theory both b and a may have other uses. However, in practice it's very common for b to be optimized out later.
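The rewrite above can be checked with a small model. This is only a sketch: `load`, `beforeOpt`, and `afterOpt` are hypothetical helpers that mimic the two code shapes on a byte vector, not the pass itself.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical load helper: reads one byte at ptr + offset.
std::uint8_t load(const std::vector<std::uint8_t>& mem,
                  std::uint32_t ptr,
                  std::uint32_t offset = 0) {
  std::uint64_t addr = std::uint64_t(ptr) + offset;
  assert(addr < mem.size());
  return mem[addr];
}

// Original shape: b = a + 10; a = c; load(b)
std::uint8_t beforeOpt(const std::vector<std::uint8_t>& mem,
                       std::uint32_t a,
                       std::uint32_t c) {
  std::uint32_t b = a + 10;
  a = c; // a is overwritten, so load(a, offset=10) would read the wrong place
  (void)a;
  return load(mem, b);
}

// Optimized shape: a' = a; b = a' + 10; a = c; load(a', offset=10)
std::uint8_t afterOpt(const std::vector<std::uint8_t>& mem,
                      std::uint32_t a,
                      std::uint32_t c) {
  std::uint32_t aCopy = a; // a': snapshot of the base at the add
  std::uint32_t b = aCopy + 10; // kept, in case b has other uses
  a = c;
  (void)a;
  (void)b;
  return load(mem, aCopy, 10);
}
```

Both functions read the same byte because the copy `a'` is taken before `a` is reassigned. The model ignores wraparound of the 32-bit add.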
|
|
|
|
|
| |
This can happen in emscripten if we run fpcast-emu after previous passes created dynCalls already. If so, it's fine to just do nothing.
Fixes emscripten-core/emscripten#8229
|
|
|
|
|
|
|
|
| |
This PR changes the formatting of v128.const literals in text format / stack ir like so
- v128.const i32 0x1 0x2 0x3 0x4 0x5 0x6 0x7 0x8 0x9 0xa 0xb 0xc 0xd 0xe 0xf 0x80
+ v128.const i32 0x04030201 0x08070605 0x0c0b0a09 0x800f0e0d
Recently hit this when trying to load Binaryen-generated text format with WABT, which errored with `error: unexpected token 0x5, expected )`.
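As an illustration of the new formatting, the 16 bytes can be grouped into four little-endian 32-bit lanes. This is a sketch only; `toI32x4` and `formatV128` are hypothetical names, not functions from the Binaryen source.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Group 16 bytes into four 32-bit lanes, least significant byte first.
std::array<std::uint32_t, 4> toI32x4(const std::array<std::uint8_t, 16>& bytes) {
  std::array<std::uint32_t, 4> lanes{};
  for (std::size_t i = 0; i < 4; ++i) {
    lanes[i] = std::uint32_t(bytes[4 * i]) |
               (std::uint32_t(bytes[4 * i + 1]) << 8) |
               (std::uint32_t(bytes[4 * i + 2]) << 16) |
               (std::uint32_t(bytes[4 * i + 3]) << 24);
  }
  return lanes;
}

// Print the lanes in the "v128.const i32 ..." shape shown in the diff above.
std::string formatV128(const std::array<std::uint8_t, 16>& bytes) {
  std::string out = "v128.const i32";
  for (std::uint32_t lane : toI32x4(bytes)) {
    char buf[16];
    std::snprintf(buf, sizeof buf, " 0x%08x", static_cast<unsigned>(lane));
    out += buf;
  }
  return out;
}
```

Feeding the bytes `0x1 ... 0xf 0x80` from the old output produces the four lanes of the new output, `0x04030201 0x08070605 0x0c0b0a09 0x800f0e0d`.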
|
| |
See #1919 - we did not do this consistently before.
This adds a lowMemoryUnused option to PassOptions. It can be passed on the command line with --low-memory-unused. If enabled, we run the new optimize-added-constants pass, which does the real work here, replacing older code in post-emscripten.
Aside from running at the proper time (unlike the old pass, see #1919), this also has a -propagate mode, which can do stuff like this:
y = x + 10
[..]
load(y)
[..]
load(y)
=>
y = x + 10
[..]
load(x, offset=10)
[..]
load(x, offset=10)
That is, it can propagate such offsets to the loads/stores. This pattern is common in big interpreter loops, where the pointers are offsets into a big struct of state.
The pass does this propagation by using a new feature of LocalGraph, which can verify which locals are in SSA mode. Binaryen IR is not SSA (intentionally, since it's a later IR), but if a local only has a single set for all gets, that means that local is in such a state, and can be optimized. The tricky thing is that all locals are initialized to zero, so there are at minimum two sets. But if we verify that the real set dominates all the gets, then the zero initialization cannot reach them, and we are safe.
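The SSA check described above can be sketched for the simplest case. The assumption here is straight-line code, where "the set dominates all the gets" reduces to "the set comes before every get"; real control flow needs the full LocalGraph analysis. `Op` and `findSSALocals` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// One operation on a local, in program order (straight-line code only).
struct Op {
  enum Kind { Set, Get } kind;
  std::size_t local;
};

// Returns, per local, whether it has exactly one explicit set and no get
// that can observe the implicit zero initialization.
std::vector<bool> findSSALocals(const std::vector<Op>& ops,
                                std::size_t numLocals) {
  std::vector<std::size_t> numSets(numLocals, 0);
  std::vector<bool> zeroInitReachesGet(numLocals, false);
  for (const Op& op : ops) {
    if (op.kind == Op::Set) {
      numSets[op.local]++;
    } else if (numSets[op.local] == 0) {
      // A get before any set reads the implicit zero.
      zeroInitReachesGet[op.local] = true;
    }
  }
  std::vector<bool> ssa(numLocals, false);
  for (std::size_t i = 0; i < numLocals; ++i) {
    ssa[i] = numSets[i] == 1 && !zeroInitReachesGet[i];
  }
  return ssa;
}
```

A local that is read before its only set, or that is set twice, is reported as not being in SSA form, so its offsets would not be propagated.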
This PR also makes safe-heap aware of lowMemoryUnused. If so, we check for not just an access of 0, but the range 0-1023.
This makes zlib 5% faster, with either the wasm backend or asm2wasm. It also makes it 0.5% smaller. It also helps sqlite (1.5% faster) and lua (1% faster).
|
|
|
|
|
|
| |
Fixes #1921
Signed-off-by: Bogdan Vaneev <warchantua@gmail.com>
|
|
|
|
|
| |
* optimize normally with debug info - some of it may be removed, but that's the price of higher optimization levels, and by optimizing normally in profiling and -g2 etc. builds they are more comparable to normal ones, yielding better data
* copy debug locations automatically in replaceCurrent in wasm-traversal, so optimization passes at least by default will preserve debuggability
|
| |
|
|
|
|
|
| |
This refactors the hashing and comparison code to use a single immediate-value iterator. This gives us a single place that knows the list of immediate fields in every node type, instead of two.
This also fixes a few bugs found by doing that. As a side effect, this makes us slightly slower than before, since we are hashing more fields.
|
|
|
|
|
|
|
| |
* Finds functions whose return value is always dropped, and removes the return.
* Run multiple iterations of the pass, as one can enable others.
* Do not run DeadArgumentElimination at all if debug info is present (with these improvements, it became much more likely to destroy debug info).
Saves 2.5% on hello world, because of some simple libc calls.
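The first bullet can be sketched as a scan over call sites. This is a simplified model under stated assumptions: `CallSite` and `functionsWithUnusedResult` are hypothetical names, and the sketch ignores cases a real pass must also rule out, such as exported functions or functions reachable through the table.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// A direct call and whether its result is immediately dropped.
struct CallSite {
  std::string callee;
  bool resultDropped;
};

// Returns the callees whose result is dropped at every call site seen.
std::set<std::string>
functionsWithUnusedResult(const std::vector<CallSite>& calls) {
  std::map<std::string, bool> allDropped;
  for (const CallSite& call : calls) {
    auto it = allDropped.find(call.callee);
    if (it == allDropped.end()) {
      allDropped[call.callee] = call.resultDropped;
    } else {
      it->second = it->second && call.resultDropped;
    }
  }
  std::set<std::string> result;
  for (const auto& entry : allDropped) {
    if (entry.second) {
      result.insert(entry.first);
    }
  }
  return result;
}
```

Removing a return can turn a previously used argument into dead code, which is one way a single iteration can enable further ones.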
|
| |
Checks if a value is being dropped higher up, like
```
(drop
(block i32
(block i32
(i32.const 1)
)
)
)
```
Handling this forces us to be careful in that pass about whether a value is used, and whether the type matters (for example, we can't replace a unary with its child in all cases, if the return value matters).
|
|
|
|
|
| |
Trying to refactor the code to be simpler and less redundant, I ran into some perf issues that it seems like a small vector, with fixed-size storage and optional additional storage as needed, might help with. This implements that class and uses it in a few places.
This seems to help: I see some 1-2% fewer instructions and cycles in `perf stat`, but it's hard to tell if it really makes a noticeable difference.
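A minimal sketch of such a container, assuming a default-constructible element type. It shows the idea of inline fixed storage with a heap-backed overflow; it is not the class added by this commit, and member names like `usesHeap` are illustrative only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Stores the first N elements inline; only spills to the heap past N.
template<typename T, std::size_t N> class SmallVector {
  std::array<T, N> fixed{};   // inline storage, no allocation
  std::size_t usedFixed = 0;  // how many inline slots are in use
  std::vector<T> flexible;    // overflow storage, allocated on demand

public:
  void push_back(const T& x) {
    if (usedFixed < N) {
      fixed[usedFixed++] = x;
    } else {
      flexible.push_back(x);
    }
  }

  void pop_back() {
    assert(size() > 0);
    if (!flexible.empty()) {
      flexible.pop_back();
    } else {
      --usedFixed;
    }
  }

  T& back() {
    assert(size() > 0);
    return flexible.empty() ? fixed[usedFixed - 1] : flexible.back();
  }

  T& operator[](std::size_t i) {
    assert(i < size());
    return i < N ? fixed[i] : flexible[i - N];
  }

  std::size_t size() const { return usedFixed + flexible.size(); }

  // True once at least one element lives in the overflow vector.
  bool usesHeap() const { return !flexible.empty(); }
};
```

The benefit comes from the common case where the element count stays at or below `N`, so no allocation happens at all.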
|
| |
|
|
|
|
|
|
|
|
|
| |
* make DE_NAN avoid creating nan literals in the first place
* add a reducer option `--denan` to not introduce nans in destructive reduction
* add a `Literal::isNaN()` method
* also remove the default exception logging from the fuzzer js glue, which is a source of non-useful VM differences (like nan nondeterminism)
* added an option `--no-fuzz-nans` to make it easy to avoid nans when fuzzing (without hacking the source and recompiling).
Background: trying to get fuzzing on jsc working despite this open issue: https://bugs.webkit.org/show_bug.cgi?id=175691
|
| |
A user that just does
```
wasm-opt input.wasm -O
```
may assume that the input file has been optimized. But without `-o` we don't emit any output.
Often you may not want any output, like if you just want to run a pass like `--metrics`. But for most users wasm-opt is probably going to be used as an optimizer of files. So this PR suggests we emit a warning in that case.
For comparison, LLVM's `opt` would print to the console, but it avoids printing a binary there, so it issues a warning. Instead of this warning, perhaps we should do the same? That would also not be confusing.
Closes #1907
|
| |
|
|
|
|
| |
also remove some old debugging
|
|
|
| |
See emscripten-core/emscripten#7679
|
| |
|
|
|
| |
Fixes #1900
|
|
|
|
|
| |
We should emit a file with only the data segments, starting from the global base, and not starting from zero (the data before is unneeded, and the emscripten loading code assumes it isn't there).
Also fix the auto updater to work properly on .mem test updating.
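The intended file layout can be sketched as follows. This is an illustration under assumptions: `Segment` and `flattenFromBase` are hypothetical names, segments are assumed to have constant offsets at or above the global base, and gaps are zero-filled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A data segment with a constant offset into linear memory.
struct Segment {
  std::uint32_t offset;
  std::vector<std::uint8_t> data;
};

// Builds the bytes of a memory file whose first byte corresponds to
// globalBase rather than to address zero.
std::vector<std::uint8_t> flattenFromBase(const std::vector<Segment>& segments,
                                          std::uint32_t globalBase) {
  std::vector<std::uint8_t> out;
  for (const Segment& seg : segments) {
    assert(seg.offset >= globalBase);
    std::size_t start = seg.offset - globalBase;
    if (out.size() < start + seg.data.size()) {
      out.resize(start + seg.data.size(), 0); // zero-fill any gap
    }
    std::copy(seg.data.begin(), seg.data.end(), out.begin() + start);
  }
  return out;
}
```

With a global base of 1024, a segment at offset 1024 lands at index 0 of the file, so the 1024 unneeded leading bytes are never written.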
|
|
|
| |
We landed two PRs that had a logic conflict but not a source conflict (bulk memory added ops, comparison optimization removed the need for PUSH ops that bulk memory added).
|
| |
|
|
|
| |
The hardcoded 16 size was no longer valid. This was broken for a while, but happened to not overwrite important memory. Testing with the wasm backend did hit breakage.
|
|\
| |
| |
| | |
each new node (#1895)
|
| |
| |
| |
| | |
each new node
|
| |
| |
| |
| |
| | |
(#1894)
Fixes #1893
|
| |
| |
| |
| |
| |
| | |
Bulk memory operations
The only parts missing are the interpreter implementation
and spec tests.
|
|/
|
| |
To calculate the metadata, we must look at the segments. If we split them out earlier (which we do for threads), they aren't there.
|
|
|
|
|
|
|
|
| |
WebAssembly/tool-conventions#93 has a summary of emscripten's current thinking on this. For Binaryen, we don't want to do anything to the producers section by default, but do want it to be possible to optionally remove it. To achieve that, this PR
* creates a --strip-producers pass that removes that section.
* creates a --strip-debug pass that removes debug info, same as the old --strip, which is still around but deprecated.
A followup in emscripten will use this pass by default.
|
|
|
| |
Before this, we just did not emit illegal dynCalls. This was wrong as we do need them (e.g. if a function with a setjmp call calls a function with an i64 param - we'll have an invoke with that signature there). We just need to legalize them. This fixes that by first emitting them, and second by running legalization late, after dynCalls have been generated, so it legalizes them too.
|
|
|
|
|
|
|
|
|
|
| |
FuncCastEmulation supports a hardcoded number of parameters:
// This should be enough for everybody. (As described above, we need this
// to match when dynamically linking, and also dynamic linking is why we
// can't just detect this automatically in the module we see.)
static const int NUM_PARAMS = 15;
Turns out 15 is not enough for everybody: Ruby 2.6.0 needs NUM_PARAMS = 16. This patch is necessary to support Ruby 2.6.0 in WebAssembly, and in fact is the only patch needed to make the relevant build process work with an otherwise normal emscripten toolchain.
|
|
|
|
|
|
|
| |
See emscripten-core/emscripten#7928 - we have been optimizing all wasms until now, and noticed this when the wasm object file path did not do so. When not optimizing, our methods of handling EM_ASM and EM_JS fail since the patterns are different.
Specifically, for EM_ASM we hunt for emscripten_asm_const(X, where X is a constant, but without opts it may be a get of a local. For EM_JS, the function body may not just contain a const, but a block with a set of the const and a return of a get later.
This adds logic to track gets and sets in basic blocks, which is sufficient to handle this.
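A sketch of the kind of tracking described, restricted to a single basic block. `Inst` and `resolveGet` are hypothetical names; the assumption is that a get can be resolved when the most recent set of the same local in the block assigns a constant.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One instruction in a basic block, in program order.
struct Inst {
  enum Kind { SetConst, SetUnknown, Get } kind;
  std::size_t local;
  std::int32_t value; // only meaningful for SetConst
};

// Tries to resolve the get at getIndex to a constant by walking backwards
// to the most recent set of the same local within the block.
bool resolveGet(const std::vector<Inst>& block,
                std::size_t getIndex,
                std::int32_t& out) {
  assert(getIndex < block.size() && block[getIndex].kind == Inst::Get);
  std::size_t local = block[getIndex].local;
  for (std::size_t i = getIndex; i-- > 0;) {
    const Inst& inst = block[i];
    if (inst.local != local || inst.kind == Inst::Get) {
      continue;
    }
    if (inst.kind == Inst::SetConst) {
      out = inst.value;
      return true;
    }
    return false; // most recent set is not a constant
  }
  return false; // no set in this block
}
```

In the unoptimized EM_ASM case, the argument to `emscripten_asm_const` would be the get, and the resolved constant plays the role of `X`.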
|