| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds a special helper functions for data.drop etc., as unlike most
wasm instructions these are too big to emit inline.
Track passive segments at runtime in var memorySegments
whose indexes are the segment indexes.
Emit var bufferView even if the memory exists even without
memory segments, as we do still need the view in order to
operate on it.
Also adds a few constants for atomics that will be useful in future
PRs (as this PR updates the constant lists anyhow).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Necessary preparations for a later PR that adds bulk memory
support to wasm2js (splitting this out so the next PR is less big):
* Fix logging of print functions in wasm2js spec tests, there
was an extra space in the output (which console.log adds
automatically between items).
* Don't assume the output is always empty, as some tests
(like bulk memory) do have expected output.
* Rename test/bulk-memory.wast as it "conflicts" with
test/spec/bulk-memory.wast - the problem is that we scan
both places, and emit files in test/wasm2js/*, so those
would collide if the names overlap.
* Extend the parsing and emitting of JS for (assert.. ) lines such as
(assert_return (invoke "foo" (i32.const 1)) (i32.const 2))
to also handle
(invoke "foo" (i32.const 1)) (i32.const 2))
Without this, we skip (invoke ..) lines in spec tests, which normally is
fine, but in bulk memory at least they have side effects we need - we
must run them before the later assertions.
|
|
|
|
|
|
|
|
|
| |
The plan is that for standlone mode we can function just like wasi-sdk
and call the correct main from crt1.c.
For non-standalone mode we still want to export can call main directly
so we rename __main_argc_argv back to main as part of finalize.
See https://reviews.llvm.org/D70700
|
|
|
|
|
|
|
| |
anyref future semantics were changed to only represent opaque host values, and thus renamed to externref.
[Chromium](https://bugs.chromium.org/p/v8/issues/detail?id=7748#c360) was just updated to today (not yet released). I couldn't find a Mozilla bugzilla ticket mentioning externref so I don't immediately know if they've updated yet.
https://github.com/WebAssembly/reference-types/pull/87
|
|
|
|
|
| |
Reland of #2864
Also ensure a relative install rpath by adding setup to each tool config.
The CMake code is cribbed from LLVM's implementation.
|
|
|
| |
This reverts commit f6b7f0018ca5ce604e94cc6cf50ee712bb7e9b27.
|
|
|
| |
When building the libbinaryen dynamic library, also link the binaryen tools against it. This reduces combined tool size on mac from 76M to 2.8M
|
|
|
|
|
|
| |
This moves the fuzzer de-NaN logic out into a separate pass. This is
cleaner and also better since the old way would de-NaN once, but then
the reducer could generate code with nans. The new way lets us de-NaN
while reducing.
|
|
|
|
|
|
|
| |
These are now implemented in assembly as part of emscripten's
compiler-rt.
See: https://github.com/emscripten-core/emscripten/pull/11166
|
|
|
|
|
|
|
|
|
| |
This adds interpreter support for EH instructions. This adds
`ExceptionPackage` struct, which contains info of a thrown exception (an
event tag and thrown values), and the union in `Literal` can take a
`unique_ptr` to `ExceptionPackage`. We need a destructor, a copy
constructor, and an assignment operator for `Literal`, because the union
in `Literal` now has a member that cannot be trivially copied or
deleted.
|
|
|
|
|
|
| |
The refactoring of the loop in #2812 was wrong - we need to
loop over all the exports and ignore the non-function ones.
Rewrote it to stress that part.
|
| |
|
|
|
|
|
|
| |
Avoid pass-debug when fuzzing emcc, as it can be slow and isn't
what we care about.
Clean up a loop.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We had some ad-hoc tuning of which nodes to emit more
frequently in the fuzzer, but it wasn't very good. Things like
loads and stores for example were far too rare. Also it wasn't
easy to adjust the frequencies.
This adds a simple way to adjust them, by passing a size_t
which is the "weight" of that node. Then it just makes that
number of copies of it, making it more likely to be picked.
Example output comparison:
node before after
================================
binary 281 365
block 898 649
break 278 144
call 182 290
call_indirect 9 42
const 808 854
drop 43 92
global.get 440 398
global.set 223 171
if 335 254
load 22 84
local.get 429 301
local.set 434 211
loop 176 99
nop 117 54
return 264 197
select 8 33
store 1 39
unary 405 304
unreachable 1 2
Lots of noise here obviously, but there are large increases
for loads and stores compared to before.
Also add a testcase of random data of the typical size the
fuzzer runs, and print metrics on it. This might help us get
a feel for how future tuning changes affect frequencies.
|
|
|
| |
Since the --roundtrip pass is more general than --fuzz-binary anyways. Also reimplements `ModuleUtils::clearModule` to use the module destructor and placement new to ensure that no members are missed.
|
|
|
|
|
|
|
|
|
| |
This adds a variant on wasm2c that uses emcc instead of a
native compiler. This helps us fuzz emcc.
To make that practical, rewrite the setjmp glue to only use one
setjmp. The wasm backend ends up doing linear work per setjmp,
so it's quadratic with many setjmps. Instead, do a big switch-loop
construct around a single setjmp.
|
|
|
|
|
|
|
| |
Without this we emitted a binary, which confused the size comparisons.
(When reducing a smaller size is usually a good sign. And also it provides
a deterministic way to know when to stop - we can't infinite loop if we keep
going while the size shrinks.)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for fuzzing with wabt's wasm2c that @binji wrote.
Basically we compile the wasm to C, then compile the C to a native
executable with a custom main() to wrap around it. The executable
should then print exactly the same as that wasm when run in either
the binaryen interpreter or in a JS VM with our wrapper JS for that
wasm. In other words, compiling the wasm to C is another way to
run that wasm.
The main reasons I want this are to fuzz wasm2c itself, and to
have another option for fuzzing emcc. For the latter, we do fuzz
wasm-opt quite a lot, but that doesn't fuzz the non-wasm-opt
parts of emcc. And using wasm2c for that is nice since the
starting point is always a wasm file, which means we
can use tools like wasm-reduce and so forth, which can be
integrated with this fuzzer.
This also:
Refactors the fuzzer harness a little to make it easier to
add more "VMs" to run wasms in.
Do not autoreduce when re-running a testcase, which I hit
while developing this.
|
|
|
|
|
|
|
|
|
| |
1. Only emit exnref as part of a subtype if exception-handling is
enabled in the fuzzer.
2. Correctly report that funcref and nullref require reference-types
to be enabled.
3. Re-enable multivalue as a normal feature in the fuzzer.
Possibly fixes #2770.
|
|
|
|
|
|
|
|
|
| |
We should only do weird changes to the fuzz code if we
allow out of bounds operations, because the OOB checks
are generated as we build the IR, and changing them can
remove the checks.
(we fuzz 50% of the time with and 50% without OOBs,
so this doesn't really hurt us)
|
|
|
|
|
|
|
|
| |
Emit tuple.make, tuple.extract, and multivalue control flow, and tuple locals
and globals when multivalue is enabled. Also slightly refactors the top-level
`makeConcrete` function to be more selective about what it tries to
make based on the requested type to reduce the number of trivial nodes
created because the requested type is incompatible with the requested
node.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main benefit here is comparing VMs, instead of just comparing
each VM to itself after opts. Comparing VMs is a little tricky since there
is room for nondeterminism with how results are printed and other
annoying things, which is why that didn't work well earlier.
With this PR I can run 10's of thousands of iterations without finding
any issues between v8 and the binaryen interpreter. That's after
fixing the various issues over the last few days as found by this:
#2760 #2757 #2750 #2752
Aside from that main benefit I ended up adding more improvements
to make it practical to do all that testing:
Randomize global fuzz settings like whether we allow NaNs and
out-of-bounds memory accesses. (This was necessary here since
we have to disable cross-VM comparisons if NaNs are enabled.)
Better logging of statistics like how many times each handler
was run.
Remove redundant FuzzExecImmediately handler (looks like
after past refactorings it was no longer adding any value).
Deterministic testcase handling: if you run e.g. fuzz_opt.py 42 it
will run one testcase and exactly the same one. If you run without
an argument it will run forever until it fails, and if it fails, it prints out
that ID so that you can easily reproduce it (I guess, on the same
binaryen + same python, not sure how python's deterministic
RNG changes between versions and builds).
Upgrade to Python 3.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In wasm2js we ignore things that trap in wasm that we can't
really handle, like a load from memory out of bounds would
trap in wasm, but in JS we don't want to emit a bounds check
on each load. So wasm2js focuses on programs that don't
trap.
However, this is annoying in the fuzzer as it turns out that
our behavior for places where wasm would trap was not
deterministic. That is, wasm would trap, wasm2js would not
trap and do behavior X, and wasm2js with optimizations
would also not trap but do behavior Y != X. This produced
false positives in the fuzzer (and might be annoying in
manual debugging too).
As a workaround, this adds a --deterministic flag to wasm2js,
which tries to be deterministic about what it does for cases
where wasm would trap. This handles the case of an int
division by 0 which traps in wasm but without this flag could
have different behavior in wasm2js with or without opts
(see details in the patch).
|
|
|
|
|
|
|
|
|
| |
These seem to be accidentally introduced in when we enforced use of
`Type::` on type names in #2434.
By the way TIL this actually compiles, and don't know why:
```
Type::Type::Type::Type::Type::Type::Type::Type::none
```
|
|
|
|
|
| |
The fuzzer was previously unconditionally emitting one event parameter
more than it was supposed to, which meant multivalue events were
emitted when multivalue was not enabled.
|
|
|
|
|
| |
If wasm-emscripten-finalize is given the BigInt flag, then we will
be using BigInts on the JS side, and need no legalization at all
since i64s will just be BigInts.
|
|
|
|
|
|
| |
Unless the multivalue feature is enabled. The validation for events
recently changed to disallow events returning multiple items unless
the multivalue feature is enabled, but the fuzzer was not updated
accordingly. This PR fixes the glitch.
|
|
|
|
|
|
|
| |
Since it wasn't easy to support tuples in Asyncify's call support
using temporary functions, we decided to allow tuple-typed globals
after all. This PR adds support for parsing, printing, lowering, and
interpreting tuple globals and also adds validation ensuring that
imported and exported globals do not have tuple types.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Depends on emscripten-core/emscripten#10741
which ensures that table indexes are unique. With that guarantee,
a main module can just add its function pointers into the table, and
use them based on that index. The loader will then see them in the
table and then give other modules the identical function pointer for
a function, ensuring function pointer equality.
This avoids calling fp$ functions during startup for the main
module's own functions (which are slow). We do still call fp$s
of things we import from outside, as we don't have anything to
put in the table for them, we depend on the loader for that.
I suspect this can also be done with SIDE_MODULES, but did not
want to try too much at once.
|
|
|
|
|
|
| |
The meaning we intend is "constant", and not the "Const"
node (which contains a number). So I think the full
name is less confusing.
|
|
|
|
| |
Also increases the usefulness of a couple wasm-builder methods that
are useful here.
|
|
|
|
|
|
| |
This involves replacing `Literal::makeZero` with `Literal::makeZeroes`
and `Literal::makeSingleZero` and updating `isConstantExpression` to
handle constant tuples as well. Also makes `Literals` its own struct
and adds convenience methods on it.
|
|
|
|
| |
Updates the interpreter to properly flow vectors of values, including
at function boundaries. Adds a small spec test for multivalue return.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Don't print the entire module on an error. Instead, just print
the validation errors.
However, if the user passed --print, then do print it, as otherwise
nothing would get printed - the error would be before the pass
to print happens. And in general a user passing in a request
to print would expect a printed module anyhow.
fixes #2634
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've been throwing some fuzzing at some wasm implementations recently
and one of our strategies is to use `wasm-opt -ttf` with the fuzzer's
input to generate a wasm module. Some of our tests, though, have been
failing due to out-of-memory while `wasm-opt` is generating a module.
This loop appears to be infinitely executing since the input
just-so-happens that `oneIn(3)` returns true for every byte of the input
file, or at least enough such that when the xor factor is merged in it
at least generates very long sequence of `true`.
It looks like elsewhere in the file when `while (oneIn(N))` is used it's
also guarded by `!finishedInput`, so I've added a similar guard here as
well.
|
|
|
|
|
|
|
|
|
|
|
| |
This makes the interpreter trap when the signature in `call_indirect`
instruction and that of the actual function in the table mismatch. This
also makes the `wasm-ctor-eval` not evaluate `call_indirect` in case the
signatures mismatch.
Before we only compared the arguments' signature and the function
signature, which was sufficient before we had subtypes, but now the
signature in `call_indirect` and that of the actual function can be
different even if the argument's signature is OK.
|
|
|
|
|
|
|
|
|
| |
When memory is packed and there are passive segments, bulk memory
operations that reference those segments by index need to be updated to
reflect the new indices and possibly split into multiple instructions
that reference multiple split segments. For some bulk-memory operations,
it is necessary to introduce new globals to explicitly track the drop
state of the original segments, but this PR is careful to only add
globals where necessary.
|
|
|
|
|
|
|
|
|
|
|
|
| |
isBinary was used where we should only accept
a signed binary, as removing the | 0 from an unsigned
value may be incorrect.
This does regress a few small things (as can be seen
in the diff). If it's important we can add more sophisticated
optimizations here, perhaps like an assumption that the
signedness of a local never matters.
Fixes emscripten-core/emscripten#10173
|
|
|
|
|
|
|
|
|
|
| |
* Remove implicit conversion operators from Type
Now types must be explicitly converted to uint32_t with Type::getID or
to ValueType with Type::getVT. This fixes #2572 for switches that use
Type::getVT.
* getVT => getSingle
|
| |
|
| |
|
|
|
|
|
| |
This uses `FeatureSet` in place of `FeatureSet::Feature` when possible,
making it possible for functions take a set of multiple features as one
argument.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for the reference type proposal. This includes support
for all reference types (`anyref`, `funcref`(=`anyfunc`), and `nullref`)
and four new instructions: `ref.null`, `ref.is_null`, `ref.func`, and
new typed `select`. This also adds subtype relationship support between
reference types.
This does not include table instructions yet. This also does not include
wasm2js support.
Fixes #2444 and fixes #2447.
|
|
|
|
|
|
|
|
| |
(#2544)
Without this, the first wasm-opt invocation will remove it. But it
can be very large, and we will soon start to automatically do
updating on it when it exists, so avoid the work if we aren't
actually building a final output with dwarf.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Optionally track the binary format code section offsets,
that is, when loading a binary, remember where each IR
node was read from. This is necessary for DWARF
debug info, as these are the offsets DWARF refers to.
(Note that eventually we may want to do something
else, like first read the DWARF and only then add
debug info annotations into the IR in a more LLVM-like
manner, but this is more straightforward and should be
enough to update debug lines and ranges).
This tracking adds noticeable overhead - every single
IR node adds an entry in a map - so avoid it unless
actually necessary. Specifically, if the user passes in
-g and there are actually DWARF sections in the
binary, and we are not about to remove those sections,
then we need it.
Print binary format code section offsets in text, when
printing with -g. This will help debug and test dwarf
support. It looks like
;; code offset: 0x7
as an annotation right before each node.
Also add support for -g in wasm-opt tests (unlike
a pass, it has just one - as a prefix).
Helps #2400
|
|
|
|
|
|
|
|
|
| |
In normal mode we call a JS import, but we can't import from JS
in standalone mode. Instead, just trap in that case with an
unreachable. (The error reporting is not as good in this case, but
at least it catches all errors and halts, and the emitted wasm is
valid for standalone mode.)
Helps emscripten-core/emscripten#10019
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the current spec, `local.tee`'s return type should be the
same as its local's type. (Discussions on whether we should change this
rule is going on in WebAssembly/reference-types#55, but here I will
assume this spec does not change. If this changes, we should change many
parts of Binaryen transformation anyway...)
But currently in Binaryen `local.tee`'s type is computed from its
value's type. This didn't make any difference in the MVP, but after we
have subtype relationship in #2451, this can become a problem. For
example:
```
(func $test (result funcref) (local $0 anyref)
(local.tee $0
(ref.func $test)
)
)
```
This shouldn't validate in the spec, but this will pass Binaryen
validation with the current `local.tee` implementation.
This makes `local.tee`'s type computed from the local's type, and makes
`LocalSet::makeTee` get a type parameter, to which we should pass the
its corresponding local's type. We don't embed the local type in the
class `LocalSet` because it may increase memory size.
This also fixes the type of `local.get` to be the local type where
`local.get` and `local.set` pair is created from `local.tee`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Function signatures were previously redundantly stored on Function
objects as well as on FunctionType objects. These two signature
representations had to always be kept in sync, which was error-prone
and needlessly complex. This PR takes advantage of the new ability of
Type to represent multiple value types by consolidating function
signatures as a pair of Types (params and results) stored on the
Function object.
Since there are no longer module-global named function types,
significant changes had to be made to the printing and emitting of
function types, as well as their parsing and manipulation in various
passes.
The C and JS APIs and their tests also had to be updated to remove
named function types.
|
| |
|