| Commit message | Author | Age | Files | Lines |
| |
Since #4173 the fuzzer shows the initial contents and asks the user whether
they want to proceed, but when the fuzzer is used within a script called
from `wasm-reduce`, we shouldn't pause for user input. This change shows
the prompt only when no seed is given.
To do that, we now initialize the important initial contents from
`main`; previously we assigned those variables before `main` started.
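For context, a rough sketch of that behavior in the Python harness (the
function and variable names here are illustrative, not the actual code in
fuzz_opt.py):

  import sys

  def maybe_prompt(given_seed, initial_contents):
      # Only pause for user input in interactive runs; scripts driven by
      # wasm-reduce always pass a seed, so they skip the prompt entirely.
      if given_seed is not None:
          return
      print('initial contents:', initial_contents)
      if input('proceed? [Y/n] ').strip().lower().startswith('n'):
          sys.exit(1)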
|
| |
See #4149.
This modifies the test added in #4163, which used static casts on
dynamically-created structs and arrays. That was technically not
valid (as we don't want users to "mix" the two forms). This makes the
test 100% static, which both fixes the test and gives test coverage
to the new instructions added here.
|
| |
This is another attempt to address #4073. Instead of relying on timestamps,
this examines the git log to gather the list of test files added
or modified within some fixed number of days. The number of days is
currently set to 30 (= 1 month) but can be changed. This is enabled
by `--auto-initial-contents`, which is now disabled by default.
Hopefully fixes #4073.
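A rough sketch of how such a query could look in the Python harness (the
exact git invocation and file filter used by the script may differ):

  import subprocess

  def recent_test_files(days=30):
      # Files added or modified in the last `days` days according to git
      # history, rather than filesystem timestamps.
      out = subprocess.check_output(
          ['git', 'log', '--since=%d.days' % days, '--name-only',
           '--pretty=format:'], text=True)
      return sorted(set(line for line in out.splitlines()
                        if line.startswith('test/')))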
|
| |
We added an optional ReFinalize in OptimizeInstructions at some point,
but that is not valid: a ReFinalize only correctly updates types when all
other work is done, but the pass works incrementally. The bug the fuzzer
found is that a child is changed to be unreachable, and then the parent is
optimized before finalize() is called on it, which led to an assertion being
hit (as the child was unreachable but the parent was not, even though it
should have been).
To fix this, do not change types in this pass. Emit an extra block with a
declared type when necessary. Other passes can remove the extra block.
|
| |
This PR helps with functions like this:
function foo(x) {
if (x) {
..
lots of work here
..
}
}
If "lots of work" is large enough, then we won't inline such a
function. However, we may end up calling into the function
only to get a false on that if and immediately exit. So it is useful
to partially inline this function, basically by splitting it
into a condition part that is inlineable
function foo$inlineable(x) {
if (x) {
foo$outlined(x);
}
}
and an outlined part that is not inlineable:
function foo$outlined(x) {
..
lots of work here
..
}
We can then inline the inlineable part. That means that a call
like
foo(param);
turns into
if (param) {
foo$outlined(param);
}
In other words, we end up replacing a call and then a check with
a check and then a call. Any time that the condition is false, this
will be a speedup.
The cost here is increased code size, as we duplicate the condition
into the callsites. For that reason, only do this when optimizing
heavily for speed rather than for size.
This is a 10% speedup on j2cl. This helps two types of functions
there: Java class inits, which often look like "have I been
initialized before? if not, do all this work", and also assertion
methods which look like "if the input is null, throw an
exception".
|
| |
Avoids a crash when calling getHeapType on something that doesn't have one.
Also adds the relevant lit test (and a few others) to the list of files to
fuzz more heavily.
|
| |
tablify() attempts to turn a sequence of br_ifs into a single
br_table. This PR adds some flexibility to the pattern it
looks for, specifically:
* Accept i32.eqz as a comparison to zero, rather than only looking
for i32.eq against a constant.
* Allow the first condition to be a tee. If it is, compare later
conditions to local.get of that local.
This will allow more br_tables to be emitted in j2cl output.
|
| |
Some functions run only once with this pattern:
function foo() {
if (foo$ran) return;
foo$ran = 1;
...
}
If that global is not ever set to 0, then the function's payload (after the
initial if and return) will never execute more than once. That means we
can optimize away dominated calls:
foo();
foo(); // we can remove this
To do this, we find which globals are "once", meaning they fit that
pattern and are never set to 0. If a function looks like the
above pattern, and its global is "once", then the function is "once" as
well, and we can perform this optimization.
This removes over 8% of static calls in j2cl.
|
| |
When we catch an "Output must be deterministic" error we can't see any details.
This PR fixes that, and now we can see a diff of the b1.wasm and b2.wasm files.
Example output:
Output must be deterministic.
Diff:
--- expected
+++ actual
@@ -2072,9 +2072,7 @@
)
(drop
(block $label$16 (result funcref)
- (local.set $10
- (ref.null func)
- )
+ (nop)
(drop
(call $22
(f64.const 0.296)
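For reference, a sketch of how such a diff can be produced, assuming the two
wasm files have already been disassembled to text; the helper name is made up:

  import difflib

  def print_determinism_diff(expected_text, actual_text):
      # Unified diff of the two disassembled modules, labeled like the
      # output above.
      diff = difflib.unified_diff(expected_text.splitlines(),
                                  actual_text.splitlines(),
                                  fromfile='expected', tofile='actual',
                                  lineterm='')
      print('\n'.join(diff))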
|
| |
Technically this is not a new pass, but it is a rewrite almost from scratch.
Local Common Subexpression Elimination looks for repeated patterns,
stuff like this:
x = (a + b) + c
y = a + b
=>
temp = a + b
x = temp + c
y = temp
The old pass worked on flat IR, which is inefficient, and was overly
complicated because of that. The new pass uses a new algorithm that
I think is pretty simple, see the detailed comment at the top.
This keeps the pass enabled only in -O4, like before - right after
flattening the IR. That is to make this as minimal a change as possible.
Followups will enable the pass in the main pipeline, that is, we will
finally be able to run it by default. (Note that to make the pass work
well after flatten, an extra simplify-locals is added - the old pass used
to do part of simplify-locals internally, which was one source of
complexity. Even so, some of the -O4 tests have changes, due to
minor factors - just different orderings etc. - which can be
seen by inspecting the outputs before and after using e.g.
--metrics.)
This plus some followup work leads to large wins on wasm GC output.
On j2cl there is a common pattern of repeated struct.gets, so common
that this pass removes 85% of all struct.gets, which makes the total
binary 15% smaller. However, on LLVM-emitted code the benefit is
minor, less than 1%.
|
| |
| |
Now that the features section adds on top of the command-line arguments,
the way we test whether initial contents are OK to use no longer works if
the wasm has a features section - the section will enable a feature even
when we want to see if the wasm can work without that feature. To fix this,
strip the features section there.
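A sketch of that stripping step, assuming the strip-target-features pass is
what gets used (the script may do this differently):

  import subprocess

  def strip_features_section(in_wasm, out_wasm):
      # Drop the features custom section so later feature checks are driven
      # purely by command-line flags. -all is passed just so the module
      # parses regardless of what it uses.
      subprocess.check_call(['wasm-opt', in_wasm, '-all',
                             '--strip-target-features', '-o', out_wasm])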
|
| |
This works around the issue of wasm GC types sometimes getting
truncated (as the default names can be very long or even infinitely
recursive). If the truncation leads to a name collision, the wast is not
valid.
|
| |
The features section is additive since #3960. For the fuzzer to know which features
are used, it therefore needs to also scan the features section. To do this,
run --print-features to get the total set of features used from both the flags
and the features section.
A result of this is that we now have a list of enabled features instead of
"enable all, then disable". This is actually clearer, I think, but it does require
inverting the logic in some places.
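Roughly, the harness can gather the combined feature set like this (the
assumption here is that --print-features prints one feature flag per line):

  import subprocess

  def all_enabled_features(wasm, feature_flags):
      # Ask wasm-opt for the final feature set: the flags we pass plus
      # whatever the module's features section declares.
      out = subprocess.check_output(
          ['wasm-opt', wasm, '--print-features'] + feature_flags, text=True)
      return [line.strip() for line in out.splitlines() if line.strip()]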
|
| |
Emscripten stopped emitting shell support code by default (as most users
run node.js, but here we are literally fuzzing d8).
Fixes #3967
|
| |
The support code there emits "low high" as the result; for example, 25 0 would
be 25 (as the high bits are all 0). This is different from how numbers are reported
in the other things we fuzz, so this caused an error.
Fixes #3915
|
| |
If we allocate some GC data, and do not let the reference escape, then we can
replace the allocation with locals - basically one local for each field in the
allocation. This avoids the allocation, and also allows us to optimize the locals
further.
On the Dart DeltaBlue benchmark, this is a 24% speedup (making it faster than
the JS version, incidentally), and also a 6% reduction in code size.
The tests are not the best way to show what this does, as the pass assumes
other passes will clean up after it. Here is an example to clarify. First, in pseudocode:
ref = new Int(42)
do {
ref.set(ref.get() + 1)
} while (import(ref.get()))
That is, we allocate an int on the heap and use it as a counter - unnecessarily,
as it could be a normal int on the stack.
Wat:
(module
;; A boxed integer: an entire struct just to hold an int.
(type $boxed-int (struct (field (mut i32))))
(import "env" "import" (func $import (param i32) (result i32)))
(func "example"
(local $ref (ref null $boxed-int))
;; Allocate a boxed integer of 42 and save the reference to it.
(local.set $ref
(struct.new_with_rtt $boxed-int
(i32.const 42)
(rtt.canon $boxed-int)
)
)
;; Increment the integer in a loop, looking for some condition.
(loop $loop
(struct.set $boxed-int 0
(local.get $ref)
(i32.add
(struct.get $boxed-int 0
(local.get $ref)
)
(i32.const 1)
)
)
(br_if $loop
(call $import
(struct.get $boxed-int 0
(local.get $ref)
)
)
)
)
)
)
Before this pass, the optimizer could do essentially nothing with this.
Even with this pass, running -O1 has no effect, as the pass is only
used in -O2+. However, running --heap2local -O1 leads to this:
(func $0
(local $0 i32)
(local.set $0
(i32.const 42)
)
(loop $loop
(br_if $loop
(call $import
(local.tee $0
(i32.add
(local.get $0)
(i32.const 1)
)
)
)
)
)
)
All the GC heap operations have been removed, and we just
have a plain int now, allowing a bunch of other opts to run. That
output is basically the optimal code, I think.
|
| |
| |
We only ignored known issues if the process failed. However, some things
do not bring the process down but still need to be ignored, which this
fixes.
Another approach might be to make all the things we need to ignore fail
the entire process. However, that could be annoying for other debugging:
we don't want, say, hitting a VM limit on recursion to bring
down the entire process, as those limits manifest as traps, and we can
still run after them (and do need to test that). The specific host limit that
made me fix this was the trap on OOM when trying to allocate an array
of size 4GB.
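A minimal sketch of the adjusted check (the marker strings and helper names
are illustrative only):

  # Substrings that indicate a known, ignorable problem.
  KNOWN_ISSUES = [
      'hit recursion limit',   # hypothetical example
      'failed to allocate',    # hypothetical example
  ]

  def check_run(returncode, output):
      # Previously the known-issue list was only consulted when the process
      # failed; now any run whose output mentions one is ignored.
      if any(issue in output for issue in KNOWN_ISSUES):
          return None  # ignore this run entirely
      if returncode != 0:
          raise Exception('unexpected failure:\n' + output)
      return output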
|
| |
There is a conflict between multivalue and GC; see the details in the
comment. There isn't a good way to get the fuzzer to avoid the combination
of them, and GC is more urgent, so disable multivalue in that area for now.
(This does not disable all multivalue fuzzing - the fuzzer can still emit
multivalue code itself. This just disables initial content from the test suite
that uses multivalue, which is enough for now, until the fuzzer can emit more
GC things, and then we'll need to do more.)
|
| |
Host limitations are arbitrary and can be affected by optimizations, so
ignore them. For example, if the optimizer removes allocations then an
error about hitting a host allocation limit may vanish. Or, an optimization
that removes recursion and replaces it with a loop may avoid a host limit
on call depth (that is not done currently, but might be some day).
This removes a class of annoying false positives in the fuzzer.
|
| |
The fuzzer doesn't generate much GC code yet, but it does fuzz things in
the test suite and adds fuzz to them. This PR allows GC when using initial
content, and also in CompareVMs, both of which have been fuzzed for
days locally for me with no issues.
|
| |
| |
We give br_if too specific a type: #3767
This is only noticeable with GC, and in rare cases where the type of the br_if
is actually used - which realistically it never is, so this really only affects fuzzer testcases.
|
| |
RTTs are not defaultable, and we cannot spill them to locals.
|
| |
The problem is that a tuple with a non-nullable element cannot be stored
to a local. We'd need to split up the tuple, but that raises questions about
what should be allowed in flat IR (we'd need to allow nested tuple ops
in more places). That combination doesn't seem urgent, so add a clear
error for now, and avoid it in the fuzzer.
Avoids #3759 in the fuzzer
|
| |
| |
The check for a valid wasm file must differ depending on whether the wasm has
a features section, so just try both ways, with --detect-features
and --all-features. If the wasm is valid, at least one will work.
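A sketch of that two-way check (the exact validation command in the script
may differ):

  import subprocess

  def is_valid_wasm(filename):
      # A wasm with a features section should validate with
      # --detect-features; one without may need --all-features.
      for flag in ['--detect-features', '--all-features']:
          result = subprocess.run(['wasm-opt', filename, flag],
                                  capture_output=True)
          if result.returncode == 0:
              return True
      return False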
|
| |
And demonstrate its capabilities by porting all tests of the
optimize-instructions pass to use lit and FileCheck.
|
| |
This adds rtt.canon and rtt.sub together with RTT type support
that is necessary for them. Together this lets us test roundtripping the
instructions and types.
Also fixes a missing traversal over globals in collectHeapTypes,
which the example from the GC docs requires, as the RTTs are in
globals there.
This does not yet add full interpreter support and other things. It
disables initial contents on GC in the fuzzer, to avoid the fuzzer
breaking.
Renames the binary ID for exnref, which is being removed from
the spec, and which overlaps with the binary ID for rtt.
|
| |
When running d8, run it in Liftoff, to avoid tier-up causing nondeterminism
in the results.
When we do want to compare the tiers, we already do so in CompareVMs. This
fixes other places where we just wanted to run some JS in some VM.
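Roughly, the non-comparison runs now pass Liftoff-only flags to d8; the flag
names below are an assumption about what the harness uses:

  import subprocess

  V8_LIFTOFF_ARGS = ['--liftoff', '--no-wasm-tier-up']

  def run_d8(js_file, extra_args=None):
      # Stay in Liftoff so repeated runs don't diverge due to tier-up;
      # CompareVMs still compares the tiers explicitly.
      cmd = ['d8'] + V8_LIFTOFF_ARGS + [js_file] + list(extra_args or [])
      return subprocess.check_output(cmd, text=True)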
|
| |
bugs (#3401)
* Count signatures in tuple locals.
* Count nested signature types (confirming @aheejin was right, that was missing).
* Inlining was using the wrong type.
* OptimizeInstructions should return -1 for unhandled types, not error.
* The fuzzer should check for ref types as well, not just typed function references,
similar to what GC does.
* The fuzzer now creates a function if it has no other option for creating a constant
expression of a function type, then does a ref.func of that.
* Handle unreachability in call_ref binary reading.
* S-expression parsing fixes in more places, and add a tiny fuzzer for it.
* Switch fuzzer test to just have the metrics, and not print all the fuzz output which
changes a lot. Also fix noprint handling which only worked on binaries before.
* Fix Properties::getLiteral() to use the specific function type properly, and make
Literal's function constructor require that, to prevent future bugs.
* Turn all input types into nullable types, for now.
|
| |
Previously we picked one of the two compilers at the top level. But that doesn't
actually compare between them directly - each entire run used one of the two.
Instead, add separate "VMs" for each of them, and keep the existing D8 VM as
well (which tests tiering up).
The code also seems nicer this way.
|
| |
OptimizeInstructions is seeing the most work these days, so it's good for
the fuzzer to focus on that some more.
Also move some code around in the main test wast: it's useful to put each
feature in its own module to maximize the chance of it being used.
That is, if a module has a single use of atomics, then if atomics are disabled
in the current run, we can't use any of the module and we skip that initial
content entirely. Moving each feature to its own module reduces that risk.
(We do pick randomly between the modules, and at the moment a small module has
the same chance as a big one, but this still seems worth it.)
|
| |
Previously the fuzzer constructed a new random valid wasm file from
scratch. The new --initial-fuzz=FILENAME option makes it start from
an existing wasm file, and then add random contents on top of that. It
also randomly modifies the existing contents, for example tweaking
a Const, replacing some nodes with other things of the same type, etc.
It also has a chance to replace a drop with a logging call (as some of our
tests just drop a result, and the test matches the optimized output's wasm
instead of the result; by logging the value, the fuzzer can actually check it).
The goal is to find bugs by using existing hand-written testcases as
a basis. This PR uses the test suite's testcases as initial fuzz contents.
This can find issues because such testcases often check corner cases - they are
designed to be "interesting" in ways that random data is less likely to hit.
This has found several bugs already, see recent fuzz fixes. I mentioned
the first few on Twitter but past 4 I stopped counting...
https://twitter.com/kripken/status/1314323318036602880
This required various changes to the fuzzer's generation to account
for the fact that there can be existing functions and other contents before
it starts to run, so it needs to avoid name collisions and similar issues.
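As a usage sketch (file names are placeholders, and other flags such as
feature flags are omitted):

  import subprocess

  def generate_wasm(random_input, initial_wasm, out_wasm):
      # -ttf turns the random bytes into a valid module; --initial-fuzz
      # makes generation start from the given wasm and build on top of it.
      subprocess.check_call(['wasm-opt', random_input, '-ttf',
                             '--initial-fuzz=' + initial_wasm,
                             '-o', out_wasm])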
|
| |
We used to either apply all of them or pick each at random. Also add a chance
to pick none at all.
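A minimal sketch of that kind of selection, assuming we are choosing from a
list of items such as passes (names are made up):

  import random

  def pick_items(all_items):
      mode = random.choice(['all', 'random', 'none'])
      if mode == 'all':
          return list(all_items)
      if mode == 'none':
          return []
      # Otherwise pick each item independently at random.
      return [item for item in all_items if random.random() < 0.5]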
|
| |
If we can't run the -ttf stage, there is no point in printing out the
instructions to reduce things - we can't reduce without a wasm.
|
| |
(#3269)
|
| |
This PR contains:
- Changes that enable/disable tests on Windows to allow for better local testing.
- Changes of many abort() calls into Fatal() when the code is really just exiting on error, because abort() generates a dialog window on Windows, which is not great in automated scripts.
- Improvements to CMake to work better with the project in IDEs (VS).
|
| |
Adds the `--enable-gc` feature flag, so far enabling the `anyref` type incl. subtyping, and removes the temporary `--enable-anyref` feature flag that it replaces.
|
| |
Since #3050 the fuzzer can test out-of-tree builds using the `--binaryen-bin` argument, but the argument was not yet added to the generated `reduce.sh` on fuzzing failures. This change adds it.
|
| |
Adds `anyref` type, which is enabled by a new feature `--enable-anyref`. This type is primarily used for testing that passes correctly handle subtype relationships so that the codebase will continue to be prepared for future subtyping. Since `--enable-anyref` is meaningless without also using `--enable-reference-types`, this PR also makes it a validation error to pass only the former (and similarly makes it a validation error to enable exception handling without enabling reference types).
|
| |
Can now run scripts/fuzz_opt.py --binaryen-bin build/bin [opts...] to fuzz an
out-of-tree build.
Handle positional arguments by looking at shared.requested (with options removed)
instead of raw sys.argv.
|
| |
Comparing to the interpreter, and not just wasm2js to itself (which we've
done on the same file before and after opts), ensures wasm2js has the right
semantics.
To do this, we need to make sure the wasm doesn't contain things where
wasm2js semantics diverge from normal wasm, which means we:
* Legalize so that there are no i64 exports.
* Remove operations JS can't handle with full precision, like i64 -> f32.
* Force all loads/stores to be 1-byte, as unexpectedly-unaligned operations
fail in wasm2js.
This also requires ignoring subnormals when comparing between JS
VMs and the interpreter.
|
| |
| |
wasm2js fuzzing should not compare outputs if the wasm would
trap. wasm2js traps on far fewer things, and if wasm would trap
(like an indirect call with the wrong type) it can just do weird undefined
things. Previously, if running wasm2js trapped then we ignored
the output, but that's not good enough, as we need to check whether
the wasm itself would trap - exactly for the cases just mentioned where wasm
would trap but wasm2js wouldn't. So run the wasm interpreter
to see if that happens.
When we see such a trap, ignore everything from that function
call onwards. This at least lets us compare the results of
previous calls, which adds some amount of coverage (before
we just ignored the entire output completely, so only if there
was no trap at all did we do any comparisons at all).
Also give better names than "js.js" to the JS files wasm2js
fuzzing creates.
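A simplified sketch of the truncation idea, assuming the interpreter marks
traps with a recognizable string in its output; the real logic tracks
function calls rather than raw lines:

  TRAP_MARKER = '[trap'

  def comparable_outputs(interpreter_lines, wasm2js_lines):
      # Compare only the results logged before the first call that traps
      # in the wasm interpreter; wasm2js behavior past that point is
      # undefined.
      for i, line in enumerate(interpreter_lines):
          if TRAP_MARKER in line:
              return interpreter_lines[:i], wasm2js_lines[:i]
      return interpreter_lines, wasm2js_lines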
|
| |
compilers (#2961)
|
| |
Use WASM_RT_SETJMP so we use sigsetjmp when we need to.
Also disable signals in emcc+wasm2c in the fuzzer. emcc looks like
unix, so it enters the ifdef to use signals, but wasm has no signals...
|
| |
| |
This finds out which locals are live at call sites that might pause/resume,
which is the set of locals we actually need to save/load. That is, if a local
is not live at any call site in the function, then its value doesn't need to
stay alive while sleeping.
This saves about 10% of locals that are saved/loaded, and about 1.5%
in final code size.
|