summaryrefslogtreecommitdiff
path: root/scripts/fuzz_opt.py
Commit message (Collapse)AuthorAgeFilesLines
...
* Fuzzer: Add some docs and error handling" (#4887)Alon Zakai2022-08-091-4/+19
|
* Remove RTTs (#4848)Thomas Lively2022-08-051-8/+9
| | | | | | | RTTs were removed from the GC spec and if they are added back in in the future, they will be heap types rather than value types as in our implementation. Updating our implementation to have RTTs be heap types would have been more work than deleting them for questionable benefit since we don't know how long it will be before they are specced again.
* Grand Unified Flow Analysis (GUFA) (#4598)Alon Zakai2022-07-221-0/+2
| | | | | | | | | | | | | This tracks the possible contents in the entire program all at once using a single IR. That is in contrast to say DeadArgumentElimination of LocalRefining etc., all of whom look at one particular aspect of the program (function params and returns in DAE, locals in LocalRefining). The cost is to build up an entire new IR, which takes a lot of new code (mostly in the already-landed PossibleContents). Another cost is this new IR is very big and requires a lot of time and memory to process. The benefit is that this can find opportunities that are only obvious when looking at the entire program, and also it can track information that is more specialized than the normal type system in the IR - in particular, this can track an ExactType, which is the case where we know the value is of a particular type exactly and not a subtype.
* Fuzzer: Fix determinism fuzzing (#4767)Alon Zakai2022-07-071-6/+14
| | | | | | #4758 regressed the determinism fuzzer. First, FEATURE_OPTS is not a list but a string. Second, as a hack another location would add --strip-dwarf to that list, but that only works in wasm-opt. To fix that, remove the hack and instead just ignore DWARF-containing files.
* [Strings] Add string proposal types (#4755)Alon Zakai2022-06-291-1/+3
| | | | | | | | This starts to implement the Wasm Strings proposal https://github.com/WebAssembly/stringref/blob/main/proposals/stringref/Overview.md This just adds the types.
* Disallow --nominal with GC (#4758)Thomas Lively2022-06-281-12/+11
| | | | | | | | | | | Nominal types don't make much sense without GC, and in particular trying to emit them with typed function references but not GC enabled can result in invalid binaries because nominal types do not respect the type ordering constraints required by the typed function references proposal. Making this change was mostly straightforward, but required fixing the fuzzer to use --nominal only when GC is enabled and required exiting early from nominal-only optimizations when GC was not enabled. Fixes #4756.
* [Fuzzer] Pretty print of feature opts and passes (#4740)Max Graey2022-06-221-2/+2
|
* Disable Wasm2C in fuzz_opt for now (#4743)Max Graey2022-06-211-2/+4
| | | Relates to #4741
* Global Struct Inference pass: Infer two constants in struct.get (#4659)Alon Zakai2022-06-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | | This optimizes constants in the megamorphic case of two: when we know two function references are possible, we could in theory emit this: (select (ref.func A) (ref.func B) (ref.eq (..ref value..) ;; globally, only 2 things are possible here, and one has ;; ref.func A as its value, and the other ref.func B (ref.func A)) That is, compare to one of the values, and emit the two possible values there. Other optimizations can then turn a call_ref on this select into an if over two direct calls, leading to devirtualization. We cannot compare a ref.func directly (since function references are not comparable), and so instead we look at immutable global structs. If we find a struct type that has only two possible values in some field, and the structs are in immutable globals (which happens in the vtable case in j2wasm for example), then we can compare the references of the struct to decide between the two values in the field.
* [Fuzzer] Ignore relaxed-simd test for initial contents (#4683)Alon Zakai2022-05-241-0/+8
| | | | If we use it as initial contents, we will try to execute it, and hit the TODOs in the interpreter for unimplemented parts.
* Fuzzer: Show a clear error if a given wasm fails to run (#4672)Alon Zakai2022-05-171-1/+5
| | | | | | Without this the error from a bad given wasm file - either by the user, or during a reduction where smaller wasms are given - could be very confusing. A bad given wasm file during reduction, in particular, indicates a bug in the reducer most likely.
* [Fuzzer] Add typesystem flag to reduction script (#4651)Alon Zakai2022-05-061-2/+3
| | | | | Without this the reduction will fail on not being able to parse the input file, if the input file depends on nominal typing.
* [NominalFuzzing] Add typesystem flag to wasm-dis (#4645)Alon Zakai2022-05-051-2/+2
| | | | | wasm-dis does enable all features by default, so we don't need the feature flags, but we do need --nominal etc. since we emit such modules now.
* [NominalFuzzing] Use feature flags in a missing place (#4628)Alon Zakai2022-05-021-1/+1
| | | | | | | Instead of a raw run command, use the helper function, which adds the feature flags. That adds --nominal which is needed more now after #4625 This fixes the fuzz failures mentioned in #4625 (comment)
* [NominalFuzzing] Switch to nominal fuzzing by default (#4610)Alon Zakai2022-04-221-3/+13
| | | Add a flag to make it easy to pick which typesystem to test.
* [Wasm GC] Signature Pruning (#4545)Alon Zakai2022-03-251-0/+1
| | | | | | | | | | | | | This adds a new signature-pruning pass that prunes parameters from signature types where those parameters are never used in any function that has that type. This is similar to DeadArgumentElimination but works on a set of functions, and it can handle indirect calls. Also move a little code from SignatureRefining into a shared place to avoid duplication of logic to update signature types. This pattern happens in j2wasm code, for example if all method functions for some virtual method just return a constant and do not use the this pointer.
* [EH] Enable fuzzer with initial contents (#4409)Heejin Ahn2022-01-041-7/+6
| | | | | | | | | This enables fuzzing EH with initial contents. fuzzing.cpp/h does not yet support generation of EH instructions, but with this we can still fuzz EH based on initial contents. The fuzzer ran successfully for more than 1,900,000 iterations, with my local modification that always enables EH and lets the fuzzer select only EH tests for its initial contents.
* Handle removed/renamed initial contents in fuzzer (#4335)Alon Zakai2021-11-161-0/+6
|
* Fuzzer: Add --auto-initial-contents automatically to reduction script (#4196)Alon Zakai2021-09-301-1/+5
|
* [Fuzzer] Don't prompt user when seed is given (#4185)Heejin Ahn2021-09-241-8/+16
| | | | | | | | | Fuzzer shows the initial contents and prompts the user if they want to proceed after #4173, but when the fuzzer is used within a script called from `wasm-reduce`, we shouldn't pause for the user input. This shows the prompt only when there is no seed given. To do that, we now initialize the important initial contents from `main`. We used to assign those variables before we start `main`.
* [Wasm GC] Implement static (rtt-free) StructNew, ArrayNew, ArrayInit (#4172)Alon Zakai2021-09-231-0/+1
| | | | | | | | | See #4149 This modifies the test added in #4163 which used static casts on dynamically-created structs and arrays. That was technically not valid (as we won't want users to "mix" the two forms). This makes that test 100% static, which both fixes the test and gives test coverage to the new instructions added here.
* Select initial contents automatically in fuzzer (#4173)Heejin Ahn2021-09-221-17/+82
| | | | | | | | | This is another attempt to address #4073. Instead of relying on the timestamp, this examines git log to gather the list of test files added or modified within some fixed number of days. The number of days is currently set to 30 (= 1 month) but can be changed. This will be enabled by `--auto-initial-contents`, which is now disabled by default. Hopefully fixes #4073.
* [Wasm GC] Fix invalid intermediate IR in OptimizeInstructions (#4169)Alon Zakai2021-09-201-0/+2
| | | | | | | | | | | | We added an optional ReFinalize in OptimizeInstructions at some point, but that is not valid: The ReFinalize only updates types when all other works is done, but the pass works incrementally. The bug the fuzzer found is that a child is changed to be unreachable, and then the parent is optimized before finalize() is called on it, which led to an assertion being hit (as the child was unreachable but not the parent, which should also be). To fix this, do not change types in this pass. Emit an extra block with a declared type when necessary. Other passes can remove the extra block.
* Partial inlining via function splitting (#4152)Alon Zakai2021-09-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This PR helps with functions like this: function foo(x) { if (x) { .. lots of work here .. } } If "lots of work" is large enough, then we won't inline such a function. However, we may end up calling into the function only to get a false on that if and immediately exit. So it is useful to partially inline this function, basically by creating a split of it into a condition part that is inlineable function foo$inlineable(x) { if (x) { foo$outlined(); } } and an outlined part that is not inlineable: function foo$outlined(x) { .. lots of work here .. } We can then inline the inlineable part. That means that a call like foo(param); turns into if (param) { foo$outlined(); } In other words, we end up replacing a call and then a check with a check and then a call. Any time that the condition is false, this will be a speedup. The cost here is increased size, as we duplicate the condition into the callsites. For that reason, only do this when heavily optimizing for size. This is a 10% speedup on j2cl. This helps two types of functions there: Java class inits, which often look like "have I been initialized before? if not, do all this work", and also assertion methods which look like "if the input is null, throw an exception".
* [Wasm GC] Fix OptimizeInstructions on unreachable ref.test (#4156)Alon Zakai2021-09-151-0/+3
| | | | | | Avoids a crash in calling getHeapType when there isn't one. Also add the relevant lit test (and a few others) to the list of files to fuzz more heavily.
* RemoveUnusedBrs::tablify() improvements: handle EqZ and tee (#4144)Alon Zakai2021-09-131-0/+1
| | | | | | | | | | | | tablify() attempts to turns a sequence of br_ifs into a single br_table. This PR adds some flexibility to the specific pattern it looks for, specifically: * Accept i32.eqz as a comparison to zero, and not just to look for i32.eq against a constant. * Allow the first condition to be a tee. If it is, compare later conditions to local.get of that local. This will allow more br_tables to be emitted in j2cl output.
* Optimize away dominated calls to functions that run only once (#4111)Alon Zakai2021-09-031-0/+6
| | | | | | | | | | | | | | | | | | | | | | | Some functions run only once with this pattern: function foo() { if (foo$ran) return; foo$ran = 1; ... } If that global is not ever set to 0, then the function's payload (after the initial if and return) will never execute more than once. That means we can optimize away dominated calls: foo(); foo(); // we can remove this To do this, we find which globals are "once", which means they can fit in that pattern, as they are never set to 0. If a function looks like the above pattern, and it's global is "once", then the function is "once" as well, and we can perform this optimization. This removes over 8% of static calls in j2cl.
* [fuzz-opt] Better error output for non-deterministic check (#4102)Max Graey2021-08-261-6/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | When we catches "Output must be deterministic" we can't see any details. This PR fix this and now we can see diff of b1.wasm and b2.wasm files. Example output: Output must be deterministic. Diff: --- expected +++ actual @@ -2072,9 +2072,7 @@ ) (drop (block $label$16 (result funcref) - (local.set $10 - (ref.null func) - ) + (nop) (drop (call $22 (f64.const 0.296)
* LocalCSE rewrite (#4079)Alon Zakai2021-08-171-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Technically this is not a new pass, but it is a rewrite almost from scratch. Local Common Subexpression Elimination looks for repeated patterns, stuff like this: x = (a + b) + c y = a + b => temp = a + b x = temp + c y = temp The old pass worked on flat IR, which is inefficient, and was overly complicated because of that. The new pass uses a new algorithm that I think is pretty simple, see the detailed comment at the top. This keeps the pass enabled only in -O4, like before - right after flattening the IR. That is to make this as minimal a change as possible. Followups will enable the pass in the main pipeline, that is, we will finally be able to run it by default. (Note that to make the pass work well after flatten, an extra simplify-locals is added - the old pass used to do part of simplify-locals internally, which was one source of complexity. Even so, some of the -O4 tests have changes, due to minor factors - they are just minor orderings etc., which can be seen by inspecting the outputs before and after using e.g. --metrics) This plus some followup work leads to large wins on wasm GC output. On j2cl there is a common pattern of repeated struct.gets, so common that this pass removes 85% of all struct.gets, which makes the total binary 15% smaller. However, on LLVM-emitted code the benefit is minor, less than 1%.
* Disable RoundtripText fuzzer until #3989 is fixed (#4058)Alon Zakai2021-08-051-1/+2
|
* Fuzzer: Fix handling of features with initial contents (#3981)Alon Zakai2021-07-151-1/+16
| | | | | | | Now that the features section adds on top of the commandline arguments, it means the way we test if initial contents are ok to use will not work if the wasm has a features section - as it will enable a feature, even if we wanted to see if the wasm can work without that feature. To fix this, strip the features section there.
* Fuzzer: Use NameTypes in RoundTrip (#3986)Alon Zakai2021-07-141-1/+5
| | | | | | This works around the issue with wasm gc types sometimes getting truncated (as the default names can be very long or even infinitely recursive). If the truncation leads to name collision, the wast is not valid.
* Fix features section handling in the fuzzer (#3980)Alon Zakai2021-07-131-6/+26
| | | | | | | | | | The features section is additive since #3960. For the fuzzer to know which features are used, it therefore needs to also scan the features section. To do this, run --print-features to get the total features used from both flags + the features section. A result of this is that we now have a list of enabled features instead of "enable all, then disable". This is actually clearer I think, but it does require inverting the logic in some places.
* Fix the fuzzer on wasm2c2wasm: emcc must request d8 support now (#3975)Alon Zakai2021-07-091-0/+1
| | | | | | Emscripten stopped emitting shell support code by default (as most users run node.js, but here we are literally fuzzing d8). Fixes #3967
* Properly handle comparisons of 64-bit ints in wasm2c in the fuzzer. (#3918)Alon Zakai2021-06-081-0/+16
| | | | | | | The support code there emits "low high" as the result, for example, 25 0 would be 25 (as the high bits are all 0). This is different than how numbers are reported in other things we fuzz, so this caused an error. Fixes #3915
* Heap2Local: Use escape analysis to turn heap allocations into local data (#3866)Alon Zakai2021-05-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we allocate some GC data, and do not let the reference escape, then we can replace the allocation with locals, one local for each field in the allocation basically. This avoids the allocation, and also allows us to optimize the locals further. On the Dart DeltaBlue benchmark, this is a 24% speedup (making it faster than the JS version, incidentially), and also a 6% reduction in code size. The tests are not the best way to show what this does, as the pass assumes other passes will clean up after. Here is an example to clarify. First, in pseudocode: ref = new Int(42) do { ref.set(ref.get() + 1) } while (import(ref.get()) That is, we allocate an int on the heap and use it as a counter. Unnecessarily, as it could be a normal int on the stack. Wat: (module ;; A boxed integer: an entire struct just to hold an int. (type $boxed-int (struct (field (mut i32)))) (import "env" "import" (func $import (param i32) (result i32))) (func "example" (local $ref (ref null $boxed-int)) ;; Allocate a boxed integer of 42 and save the reference to it. (local.set $ref (struct.new_with_rtt $boxed-int (i32.const 42) (rtt.canon $boxed-int) ) ) ;; Increment the integer in a loop, looking for some condition. (loop $loop (struct.set $boxed-int 0 (local.get $ref) (i32.add (struct.get $boxed-int 0 (local.get $ref) ) (i32.const 1) ) ) (br_if $loop (call $import (struct.get $boxed-int 0 (local.get $ref) ) ) ) ) ) ) Before this pass, the optimizer could do essentially nothing with this. Even with this pass, running -O1 has no effect, as the pass is only used in -O2+. However, running --heap2local -O1 leads to this: (func $0 (local $0 i32) (local.set $0 (i32.const 42) ) (loop $loop (br_if $loop (call $import (local.tee $0 (i32.add (local.get $0) (i32.const 1) ) ) ) ) ) ) All the GC heap operations have been removed, and we just have a plain int now, allowing a bunch of other opts to run. That output is basically the optimal code, I think.
* Fuzz with maximal inlining some of the time (#3871)Alon Zakai2021-05-101-0/+8
|
* Fuzzer: Ignore things we should ignore even if the process succeeded (#3864)Alon Zakai2021-05-061-16/+22
| | | | | | | | | | | | | We only ignored known issues if the process failed. However, some things do not bring the process down, but are still necessary to ignore, which this fixes. Another approach might be to make all the things we need to ignore fail the entire process. However, that could be annoying for other debugging: we don't want a host limit on say hitting a VM limit on recursion to bring down the entire process, as those limits manifest as traps, and we can still run after them (and do need to test that). The specific host limit that made me fix this was the trap on OOM when trying to allocate an array of size 4GB.
* Fuzzer: Do not fuzz multivalue testcases in initial contents (#3809)Alon Zakai2021-04-151-0/+21
| | | | | | | | | | There is a conflict between multivalue and GC, see the details in the comment. There isn't a good way to get the fuzzer to avoid the combination of them, and GC is more urgent, so disable multivalue in that area for now. (This does not disable all multivalue fuzzing - the fuzzer can still emit stuff. This just disables initial content from test suite having multivalue, which is enough for now, until the fuzzer can emit more GC things, and then we'll need to do more.)
* Fuzzer: Distinguish traps from host limitations (#3801)Alon Zakai2021-04-121-1/+7
| | | | | | | | | Host limitations are arbitrary and can be modified by optimizations, so ignore them. For example, if the optimizer removes allocations then a host limit on an allocation error may vanish. Or, an optimization that removes recursion and replaces it with a loop may avoid a host limit on call depth (that is not done currently, but might some day). This removes a class of annoying false positives in the fuzzer.
* [Wasm GC] Enable more GC fuzzing (#3788)Alon Zakai2021-04-081-6/+7
| | | | | | The fuzzer doesn't generate much GC code yet, but it does fuzz things in the test suite and adds fuzz to them. This PR allows GC when using initial content, and also in CompareVMs, both of which have been fuzzed for days locally for me with no issues.
* Fuzz --converge (#3789)Alon Zakai2021-04-081-0/+3
|
* Fuzzer: Ignore our current bug with the type of br_if (#3775)Alon Zakai2021-04-061-1/+6
| | | | | | We give br_if a too specific type: #3767 This is only noticeable with GC, and in rare cases where the type of br_if is actually used - which realistically it never is, so really just fuzzer testcases.
* Disallow flatten + GC in the fuzzer due to RTTs (#3768)Alon Zakai2021-04-011-0/+3
| | | RTTs are not defaultable, and we cannot spill them to locals.
* Avoid flatten + multivalue + reference types in the fuzzer (#3760)Alon Zakai2021-03-311-0/+3
| | | | | | | | | The problem is that a tuple with a non-nullable element cannot be stored to a local. We'd need to split up the tuple, but that raises questions about what should be allowed in flat IR (we'd need to allow nested tuple ops in more places). That combination doesn't seem urgent, so add a clear error for now, and avoid it in the fuzzer. Avoids #3759 in the fuzzer
* Stop emitting features section in fuzzer (#3652)Alon Zakai2021-03-041-1/+1
|
* Make the reduction script more robust, and document text reduction (#3640)Alon Zakai2021-03-031-5/+17
| | | | | The check for a valid wasm file must be different if the wasm has a feature section or not, so just try both ways, with --detect-features and --all-features. If the wasm is valid, at least one will work.
* Introduce a script for updating lit tests (#3503)Thomas Lively2021-01-211-2/+3
| | | | And demonstrate its capabilities by porting all tests of the optimize-instructions pass to use lit and FileCheck.
* [GC] Add basic RTT support (#3432)Alon Zakai2020-12-081-0/+2
| | | | | | | | | | | | | | | | This adds rtt.canon and rtt.sub together with RTT type support that is necessary for them. Together this lets us test roundtripping the instructions and types. Also fixes a missing traversal over globals in collectHeapTypes, which the example from the GC docs requires, as the RTTs are in globals there. This does not yet add full interpreter support and other things. It disables initial contents on GC in the fuzzer, to avoid the fuzzer breaking. Renames the binary ID for exnref, which is being removed from the spec, and which overlaps with the binary ID for rtt.
* [Fuzzer] Use liftoff when running d8, to avoid nondeterminism (#3402)Alon Zakai2020-12-011-5/+18
| | | | | | | When running d8, run it in liftoff, to avoid tiering up causing nondeterminism in the results. When we do want to compare the tiers, we already do so in CompareVMs. This fixes others places where we just wanted to run some JS in some VM.