summaryrefslogtreecommitdiff
path: root/src/tools
Commit message (Collapse)AuthorAgeFilesLines
* wasm2js: Bulk memory support (#2923)Alon Zakai2020-06-221-1/+1
| | | | | | | | | | | | | | Adds a special helper functions for data.drop etc., as unlike most wasm instructions these are too big to emit inline. Track passive segments at runtime in var memorySegments whose indexes are the segment indexes. Emit var bufferView even if the memory exists even without memory segments, as we do still need the view in order to operate on it. Also adds a few constants for atomics that will be useful in future PRs (as this PR updates the constant lists anyhow).
* wasm2js testing improvements before bulk-memory (#2918)Alon Zakai2020-06-201-5/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Necessary preparations for a later PR that adds bulk memory support to wasm2js (splitting this out so the next PR is less big): * Fix logging of print functions in wasm2js spec tests, there was an extra space in the output (which console.log adds automatically between items). * Don't assume the output is always empty, as some tests (like bulk memory) do have expected output. * Rename test/bulk-memory.wast as it "conflicts" with test/spec/bulk-memory.wast - the problem is that we scan both places, and emit files in test/wasm2js/*, so those would collide if the names overlap. * Extend the parsing and emitting of JS for (assert.. ) lines such as (assert_return (invoke "foo" (i32.const 1)) (i32.const 2)) to also handle (invoke "foo" (i32.const 1)) (i32.const 2)) Without this, we skip (invoke ..) lines in spec tests, which normally is fine, but in bulk memory at least they have side effects we need - we must run them before the later assertions.
* Support new clang mangling of main (__main_argc_argv) (#2671)Sam Clegg2020-06-101-0/+3
| | | | | | | | | The plan is that for standlone mode we can function just like wasi-sdk and call the correct main from crt1.c. For non-standalone mode we still want to export can call main directly so we rename __main_argc_argv back to main as part of finalize. See https://reviews.llvm.org/D70700
* Rename anyref to externref to match proposal change (#2900)Jay Phelps2020-06-103-21/+22
| | | | | | | anyref future semantics were changed to only represent opaque host values, and thus renamed to externref. [Chromium](https://bugs.chromium.org/p/v8/issues/detail?id=7748#c360) was just updated to today (not yet released). I couldn't find a Mozilla bugzilla ticket mentioning externref so I don't immediately know if they've updated yet. https://github.com/WebAssembly/reference-types/pull/87
* Reland "Link binaryen tools against the dylib" (#2892)Derek Schuff2020-06-031-1/+0
| | | | | Reland of #2864 Also ensure a relative install rpath by adding setup to each tool config. The CMake code is cribbed from LLVM's implementation.
* Revert "Link binaryen tools against the dylib (#2864)" (#2891)Derek Schuff2020-06-021-0/+1
| | | This reverts commit f6b7f0018ca5ce604e94cc6cf50ee712bb7e9b27.
* Link binaryen tools against the dylib (#2864)Derek Schuff2020-06-021-1/+0
| | | When building the libbinaryen dynamic library, also link the binaryen tools against it. This reduces combined tool size on mac from 76M to 2.8M
* DeNaN pass (#2877)Alon Zakai2020-05-272-111/+55
| | | | | | This moves the fuzzer de-NaN logic out into a separate pass. This is cleaner and also better since the old way would de-NaN once, but then the reducer could generate code with nans. The new way lets us de-NaN while reducing.
* Remove stackSave/stackAlloc/stackRestore code generation (#2852)Sam Clegg2020-05-201-1/+0
| | | | | | | These are now implemented in assembly as part of emscripten's compiler-rt. See: https://github.com/emscripten-core/emscripten/pull/11166
* Add interpreter support for EH (#2780)Heejin Ahn2020-05-062-0/+9
| | | | | | | | | This adds interpreter support for EH instructions. This adds `ExceptionPackage` struct, which contains info of a thrown exception (an event tag and thrown values), and the union in `Literal` can take a `unique_ptr` to `ExceptionPackage`. We need a destructor, a copy constructor, and an assignment operator for `Literal`, because the union in `Literal` now has a member that cannot be trivially copied or deleted.
* Fix wasm2c loop (#2816)Alon Zakai2020-04-281-5/+6
| | | | | | The refactoring of the loop in #2812 was wrong - we need to loop over all the exports and ignore the non-function ones. Rewrote it to stress that part.
* Use --detect-features in wasm-reduce. Fixes #2813 (#2815)Alon Zakai2020-04-281-6/+2
|
* Emcc fuzzing followups (#2812)Alon Zakai2020-04-271-4/+3
| | | | | | Avoid pass-debug when fuzzing emcc, as it can be slow and isn't what we care about. Clean up a loop.
* Fuzz frequency tuning (#2806)Alon Zakai2020-04-271-97/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We had some ad-hoc tuning of which nodes to emit more frequently in the fuzzer, but it wasn't very good. Things like loads and stores for example were far too rare. Also it wasn't easy to adjust the frequencies. This adds a simple way to adjust them, by passing a size_t which is the "weight" of that node. Then it just makes that number of copies of it, making it more likely to be picked. Example output comparison: node before after ================================ binary 281 365 block 898 649 break 278 144 call 182 290 call_indirect 9 42 const 808 854 drop 43 92 global.get 440 398 global.set 223 171 if 335 254 load 22 84 local.get 429 301 local.set 434 211 loop 176 99 nop 117 54 return 264 197 select 8 33 store 1 39 unary 405 304 unreachable 1 2 Lots of noise here obviously, but there are large increases for loads and stores compared to before. Also add a testcase of random data of the typical size the fuzzer runs, and print metrics on it. This might help us get a feel for how future tuning changes affect frequencies.
* Remove --fuzz-binary and simplify round trip (#2799)Thomas Lively2020-04-241-34/+5
| | | Since the --roundtrip pass is more general than --fuzz-binary anyways. Also reimplements `ModuleUtils::clearModule` to use the module destructor and placement new to ensure that no members are missed.
* Wasm2c2Wasm Fuzzer: wasm2c + emcc (#2791)Alon Zakai2020-04-241-15/+30
| | | | | | | | | This adds a variant on wasm2c that uses emcc instead of a native compiler. This helps us fuzz emcc. To make that practical, rewrite the setjmp glue to only use one setjmp. The wasm backend ends up doing linear work per setjmp, so it's quadratic with many setjmps. Instead, do a big switch-loop construct around a single setjmp.
* Emit text in pass reduction when in text mode (#2792)Alon Zakai2020-04-221-0/+3
| | | | | | | Without this we emitted a binary, which confused the size comparisons. (When reducing a smaller size is usually a good sign. And also it provides a deterministic way to know when to stop - we can't infinite loop if we keep going while the size shrinks.)
* [fuzzing] wasm2c integration (#2772)Alon Zakai2020-04-223-2/+202
| | | | | | | | | | | | | | | | | | | | | | | | | This adds support for fuzzing with wabt's wasm2c that @binji wrote. Basically we compile the wasm to C, then compile the C to a native executable with a custom main() to wrap around it. The executable should then print exactly the same as that wasm when run in either the binaryen interpreter or in a JS VM with our wrapper JS for that wasm. In other words, compiling the wasm to C is another way to run that wasm. The main reasons I want this are to fuzz wasm2c itself, and to have another option for fuzzing emcc. For the latter, we do fuzz wasm-opt quite a lot, but that doesn't fuzz the non-wasm-opt parts of emcc. And using wasm2c for that is nice since the starting point is always a wasm file, which means we can use tools like wasm-reduce and so forth, which can be integrated with this fuzzer. This also: Refactors the fuzzer harness a little to make it easier to add more "VMs" to run wasms in. Do not autoreduce when re-running a testcase, which I hit while developing this.
* Fix issues with Types and Features (#2773)Thomas Lively2020-04-161-1/+3
| | | | | | | | | 1. Only emit exnref as part of a subtype if exception-handling is enabled in the fuzzer. 2. Correctly report that funcref and nullref require reference-types to be enabled. 3. Re-enable multivalue as a normal feature in the fuzzer. Possibly fixes #2770.
* Fix OOB fuzzing (#2769)Alon Zakai2020-04-161-9/+15
| | | | | | | | | We should only do weird changes to the fuzz code if we allow out of bounds operations, because the OOB checks are generated as we build the IR, and changing them can remove the checks. (we fuzz 50% of the time with and 50% without OOBs, so this doesn't really hurt us)
* Emit tuples in the fuzzer (#2695)Thomas Lively2020-04-151-108/+170
| | | | | | | | Emit tuple.make, tuple.extract, and multivalue control flow, and tuple locals and globals when multivalue is enabled. Also slightly refactors the top-level `makeConcrete` function to be more selective about what it tries to make based on the requested type to reduce the number of trivial nodes created because the requested type is incompatible with the requested node.
* Enable cross-VM fuzzing + related improvements to fuzz_opt.py (#2762)Alon Zakai2020-04-151-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main benefit here is comparing VMs, instead of just comparing each VM to itself after opts. Comparing VMs is a little tricky since there is room for nondeterminism with how results are printed and other annoying things, which is why that didn't work well earlier. With this PR I can run 10's of thousands of iterations without finding any issues between v8 and the binaryen interpreter. That's after fixing the various issues over the last few days as found by this: #2760 #2757 #2750 #2752 Aside from that main benefit I ended up adding more improvements to make it practical to do all that testing: Randomize global fuzz settings like whether we allow NaNs and out-of-bounds memory accesses. (This was necessary here since we have to disable cross-VM comparisons if NaNs are enabled.) Better logging of statistics like how many times each handler was run. Remove redundant FuzzExecImmediately handler (looks like after past refactorings it was no longer adding any value). Deterministic testcase handling: if you run e.g. fuzz_opt.py 42 it will run one testcase and exactly the same one. If you run without an argument it will run forever until it fails, and if it fails, it prints out that ID so that you can easily reproduce it (I guess, on the same binaryen + same python, not sure how python's deterministic RNG changes between versions and builds). Upgrade to Python 3.
* Add --deterministic flag to wasm2js, for fuzzing (#2757)Alon Zakai2020-04-131-4/+29
| | | | | | | | | | | | | | | | | | | | | | In wasm2js we ignore things that trap in wasm that we can't really handle, like a load from memory out of bounds would trap in wasm, but in JS we don't want to emit a bounds check on each load. So wasm2js focuses on programs that don't trap. However, this is annoying in the fuzzer as it turns out that our behavior for places where wasm would trap was not deterministic. That is, wasm would trap, wasm2js would not trap and do behavior X, and wasm2js with optimizations would also not trap but do behavior Y != X. This produced false positives in the fuzzer (and might be annoying in manual debugging too). As a workaround, this adds a --deterministic flag to wasm2js, which tries to be deterministic about what it does for cases where wasm would trap. This handles the case of an int division by 0 which traps in wasm but without this flag could have different behavior in wasm2js with or without opts (see details in the patch).
* Remove duplicate Type:: prefixes (NFC) (#2753)Heejin Ahn2020-04-121-6/+6
| | | | | | | | | These seem to be accidentally introduced in when we enforced use of `Type::` on type names in #2434. By the way TIL this actually compiles, and don't know why: ``` Type::Type::Type::Type::Type::Type::Type::Type::none ```
* Fix multivalue event fuzzing (#2748)Thomas Lively2020-04-101-2/+0
| | | | | The fuzzer was previously unconditionally emitting one event parameter more than it was supposed to, which meant multivalue events were emitted when multivalue was not enabled.
* JS/Wasm BigInt support for wasm-emscripten-finalize (#2726)Alon Zakai2020-04-071-2/+10
| | | | | If wasm-emscripten-finalize is given the BigInt flag, then we will be using BigInts on the JS side, and need no legalization at all since i64s will just be BigInts.
* Do not emit multivalue events in fuzzer (#2723)Thomas Lively2020-04-031-2/+3
| | | | | | Unless the multivalue feature is enabled. The validation for events recently changed to disallow events returning multiple items unless the multivalue feature is enabled, but the fuzzer was not updated accordingly. This PR fixes the glitch.
* Tuple globals (#2718)Thomas Lively2020-04-021-10/+10
| | | | | | | Since it wasn't easy to support tuples in Asyncify's call support using temporary functions, we decided to allow tuple-typed globals after all. This PR adds support for parsing, printing, lowering, and interpreting tuple globals and also adds validation ensuring that imported and exported globals do not have tuple types.
* Avoid fp$ access in MAIN_MODULES (#2704)Alon Zakai2020-03-271-6/+7
| | | | | | | | | | | | | | | | Depends on emscripten-core/emscripten#10741 which ensures that table indexes are unique. With that guarantee, a main module can just add its function pointers into the table, and use them based on that index. The loader will then see them in the table and then give other modules the identical function pointer for a function, ensuring function pointer equality. This avoids calling fp$ functions during startup for the main module's own functions (which are slow). We do still call fp$s of things we import from outside, as we don't have anything to put in the table for them, we depend on the loader for that. I suspect this can also be done with SIDE_MODULES, but did not want to try too much at once.
* makeConstExpression => makeConstantExpression (#2698)Alon Zakai2020-03-171-1/+1
| | | | | | The meaning we intend is "constant", and not the "Const" node (which contains a number). So I think the full name is less confusing.
* Handle tuples in wasm-reduce (#2689)Thomas Lively2020-03-161-4/+9
| | | | Also increases the usefulness of a couple wasm-builder methods that are useful here.
* Update Precompute to handle tuples (#2687)Thomas Lively2020-03-102-3/+3
| | | | | | This involves replacing `Literal::makeZero` with `Literal::makeZeroes` and `Literal::makeSingleZero` and updating `isConstantExpression` to handle constant tuples as well. Also makes `Literals` its own struct and adds convenience methods on it.
* Handle multivalue returns in the interpreter (#2684)Thomas Lively2020-03-103-38/+35
| | | | Updates the interpreter to properly flow vectors of values, including at function boundaries. Adds a small spec test for multivalue return.
* Add multivalue feature (#2668)Thomas Lively2020-02-271-0/+1
|
* Concise error output (#2652)Alon Zakai2020-02-182-12/+18
| | | | | | | | | | | Don't print the entire module on an error. Instead, just print the validation errors. However, if the user passed --print, then do print it, as otherwise nothing would get printed - the error would be before the pass to print happens. And in general a user passing in a request to print would expect a printed module anyhow. fixes #2634
* Avoid an infinite loop in `-ttf` generation (#2637)Alex Crichton2020-02-041-1/+1
| | | | | | | | | | | | | | | We've been throwing some fuzzing at some wasm implementations recently and one of our strategies is to use `wasm-opt -ttf` with the fuzzer's input to generate a wasm module. Some of our tests, though, have been failing due to out-of-memory while `wasm-opt` is generating a module. This loop appears to be infinitely executing since the input just-so-happens that `oneIn(3)` returns true for every byte of the input file, or at least enough such that when the xor factor is merged in it at least generates very long sequence of `true`. It looks like elsewhere in the file when `while (oneIn(N))` is used it's also guarded by `!finishedInput`, so I've added a similar guard here as well.
* Trap when call_indirect's signatures mismatch (#2636)Heejin Ahn2020-02-031-0/+5
| | | | | | | | | | | This makes the interpreter trap when the signature in `call_indirect` instruction and that of the actual function in the table mismatch. This also makes the `wasm-ctor-eval` not evaluate `call_indirect` in case the signatures mismatch. Before we only compared the arguments' signature and the function signature, which was sufficient before we had subtypes, but now the signature in `call_indirect` and that of the actual function can be different even if the argument's signature is OK.
* Optimize passive segments in memory-packing (#2426)Thomas Lively2020-01-151-9/+0
| | | | | | | | | When memory is packed and there are passive segments, bulk memory operations that reference those segments by index need to be updated to reflect the new indices and possibly split into multiple instructions that reference multiple split segments. For some bulk-memory operations, it is necessary to introduce new globals to explicitly track the drop state of the original segments, but this PR is careful to only add globals where necessary.
* wasm2js: Do not convert x >>> 0 | 0 to x >>> 0 (#2581)Alon Zakai2020-01-101-2/+10
| | | | | | | | | | | | isBinary was used where we should only accept a signed binary, as removing the | 0 from an unsigned value may be incorrect. This does regress a few small things (as can be seen in the diff). If it's important we can add more sophisticated optimizations here, perhaps like an assumption that the signedness of a local never matters. Fixes emscripten-core/emscripten#10173
* Remove implicit conversion operators from Type (#2577)Thomas Lively2020-01-084-20/+20
| | | | | | | | | | * Remove implicit conversion operators from Type Now types must be explicitly converted to uint32_t with Type::getID or to ValueType with Type::getVT. This fixes #2572 for switches that use Type::getVT. * getVT => getSingle
* [NFC] Enforce use of `Type::` on type names (#2434)Thomas Lively2020-01-076-397/+404
|
* [NFC] Clean up unnecessary `template`s in calls 🧹🧹🧹 (#2394)Thomas Lively2020-01-071-3/+3
|
* Use FeatureSet instead of FeatureSet::Feature(NFC) (#2562)Heejin Ahn2020-01-021-22/+20
| | | | | This uses `FeatureSet` in place of `FeatureSet::Feature` when possible, making it possible for functions take a set of multiple features as one argument.
* Add support for reference types proposal (#2451)Heejin Ahn2019-12-305-84/+219
| | | | | | | | | | | | This adds support for the reference type proposal. This includes support for all reference types (`anyref`, `funcref`(=`anyfunc`), and `nullref`) and four new instructions: `ref.null`, `ref.is_null`, `ref.func`, and new typed `select`. This also adds subtype relationship support between reference types. This does not include table instructions yet. This also does not include wasm2js support. Fixes #2444 and fixes #2447.
* Strip DWARF in finalize, to avoid keeping it around til later unnecessarily ↵Alon Zakai2019-12-201-0/+8
| | | | | | | | (#2544) Without this, the first wasm-opt invocation will remove it. But it can be very large, and we will soon start to automatically do updating on it when it exists, so avoid the work if we aren't actually building a final output with dwarf.
* Binary format code section offset tracking (#2515)Alon Zakai2019-12-193-1/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Optionally track the binary format code section offsets, that is, when loading a binary, remember where each IR node was read from. This is necessary for DWARF debug info, as these are the offsets DWARF refers to. (Note that eventually we may want to do something else, like first read the DWARF and only then add debug info annotations into the IR in a more LLVM-like manner, but this is more straightforward and should be enough to update debug lines and ranges). This tracking adds noticeable overhead - every single IR node adds an entry in a map - so avoid it unless actually necessary. Specifically, if the user passes in -g and there are actually DWARF sections in the binary, and we are not about to remove those sections, then we need it. Print binary format code section offsets in text, when printing with -g. This will help debug and test dwarf support. It looks like ;; code offset: 0x7 as an annotation right before each node. Also add support for -g in wasm-opt tests (unlike a pass, it has just one - as a prefix). Helps #2400
* Support stack overflow checks in standalone mode (#2525)Alon Zakai2019-12-121-0/+2
| | | | | | | | | In normal mode we call a JS import, but we can't import from JS in standalone mode. Instead, just trap in that case with an unreachable. (The error reporting is not as good in this case, but at least it catches all errors and halts, and the emitted wasm is valid for standalone mode.) Helps emscripten-core/emscripten#10019
* Make local.tee's type its local's type (#2511)Heejin Ahn2019-12-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to the current spec, `local.tee`'s return type should be the same as its local's type. (Discussions on whether we should change this rule is going on in WebAssembly/reference-types#55, but here I will assume this spec does not change. If this changes, we should change many parts of Binaryen transformation anyway...) But currently in Binaryen `local.tee`'s type is computed from its value's type. This didn't make any difference in the MVP, but after we have subtype relationship in #2451, this can become a problem. For example: ``` (func $test (result funcref) (local $0 anyref) (local.tee $0 (ref.func $test) ) ) ``` This shouldn't validate in the spec, but this will pass Binaryen validation with the current `local.tee` implementation. This makes `local.tee`'s type computed from the local's type, and makes `LocalSet::makeTee` get a type parameter, to which we should pass the its corresponding local's type. We don't embed the local type in the class `LocalSet` because it may increase memory size. This also fixes the type of `local.get` to be the local type where `local.get` and `local.set` pair is created from `local.tee`.
* Remove FunctionType (#2510)Thomas Lively2019-12-116-47/+39
| | | | | | | | | | | | | | | | | Function signatures were previously redundantly stored on Function objects as well as on FunctionType objects. These two signature representations had to always be kept in sync, which was error-prone and needlessly complex. This PR takes advantage of the new ability of Type to represent multiple value types by consolidating function signatures as a pair of Types (params and results) stored on the Function object. Since there are no longer module-global named function types, significant changes had to be made to the printing and emitting of function types, as well as their parsing and manipulation in various passes. The C and JS APIs and their tests also had to be updated to remove named function types.
* Use wat over wast for text format filenames (#2518)Sam Clegg2019-12-083-5/+5
|