path: root/src/tools
Commit message [Author, Age, Files, Lines]
* Make validation of stale types stricter (#7097) [Thomas Lively, 2024-11-21, 2 files, -5/+4]
| We previously allowed valid expressions to have stale types as long as those stale types were supertypes of the most precise possible types for the expressions. Allowing stale types like this could mask bugs where we failed to propagate precise type information, though.
|
| Make validation stricter by requiring all expressions except for control flow structures to have the most precise possible types. Control flow structures are exempt because many passes that can refine types wrap the refined expressions in blocks with the old type to avoid the need for refinalization. This pattern would be broken and we would need to refinalize more frequently without this exception for control flow structures.
|
| Now that all non-control flow expressions must have precise types, remove functionality relating to building select instructions with non-precise types. Since finalization of selects now always calculates a LUB rather than using a provided type, remove the type parameter from BinaryenSelect in the C and JS APIs.
|
| Now that stale types are no longer valid, fix a bug in TypeSSA where it failed to refinalize module-level code. This bug previously would not have caused problems on its own, but the stale types could cause problems for later runs of Unsubtyping. Now the stale types would cause TypeSSA output to fail validation.
|
| Also fix a bug where Builder::replaceWithIdenticalType was in fact replacing with refined types. Fixes #7087.
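The LUB rule for selects mentioned above can be illustrated with a small sketch. This is not Binaryen code: the type names and the single-parent hierarchy below are hypothetical stand-ins chosen only to show what "most precise possible type" means for a select with two arms.

```python
# Toy subtype hierarchy: each type maps to its direct supertype (None = top).
# These names are illustrative assumptions, not Binaryen's real type system.
SUPERTYPE = {
    "any": None,
    "eq": "any",
    "struct": "eq",
    "array": "eq",
    "i31": "eq",
}


def supertype_chain(t):
    """Return t followed by all of its supertypes, most precise first."""
    chain = []
    while t is not None:
        chain.append(t)
        t = SUPERTYPE[t]
    return chain


def lub(a, b):
    """Least upper bound: the most precise type that is a supertype of both."""
    ancestors_of_a = set(supertype_chain(a))
    for t in supertype_chain(b):
        if t in ancestors_of_a:
            return t
    return None


def is_stale(declared, if_true, if_false):
    """A select's declared type is stale if it is not exactly the LUB of its arms."""
    return declared != lub(if_true, if_false)
```

Under this toy model, a select over a `struct` arm and an `array` arm must be typed `eq`; typing it `any` is a valid supertype but would count as stale under the stricter rule.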
* Fuzzer: Legalize and prune the JS interface in pickPasses (#7092) [Alon Zakai, 2024-11-20, 1 file, -0/+7]
| | | | Also add a test that the ClusterFuzz run.py does not warn, which was helpful when debugging this.
* Improve fuzzing of both closed and open world styles of modules (#7090) [Alon Zakai, 2024-11-19, 2 files, -21/+17]
| Before, we would simply not export a function that had, e.g., an anyref param. As a result, the modules were effectively "closed", which was good for testing full closed-world mode, but not for testing degrees of open world. To improve that, this PR allows the fuzzer to export such functions, and adds an "enclose world" pass that "closes" the wasm (makes it more compatible with closed-world). That pass is run 50% of the time, giving us coverage of both styles.
* Fuzzing: ClusterFuzz integration (#7079) [Alon Zakai, 2024-11-19, 2 files, -14/+123]
| The main addition here is a bundle_clusterfuzz.py script which will package up the exact files that should be uploaded to ClusterFuzz. It also documents the process of bundling and testing. You can do
|
|   bundle.py OUTPUT_FILE.tgz
|
| That bundles wasm-opt from ./bin, which is enough for local testing. For actually uploading to ClusterFuzz, we need a portable build, and @dschuff had the idea to reuse the emsdk build, which works nicely. Doing
|
|   bundle.py OUTPUT_FILE.tgz --build-dir=/path/to/emsdk/upstream/
|
| will bundle wasm-opt (+libs) from the emsdk. I verified that those builds work on ClusterFuzz.
|
| I added several forms of testing here. First, our main fuzzer fuzz_opt.py now has a ClusterFuzz testcase handler, which simulates a ClusterFuzz environment. Second, there are smoke tests that run in the unit test suite, and can also be run separately:
|
|   python -m unittest test/unit/test_cluster_fuzz.py
|
| Those unit tests can also run on a given bundle, e.g. one created from an emsdk build, for testing right before upload:
|
|   BINARYEN_CLUSTER_FUZZ_BUNDLE=/path/to/bundle.tgz python -m unittest test/unit/test_cluster_fuzz.py
|
| A third piece of testing is to add a --fuzz-passes test. That is a mode for -ttf (translate random data into a valid wasm fuzz testcase) that uses random data to pick and run a set of passes, to further shape the wasm. (--fuzz-passes had no previous testing, and this PR fixes it and tidies it up a little, adding some newer passes too).
|
| Otherwise this PR includes the key run.py script that is bundled and then executed by ClusterFuzz, basically a python script that runs wasm-opt -ttf [..] to generate testcases, sets up their JS, and emits them.
|
| fuzz_shell.js, which is the JS to execute testcases, will now check if it is provided binary data of a wasm file. If so, it does not read a wasm file from argv[1]. (This is needed because ClusterFuzz expects a single file for the testcase, so we make a JS file with bundled wasm inside it.)
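The single-file testcase idea described above (a JS file with the wasm bundled inside it) might look roughly like the following. This is only a sketch under assumptions: the function name `make_js_testcase` and the JS variable name `binary` are hypothetical, and the real run.py and fuzz_shell.js may do this differently.

```python
import json


def make_js_testcase(wasm_bytes, shell_js):
    """Prepend the wasm binary, as a JS byte array, to the JS shell source.

    The variable name `binary` is an assumption made for illustration; the
    real shell may look for its embedded data under a different name.
    """
    byte_list = json.dumps(list(wasm_bytes))
    prelude = "var binary = new Uint8Array(%s);\n" % byte_list
    return prelude + shell_js


# Usage: the four-byte wasm magic number stands in for a real module.
testcase = make_js_testcase(b"\0asm", "/* shell code would follow here */")
```

The point of the design is that the resulting text is one self-contained file, so the shell never needs to read a separate wasm file from its arguments.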
* [wasm64] Fuzzer: Fix type of unimported offsets (#7071) [Alon Zakai, 2024-11-11, 1 file, -2/+2]
| When the fuzzer sees an imported segment, it makes it non-imported (because imported ones would trap when we tried to run them: we don't have the normal runtime they expect). We had hardcoded i32 offsets there, which need to be generalized.
* [EH] Fuzz calls from JS by calling wasm exports, sometimes catching (#7067) [Alon Zakai, 2024-11-08, 3 files, -5/+136]
| This adds two new imports to fuzzer modules:
|
| * call-export, which gets an export index and calls it.
| * call-export-catch, which does the call in a try-catch, swallowing any error, and returning 1 if it saw an error.
|
| The former gives us calls back into the wasm, possibly making various trips between wasm and JS in interesting ways. The latter adds a try-catch which helps fuzz wasm EH.
|
| We do these calls using a wasm export index, i.e., the index in the list of exports. This is simple, but it does have the downside that it makes executing the wasm sensitive to changes in exports (e.g. wasm-merge adds more), which requires some handling in the fuzzer.
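The behavior of the two imports can be modeled in a few lines. This is a simplified simulation, not the fuzzer's JS: exports are plain Python callables in a list, and returning 0 when no error occurs is an assumption (the message only states that 1 is returned on error).

```python
def call_export(exports, index):
    """Call the export at the given position in the export list."""
    exports[index]()


def call_export_catch(exports, index):
    """Call the export, swallowing any error; return 1 if an error was seen.

    Returning 0 on success is an assumption made for this sketch.
    """
    try:
        exports[index]()
        return 0
    except Exception:
        return 1


def ok():
    pass


def traps():
    raise RuntimeError("simulated trap")


exports = [ok, traps]
```

Because exports are addressed by position, inserting a new export at the front of the list changes what index 0 refers to, which is the sensitivity the message describes.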
* Rename indexType -> addressType. NFC (#7060) [Sam Clegg, 2024-11-07, 3 files, -17/+17]
| | | See https://github.com/WebAssembly/memory64/pull/92
* [wasm64] Fix copying of 64-bit tables, and fuzz them (#7065) [Alon Zakai, 2024-11-07, 1 file, -2/+20]
| | | | `ModuleUtils::copyTable` was not copying the `indexType` property.
* [wasm64] Fuzz wasm64 memories (#7064) [Alon Zakai, 2024-11-07, 2 files, -8/+27]
| * Remove the code that prevented fuzzing wasm64 test files.
| * Ignore a run that hits the V8 implementation limit on memory size.
| * Disable wasm64 fuzzing in wasm2js (like almost all post-MVP features).
| * Add fuzzer logic to emit a 64-bit memory sometimes.
| * Fix various places in the fuzzer that assumed 32-bit indexes.
* [wasm64] Make interpreter table methods operate on Address, not Index (#7062) [Alon Zakai, 2024-11-07, 2 files, -5/+6]
| | | This allows 64-bit bounds checking to work properly.
* [wasm64] Fix wasm-ctor-eval + utils on 64-bit indexes for memory64 (#7059) [Alon Zakai, 2024-11-06, 1 file, -3/+5]
| | | | Some places assumed a 32-bit index.
* [NFC] Use RAII to manage call depth tracking in the interpreter (#7049) [Alon Zakai, 2024-11-01, 1 file, -1/+1]
| | | | | | | The old code manually managed it for no good reason that I can see. After this, there is no difference between callFunction and callFunctionInternal, so fold them together.
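The RAII idea translates to a scope guard: the depth counter is incremented on entry and decremented on every exit path, including exceptions. The sketch below uses a Python context manager as an analogue of a C++ RAII object; the class name, method names, and depth limit are all hypothetical and not taken from the interpreter.

```python
from contextlib import contextmanager


class Interpreter:
    MAX_DEPTH = 3  # hypothetical limit, kept tiny for the example

    def __init__(self):
        self.depth = 0

    @contextmanager
    def call_depth_scope(self):
        """Increment the depth on entry and always restore it on exit."""
        self.depth += 1
        try:
            if self.depth > self.MAX_DEPTH:
                raise RecursionError("call depth limit exceeded")
            yield
        finally:
            self.depth -= 1

    def call_function(self, fn, *args):
        # A single entry point suffices once the guard owns the bookkeeping.
        with self.call_depth_scope():
            return fn(self, *args)


def recurse(interp, n):
    """Call back into the interpreter n more times."""
    return 0 if n == 0 else 1 + interp.call_function(recurse, n - 1)
```

With manual bookkeeping, an early return or exception can skip the decrement; with the guard, the counter is back at zero after any call, whether it succeeded or failed.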
* Fuzz the Table from JS (#7042) [Alon Zakai, 2024-10-31, 3 files, -6/+117]
| Continuing the work from #7027, which added throwing from JS, this adds table get/set operations from JS, to further increase our coverage of Wasm/JS interactions (the table can be used from both sides).
* Don't strip target features in wasm-emscripten-finalize (#7043) [Derek Schuff, 2024-10-30, 1 file, -2/+0]
| | | | | This makes the behavior consistent with emcc builds where we don't run finalization, and potentially makes testing and debugging easier. Emscripten still strips the target features section when optimizing.
* [EH] Fuzz throws from JS (#7027) [Alon Zakai, 2024-10-23, 3 files, -38/+85]
| | | | | | | | | | | We already generated (throw ..) instructions in wasm, but it makes sense to model throws from outside as well, as they cross the module boundary. This adds a new fuzzer import to the generated modules, "throw", that just does a throw from JS etc. Also be more precise about handling fuzzing-support imports in fuzz-exec: we now check that logging functions start with "log*" and error otherwise (this check is now needed given we have "throw", which is not logging). Also fix a minor issue with name conflicts for logging functions by using getValidFunctionName for them, both for logging and for throw.
* [Wasm GC] Fuzz BrOn (#7006) [Alon Zakai, 2024-10-16, 2 files, -6/+119]
|
* Fuzzer: Generate TryTables (#6987) [Alon Zakai, 2024-10-07, 2 files, -0/+69]
| | | | Also make Try/TryTables with type none, and not just concrete types as before.
* [FP16] Implement conversion operations. (#6974) [Brendan Dahl, 2024-09-26, 1 file, -1/+5]
| Note: FP16 is a little different from F32/F64 since it can't represent the full 2^16 integer range. 65504 is the largest finite FP16 value (and thus the max whole integer). This leads to some slightly strange behavior when converting integers greater than 65504 since they become infinity.
|
| Specified at https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
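The 65504 limit can be checked with Python's standard `struct` module, whose `e` format is IEEE 754 half precision. This is only an illustration of the number format, not of Binaryen's implementation. One caveat: `struct` raises `OverflowError` instead of producing infinity, so the helper below (a hypothetical name) maps that case to infinity, and under its round-to-nearest behavior values just above 65504 round back down rather than overflowing.

```python
import math
import struct


def to_f16(x):
    """Round a number to the nearest half-precision value.

    Python's struct refuses to pack finite values that overflow FP16, so the
    overflow case is mapped to +/-infinity here to mimic an IEEE conversion.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", float(x)))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


largest = to_f16(65504)     # exactly representable
overflow = to_f16(70000)    # too large for FP16
```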
* [wasm-split] Configure split functions rather than kept functions (#6949) [Thomas Lively, 2024-09-17, 1 file, -8/+3]
| The configuration for the module splitting utility previously took a set of functions to keep in the primary module. Change it to take a list of functions to split into the secondary module instead. This improves the code quality in multi-split mode because it keeps stub functions generated by previous splits from being moved into secondary modules during later splits.
* [wasm-split] Simplify handling of --keep-funcs and --split-funcs (#6948) [Thomas Lively, 2024-09-17, 3 files, -67/+67]
| Maintain the invariant that every defined function belongs to either the set of kept functions or the set of split functions. Functions are kept by default except when --keep-funcs is specified without --split-funcs on the command line.
|
| This is mostly NFC except that it changes the default behavior when no arguments are specified on the command line to keep all functions.
|
| This will simplify a follow-on PR that switches from passing the kept functions to the module splitting utility to passing the split functions.
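The invariant and the default rule above can be written down as a small partition function. This is a sketch of the stated rule only, not wasm-split's code: the function name is hypothetical, `None` stands for "flag not given", and letting the keep list win when a function appears in both lists is an assumption.

```python
def partition_functions(defined, keep=None, split=None):
    """Assign every defined function to exactly one of (kept, split).

    keep/split are the lists given on the command line, or None if the
    corresponding flag was not specified.
    """
    defined = set(defined)
    kept = set(keep or ()) & defined
    # Assumption: a function named in both lists is kept.
    split_set = (set(split or ()) & defined) - kept
    rest = defined - kept - split_set

    if keep is not None and split is None:
        # Only --keep-funcs given: everything else is split.
        split_set |= rest
    else:
        # Otherwise functions are kept by default.
        kept |= rest
    return kept, split_set
```

Whatever flags are passed, the two returned sets are disjoint and together cover all defined functions, which is the invariant the message describes.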
* [wasm-split] Run RemoveUnusedElements on secondary modules (#6945) [Thomas Lively, 2024-09-17, 1 file, -5/+6]
| | | | | | | | | Rather than analyze what module elements from the primary module a secondary module will need, the splitting logic conservatively imports all module elements from the primary module into the secondary module. Run RemoveUnusedElements on the secondary module to remove any of these imports that happen to be unnecessary. Leave a TODO mentioning the possibility of being more selective about which module elements get exported to reduce code size in the primary module, too.
* [wasm-split] Add a multi-split mode (#6943) [Thomas Lively, 2024-09-16, 3 files, -3/+121]
| | | | | | | Add a mode that splits a module into arbitrarily many parts based on a simple manifest file. This is currently implemented by splitting out one module at a time in a loop, but this could change in the future if splitting out all the modules at once would improve the quality of the output.
* [wasm-split] Add an option to skip importing placeholders (#6942) [Thomas Lively, 2024-09-16, 3 files, -0/+11]
| | | | | | | | | | | | | | Wasm-split generally assumes that calls to secondary functions made before the secondary module has been loaded and instantiated should go to imported placeholder functions that can be responsible for loading the secondary module and forwarding the call to the loaded function. That scheme makes the loading entirely transparent from the application's point of view, which is not always a good thing. Other schemes would make it impossible for a secondary function to be called before the secondary module has been explicitly loaded, in which case the placeholder functions would never be called. To improve code size and simplify instantiation under these schemes, add a new `--no-placeholders` option that skips adding imported placeholder functions.
* Replace the old topological sort everywhere (#6902) [Thomas Lively, 2024-09-10, 1 file, -27/+9]
| | | | | | | | | To avoid having two separate topological sort utilities in the code base, replace remaining uses of the old DFS-based, CRTP topological sort with the newer Kahn's algorithm implementation. This would be NFC, except that the new topological sort produces a different order than the old topological sort, so the output of some passes is reordered.
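For readers unfamiliar with Kahn's algorithm, here is a minimal generic version. It is a textbook sketch, not Binaryen's utility: the function name and the dict-of-successors graph representation are choices made for this example.

```python
from collections import deque


def topological_sort(graph):
    """Kahn's algorithm. graph maps each node to an iterable of successors."""
    # Count incoming edges for every node, including nodes that only
    # appear as successors.
    indegree = {node: 0 for node in graph}
    for successors in graph.values():
        for succ in successors:
            indegree[succ] = indegree.get(succ, 0) + 1

    # Start with the nodes that have no prerequisites.
    ready = deque(node for node, degree in indegree.items() if degree == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for succ in graph.get(node, ()):
            indegree[succ] -= 1
            if indegree[succ] == 0:
                ready.append(succ)

    if len(order) != len(indegree):
        raise ValueError("graph has a cycle")
    return order
```

A DFS-based sort and Kahn's algorithm both produce valid orders, but generally not the same one, which is consistent with the reordered pass output noted in the message.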
* Add a --preserve-type-order option (#6916) [Thomas Lively, 2024-09-10, 11 files, -14/+55]
| Unlike other module elements, types are not stored on the `Module`. Instead, they are collected by traversing the IR before printing and binary writing. The code that collects the types tries to optimize the order of rec groups based on the number of times each type is used. As a result, the output order of types generally has no relation to the input order of types. In addition, most type optimizations rewrite the types into a single large rec group, and the order of types in that group is essentially arbitrary.
|
| Changes to the code for counting type uses, sorting types, or sorting rec groups can yield very large changes in the output order of types, producing test diffs that are hard to review and potentially harming the readability of tests by moving output types away from the corresponding input types.
|
| To help make test output more stable and readable, introduce a tool option that causes the order of output types to match the order of input types as closely as possible. It is implemented by having the parsers record the indices of the input types on the `Module` just like they already record the type names. The `GlobalTypeRewriter` infrastructure used by type optimizations associates the new types with the old indices just like it already does for names and also respects the input order when rewriting types into a large recursion group.
|
| By default, wasm-opt and other tools clear the recorded type indices after parsing the module, so their default behavior is not modified by this change.
|
| Follow-on PRs will use the new flag in more tests, which will generate large diffs but leave the tests in stable, more readable states that will no longer change due to other changes to the optimizing type sorting logic.
* [NFC] Rename the old topological sort utility (#6914) [Thomas Lively, 2024-09-06, 1 file, -2/+2]
| | | | This will allow both the old and new topological sort utilities to be included into the same .cpp file while we phase out the old utility.
* [NFC] Refactor LocalGraph's core getSets API (#6877) [Alon Zakai, 2024-08-28, 1 file, -2/+2]
| | | | | | | | | | | | | | Before we just had a map that people would access with localGraph.getSetses[get], while now it is a call localGraph.getSets(get), which more nicely hides the internal implementation details. Also rename getSetses => getSetsMap. This will allow a later PR to optimize the internals of this API. This is performance-neutral as far as I can measure. (We do replace a direct read from a data structure with a call, but the call is in a header and should always get inlined.)
* [FP16] Implement unary operations. (#6867) [Brendan Dahl, 2024-08-27, 1 file, -25/+36]
| | | | Specified at https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
* [FP16] Add a feature flag for FP16. (#6864) [Brendan Dahl, 2024-08-22, 2 files, -117/+131]
| | | Ensure the "fp16" feature is enabled for FP16 instructions.
* [FP16] Implement arithmetic operations. (#6855) [Brendan Dahl, 2024-08-21, 1 file, -0/+6]
| | | | Specified at https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
* Support `ref.extern n` in spec tests (#6858) [Thomas Lively, 2024-08-21, 1 file, -1/+2]
| | | | | | | | | | | | | | | | | Spec tests pass the value `ref.extern n`, where `n` is some integer, into exported functions that expect to receive externrefs and receive such values back out as return values. The payload serves to distinguish externrefs so the test can assert that the correct one was returned. Parse these values in wast scripts and represent them as externalized i31refs carrying the payload. We will need a different representation eventually, since some tests explicitly expect these externrefs to not be i31refs, but this suffices to get several new tests passing. To get the memory64 version of table_grow.wast passing, additionally fix the interpreter to handle growing 64-bit tables correctly. Delete the local versions of the upstream tests that can now be run successfully.
* Add the upstream spec testsuite as a submodule (#6853) [Thomas Lively, 2024-08-20, 1 file, -0/+3]
| | | | | | Run the upstream tests by default, except for a large list of them that do not successfully run. Remove the local version of those that do successfully run where the local version is entirely subsumed by the upstream version.
* [NFC] Use HeapType::getKind more broadly (#6846) [Thomas Lively, 2024-08-19, 3 files, -117/+157]
| | | | | | | | Replace code that checked `isStruct()`, `isArray()`, etc. in sequence with uses of `HeapType::getKind()` and switch statements. This will make it easier to find the code that needs updating if/when we add new heap type kinds in the future. It also makes it much easier to find code that already needs updating to handle continuation types by grepping for "TODO: cont".
* Fix direct comparisons with unshared basic heap types (#6845) [Thomas Lively, 2024-08-16, 1 file, -3/+5]
| Audit the remaining occurrences of `== HeapType::` and fix those that did not handle shared types correctly. Add tests for some of the fixes; others are NFC but clarify the code.
* Implement table.init (#6827) [Alon Zakai, 2024-08-16, 1 file, -9/+15]
| Also use TableInit in the interpreter to initialize the module's table state, which will now handle traps properly, fixing #6431.
* [FP16] Implement relation operations. (#6825) [Brendan Dahl, 2024-08-09, 1 file, -0/+7]
| | | | Specified at https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
* [FP16] Implement lane access instructions. (#6821) [Brendan Dahl, 2024-08-08, 1 file, -0/+1]
| | | | Specified at https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
* Add a utility for comparing and hashing rec group shapes (#6808) [Thomas Lively, 2024-08-07, 1 file, -0/+94]
| | | | | | | | | | | This is very similar to the internal utilities for canonicalizing rec groups in the type system implementation, except that the new utility also supports ordered comparison of rec groups, and of course the new utility only uses the public type API. A follow-up PR will replace the internal implementation of rec group comparison and hashing in the type system with this one. Another follow-up PR will use this new utility in a type optimization.
* Restore isString type methods (#6815) [Thomas Lively, 2024-08-06, 3 files, -8/+4]
| PR #6803 proposed removing Type::isString and HeapType::isString in favor of more explicit, verbose callsites. There was no consensus to make this change, but it was accidentally committed as part of #6804. Revert the accidental change, except for the useful, noncontroversial parts, such as fixing the `isString` implementation and a few other locations to correctly handle shared types.
* Fix sharedness bug in inhabitable type fuzzer (#6807) [Thomas Lively, 2024-08-06, 1 file, -1/+2]
| The code for collecting inhabitable types incorrectly considered shared, non-nullable externrefs to be inhabitable. This disagreed with the code for rewriting types to be inhabitable, which was correct, and the mismatch caused the type fuzzer to report an error.
* [NFC] Add HeapType::getKind returning a new HeapTypeKind enum (#6804) [Thomas Lively, 2024-08-06, 3 files, -4/+9]
| | | | | | | | | | | | | | | | | The HeapType API has functions like `isBasic()`, `isStruct()`, `isSignature()`, etc. to test the classification of a heap type. Many users have to call these functions in sequence and handle all or most of the possible classifications. When we add a new kind of heap type, finding and updating all these sites is a manual and error-prone process. To make adding new heap type kinds easier, introduce a new API that returns an enum classifying the heap type. The enum can be used in switch statements and the compiler's exhaustiveness checker will flag use sites that need to be updated when we add a new kind of heap type. This commit uses the new enum internally in the type system, but follow-on commits will add new uses and convert uses of the existing APIs to use `getKind` instead.
* [wasm-reduce] Do not crash on non-func element segments (#6778) [Thomas Lively, 2024-07-26, 1 file, -10/+5]
| | | | Generalize the code for simplifying element segments to handle more than just null and funcref elements.
* [NFC] Add HeapType::isMaybeShared(BasicHeapType) utility (#6773) [Thomas Lively, 2024-07-18, 1 file, -2/+1]
| | | | | | | | | This abbreviates a common pattern where we first had to check whether a heap type was basic, then if it was, get its unshared version and compare it to some expected BasicHeapType. Suggested in https://github.com/WebAssembly/binaryen/pull/6771#discussion_r1683005495.
* [threads] Update the fuzzer for shared types (#6771) [Thomas Lively, 2024-07-18, 2 files, -54/+90]
| | | | | | | | Update the fuzzer to both handle shared types in initial contents and create and use new shared types without crashing or producing invalid modules. Since V8 does not have a complete implementation of shared-everything-threads yet, disable fuzzing V8 when shared-everything is enabled. To avoid losing too much coverage of V8, disable shared-everything in the fuzzer more frequently than other features.
* Make it possible to skip several passes (#6714) [Jérôme Vouillon, 2024-07-17, 1 file, -1/+1]
| --skip-pass can now be specified more than once on the command line.
* Simplify fuzzer generation of function references (#6745) [Thomas Lively, 2024-07-15, 1 file, -17/+11]
| | | | | | | | | | | | When creating a reference to `func`, fix the probability of choosing to continue on to choose some function other than the last one rather than making it depend on the number of functions. Then, do not eagerly pick from the rest of the candidate functions. Instead, fall through to the more general logic that will already pick a random candidate function. Also move the logic for coming up with a concrete signature down to where it is needed. These simplifications will make it easier to update the code to handle shared types.
* Allow different arguments for multiple instances of a pass (#6687) [Christian Speckner, 2024-07-15, 3 files, -10/+51]
| | | | | | | | | | | | Each pass instance can now store an argument for it, which can be different. This may be a breaking change for the corner case of running a pass multiple times and setting the pass's argument multiple times as well (before, the last pass argument affected them all; now, it affects the last instance only). This only affects arguments with the name of a pass; others remain global, as before (and multiple passes can read them, in fact). See the CHANGELOG for details. Fixes #6646
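The per-instance behavior described above can be modeled as follows. This is a simplified reading of the message, not the tools' actual option parsing: the token format, the function name, and the rule that an argument attaches to the closest preceding instance of the pass it names are assumptions, and `foo` is a made-up pass name.

```python
def assign_pass_args(tokens, pass_names):
    """Model how pass arguments are attached to pass instances.

    tokens is an ordered list of ("pass", name) or ("arg", key, value).
    An argument whose key names a pass goes to the closest earlier instance
    of that pass (an assumption of this sketch); other keys stay global.
    """
    instances = []
    global_args = {}
    for token in tokens:
        if token[0] == "pass":
            instances.append({"name": token[1], "arg": None})
            continue
        _, key, value = token
        if key in pass_names:
            for instance in reversed(instances):
                if instance["name"] == key:
                    instance["arg"] = value
                    break
            # If no earlier instance exists, this sketch ignores the argument.
        else:
            global_args[key] = value
    return instances, global_args


tokens = [
    ("pass", "foo"), ("arg", "foo", "1"),
    ("pass", "foo"), ("arg", "foo", "2"),
    ("arg", "other", "x"),
]
instances, global_args = assign_pass_args(tokens, pass_names={"foo"})
```

In the old model described by the message, both instances of the pass would have seen the last value; here each instance keeps its own.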
* [StackIR] Allow StackIR to be disabled from the commandline (#6725) [Alon Zakai, 2024-07-10, 2 files, -3/+18]
| | | | | | | | | Normally we use it when optimizing (above a certain level). This lets the user prevent it from being used even then. Also add optimization options to wasm-metadce so that this is possible there as well and not just in wasm-opt (this also opens the door to running more passes in metadce, which may be useful later).
* Allow --keepfuncs and --splitfuncs to be use alongside a profile data (#6322) [Benjamin Ling, 2024-07-10, 2 files, -25/+38]
| There are times when, after collecting a profile, we wish to manually include specific functions in the primary module. This could be due to non-deterministic profiling or to functions for error scenarios (e.g. _trap). This PR unlocks this workflow by honoring both the `--keep-funcs` flag and the `--profile` flag.
* Rename external conversion instructions (#6716) [Jérôme Vouillon, 2024-07-08, 1 file, -1/+1]
| Rename instructions `extern.internalize` into `any.convert_extern` and `extern.externalize` into `extern.convert_any` to follow the spec more closely. This was changed in https://github.com/WebAssembly/gc/issues/432. The legacy names are still accepted in text inputs and in the C and JS APIs.