path: root/src/tools
Commit message | Author | Age | Files | Lines
* Remove unused wasm-emscripten-finalize option `--initial-stack-pointer` (#4490) | Sam Clegg | 2022-02-01 | 1 | -8/+0
|
* Interpreter: Remove GlobalManager (#4486) | Alon Zakai | 2022-01-31 | 1 | -84/+21
|
| GlobalManager is another class that added complexity in the interpreter logic, and did not help. In fact it hurts extensibility, as when one wants to extend the interpreter one has another class to customize, and it is templated on the main runner, so again as #4479 we end up with annoying template cycles.
|
| This simply removes that class. That makes the interpreter code strictly simpler. Applying that change to wasm-ctor-eval also ends up fixing a pre-existing bug, so this PR gets testing through that.
|
| The ctor-eval issue was that we did not extend the GlobalManager properly in the past: we checked for accesses on imported globals there, but not in the main class, i.e., not on global.get operations. Needing to do things in two places is an example of the previous complexity.
|
| The fix is simply to implement visitGlobalGet in one place, and remove all the GlobalManager logic added in ctor-eval, which then gets a lot simpler as well.
|
| The new imported-global-2.wast checks for that bug (a global.get of an import should stop us from evalling). Existing tests cover the other cases, like it being ok to read a non-imported global, etc. The existing test indirect-call3.wast required a slight change: There was a global.get of an imported global, which was ignored in the place it happened (an init of an elem segment); the new code checks all global.gets, so it now catches that.
* [NFC] Refactor ModuleInstanceBase+RuntimeExpressionRunner into a single class (#4479) | Alon Zakai | 2022-01-28 | 3 | -32/+29
|
| As recently discussed, the interpreter code is way too complex. Trying to add ctor-eval stuff I need, I got stuck and ended up spending some time to get rid of some of the complexity.
|
| We had a ModuleInstanceBase class which was basically an instance of a module, that is, an execution of it. And internally we have RuntimeExpressionRunner which is a runner that integrates with the ModuleInstanceBase - basically, it uses the runtime info to execute code. For example, the MIB has globals info, and the RER would read it from there.
|
| But these two classes are really just one functionality - an execution of a module. We get rid of some complexity by removing the separation between them, ending up with a class that can run a module.
|
| One set of problems we avoid is that we can now extend the single class in a simple way. Before, we would need to extend both - and inform each other of those changes. That gets "fun" with CRTP which we use everywhere. In other words, each of the two classes depended on the other / would need to be templated on the other. Specifically, MIB.callFunction would need to be given the RER to run with, and so that would need to be templated on it. This ends up leading to a bunch more templating all around - all complexity that we just don't need. See the simplification to the wasm-ctor-eval for some of that (and even worse complexity would have been needed without this PR in the next steps for that tool to eval GC stuff).
|
| The final single class is now called ModuleRunner.
|
| Also fixes a pre-existing issue uncovered by this PR. We had the delegate target on the runner, but it should be tied to a function scope. This happened to not be a problem if one always created a new runner for each scope, but this PR makes the runner longer-lived, so the stale data ended up mattering. The PR moves that data to the proper place.
|
| Note: Diff without whitespace is far, far smaller.
* Fuzzer: Fix a missing return of a trap (#4485) | Alon Zakai | 2022-01-28 | 1 | -0/+1
|
| We emitted the right text to stdout to indicate a trap in one code path, but did not return a Trap from the function. As a result, we'd continue and hit the assert on the next line.
* wasm-emscripten-finalize: Remove legacy --new-pic-abi option (#4483) | Sam Clegg | 2022-01-27 | 1 | -6/+0
|
* Make `TypeBuilder::build()` fallible (#4474) | Thomas Lively | 2022-01-25 | 1 | -1/+6
|
| It is possible for type building to fail, for example if the declared nominal supertypes form a cycle or are structurally invalid. Previously we would report a fatal error and kill the program from inside `TypeBuilder::build()` in these situations, but this handles errors at the wrong layer of the code base and is inconvenient for testing the error cases.
|
| In preparation for testing the new error cases introduced by isorecursive typing, make type building fallible and add new tests for existing error cases. Also fix supertype cycle detection, which it turns out did not work correctly.
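As a rough illustration of what "fallible" building can look like, here is a minimal sketch in which the builder returns either the built types or an error value instead of aborting. All names here (`BuildError`, `BuildResult`, `buildTypes`) are hypothetical stand-ins for illustration and are not Binaryen's actual API; types are modeled as plain indices and only supertype cycles are detected.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for illustration only; not Binaryen's real types.
struct BuildError {
  std::size_t index;  // which declared type the error was found on
  std::string reason; // human-readable description
};

// Either the built types (modeled here as indices) or an error.
using BuildResult = std::variant<std::vector<int>, BuildError>;

// supertypes[i] is the index of type i's declared supertype, or -1 for none.
BuildResult buildTypes(const std::vector<int>& supertypes) {
  const std::size_t n = supertypes.size();
  for (std::size_t i = 0; i < n; ++i) {
    // Walk up the supertype chain. Taking more than n steps means we looped.
    int curr = supertypes[i];
    std::size_t steps = 0;
    while (curr != -1) {
      if (curr < 0 || static_cast<std::size_t>(curr) >= n) {
        return BuildError{i, "invalid supertype index"};
      }
      if (++steps > n) {
        return BuildError{i, "supertype cycle"};
      }
      curr = supertypes[curr];
    }
  }
  std::vector<int> built(n);
  for (std::size_t i = 0; i < n; ++i) {
    built[i] = static_cast<int>(i);
  }
  return built;
}
```

Because the error is an ordinary return value, the caller decides whether it is fatal, and a unit test can check the error case directly with `std::holds_alternative<BuildError>(result)`.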
* Introduce gtest (#4466) | Thomas Lively | 2022-01-20 | 1 | -10/+0
|
| Add gtest as a git submodule in third_party and integrate it into the build the same way WABT does. Adds a new executable, `binaryen-unittests`, to execute `gtest_main`. As a nontrivial example test, port one of the `TypeBuilder` tests from example/ to gtest/.
|
| Using gtest has a number of advantages over the current example tests:
|
| - Tests are compiled and linked at build time rather than runtime, surfacing errors earlier and speeding up test execution.
| - Tests are all built into a single binary, reducing overall link time and further reducing test overhead.
| - Tests are built from the same CMake project as the rest of Binaryen, so compiler settings (e.g. sanitizers) are applied uniformly rather than having to be separately set via the COMPILER_FLAGS environment variable.
| - Using the industry-standard gtest rather than our own script reduces our maintenance burden.
|
| Using gtest will lower the barrier to writing C++ tests and will hopefully lead to us having more proper unit tests.
* Add a `--hybrid` type system option (#4460) | Thomas Lively | 2022-01-19 | 1 | -0/+9
|
| Eventually this will enable the isorecursive hybrid type system described in https://github.com/WebAssembly/gc/pull/243, but for now it just throws a fatal error if used.
* Add --no-emit-metadata option to wasm-emscripten-finalize (#4450) | Sam Clegg | 2022-01-19 | 1 | -3/+14
|
| This is useful for the case where we might want to finalize without extracting metadata.
|
| See: https://github.com/emscripten-core/emscripten/pull/15918
* LiteralList => Literals (#4451) | Alon Zakai | 2022-01-13 | 3 | -6/+6
|
| LiteralList overlaps with Literals, but is less efficient as it is not a SmallVector.
|
| Add reserve/capacity methods to SmallVector which are now necessary to compile.
* [ctor-eval] Eval functions with params if ignoring external input (#4446) | Alon Zakai | 2022-01-12 | 1 | -6/+24
|
| When ignoring external input, assume params have a value of 0. This makes it possible to eval main(argc, argv) if one is careful and does not actually use those values.
|
| This is basically a workaround for main always receiving argc/argv, even if the C code has no args (in that case the compiler emits __original_main for the user's main, and wraps it with a main that adds the args, hence the problem).
|
| This is similar to the existing support for handling wasi_args_get when ignoring external input, although it just sets values of zeros for the params. Perhaps it could check for main() specifically and return 1 for argc and a proper buffer for argv somehow, but I think if a program wants to use --ignore-external-input it can avoid actually reading argc/argv.
* [ctor-eval] Followup refactoring to use std::optional for EvalCtorOutcome (#4448) | Alon Zakai | 2022-01-12 | 1 | -21/+16
|
* [ctor-eval] Eval functions with a return value (#4443) | Alon Zakai | 2022-01-12 | 1 | -24/+49
|
| This is necessary for e.g. main() which returns an i32.
* [ctor-eval] Stop if there are any memory.init instructions (#4442) | Alon Zakai | 2022-01-11 | 1 | -18/+25
|
| This tool depends (atm) on flattening memory segments. That is not compatible with memory.init which cares about segment identities.
|
| This changes flatten() only by adding the check for MemoryInit. The rest is unchanged, although I saw the other two params are not needed and I removed them while I was there.
* [ctor-eval] Add an option to keep some exports (#4441) | Alon Zakai | 2022-01-11 | 1 | -15/+39
|
| By default wasm-ctor-eval removes exports that it manages to completely eval (if it just partially evals then the export remains, but points to a function with partially-evalled contents). However, in some cases we do want to keep the export around even so, for example during fuzzing (as the fuzzer wants to call the same exports before and after wasm-ctor-eval runs) and also if there is an ABI we need to preserve (like if we manage to eval all of main()), or if the function returns a value (which we don't support yet, but this is a PR to prepare for that).
|
| Specifically, there is now a new option:
|
|   --kept-exports foo,bar
|
| That is a list of exports to keep around.
|
| Note that when we keep around an export after evalling the ctor we make the export point to a new function. That new function just contains a nop, so that nothing happens when it is called. But the original function is kept around as it may have other callers, who we do not want to modify.
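The decision described above can be sketched as follows. This is only an illustration under assumptions: `parseKeptExports` and `shouldRemoveExport` are hypothetical helper names, not functions from wasm-ctor-eval, and the sketch does not model re-pointing a kept export at a new nop function.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: split a "foo,bar" argument into {"foo", "bar"}.
// Empty items (e.g. from a trailing comma) are skipped.
std::set<std::string> parseKeptExports(const std::string& arg) {
  std::set<std::string> kept;
  std::istringstream stream(arg);
  std::string item;
  while (std::getline(stream, item, ',')) {
    if (!item.empty()) {
      kept.insert(item);
    }
  }
  return kept;
}

// Hypothetical helper: an export is removed only if it was completely
// evalled and the user did not ask to keep it.
bool shouldRemoveExport(const std::string& name,
                        bool fullyEvalled,
                        const std::set<std::string>& kept) {
  return fullyEvalled && kept.count(name) == 0;
}
```

With `--kept-exports foo,bar`, a fully evalled `foo` would stay exported, while a fully evalled export that is not in the list would still be removed.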
* [ctor-eval] Fix evalling of overlapping table segments (#4440) | Alon Zakai | 2022-01-11 | 1 | -19/+25
|
* [ctor-eval] Partial evaluation (#4438) | Alon Zakai | 2022-01-11 | 1 | -23/+179
|
| This lets us eval part of a function but not all, which is necessary to handle real-world things like __wasm_call_ctors in LLVM output, as that is the single ctor that is exported and it has calls to the actual ctors.
|
| To do so, we look for a toplevel block and execute its items one by one, in a FunctionScope. If we stop in the middle, then we are performing a partial eval. In that case, we only remove the parts of the function that we evalled, and we also serialize the locals whose values we read from the FunctionScope.
|
| For example, consider this:
|
|   function foo() {
|     return 10;
|   }
|
|   function __wasm_call_ctors() {
|     var x;
|     x = foo();
|     x++;
|     // We stop evalling here.
|     import1();
|     import2(x);
|   }
|
| We can eval x = foo() and x++, but we must stop evalling when we reach the first of those imports. The partially-evalled function then looks like this:
|
|   function __wasm_call_ctors() {
|     var x;
|     x = 11;
|     import1();
|     import2(x);
|   }
|
| That is, we evalled two lines of executing code and simply removed them, and then we wrote out the value of the local at that point, and then the rest of the code in the function is as it used to be.
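The item-by-item loop can be modeled in a few lines. This is a toy sketch under assumptions, not the tool's implementation: `Item`, `PartialResult` and `partialEval` are hypothetical names, a toplevel item is either a callable step that updates the locals or an opaque item (such as a call to an import) that cannot be evalled, and locals are a simple name-to-int map.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

using Locals = std::map<std::string, int>;

// Hypothetical model of one toplevel item in the function body.
struct Item {
  std::string text;                  // source text, kept if not evalled
  std::function<void(Locals&)> step; // empty means "cannot eval this item"
};

struct PartialResult {
  std::size_t evalled;                // number of leading items executed
  Locals locals;                      // local values at the stopping point
  std::vector<std::string> remaining; // items left in the function
};

PartialResult partialEval(const std::vector<Item>& body) {
  PartialResult result{0, {}, {}};
  for (const auto& item : body) {
    if (!item.step) {
      break; // first item we cannot eval: stop here
    }
    item.step(result.locals);
    ++result.evalled;
  }
  // Everything from the stopping point onward stays in the function.
  for (std::size_t i = result.evalled; i < body.size(); ++i) {
    result.remaining.push_back(body[i].text);
  }
  return result;
}
```

Feeding it the four items from the example (`x = foo()`, `x++`, `import1()`, `import2(x)`) evals the first two, leaves `x` at 11 in `locals` (the value that would be written back as `x = 11`), and keeps the two import calls in `remaining`.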
* [ctor-eval] Switch logging from stderr to stdout (#4432) | Alon Zakai | 2022-01-07 | 1 | -7/+7
|
| This logging is central to what this tool does, and not optional, so stdout makes more sense I think.
|
| Also, as I'm re-integrating this on the emscripten side, this makes it simpler.
* [ctor-eval] Eval and store changes to globals (#4430) | Alon Zakai | 2022-01-07 | 1 | -16/+11
|
| This is necessary for being able to optimize real-world code, as it lets us use the stack pointer for example. With this PR we allow changes to globals, and we simply store the final state of the global in the global at the end. Basically the same as we do for memory, but for globals.
|
| Remove a test that now fails ("imported2"). Replace it with a nicer test of saving the values of globals. Also add a test for an imported global, which we do not allow (we never did, but I don't see a test for it).
* [ctor-eval] Add --ignore-external-input option (#4428) | Alon Zakai | 2022-01-06 | 1 | -7/+73
|
| This is meant to address one of the main limitations of wasm-ctor-eval in emscripten atm, that libc++ global ctors will read env vars, which means they call an import, which stops us from evalling, emscripten-core/emscripten#15403 (comment)
|
| To handle that, this adds an option to ignore external input. When set, we can assume that no env vars will be read, no reading from stdin, no arguments to main(), etc.
|
| Perhaps these could each be separate options, but I think keeping it simple for now might be good enough.
* [ctor-eval] Refactor an applyToModule() method instead of hacks [NFC] (#4425) | Alon Zakai | 2022-01-06 | 1 | -19/+38
|
| Previously this would hackishly apply all execution changes to the memory all the time, and then "undo" it by saving the state before and copying that in. Instead, this PR makes execution write into a side buffer, and now there is a clear method for when we want to actually apply the results to the module.
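The side-buffer idea can be sketched like this. It is a simplified model under assumptions: `EvalMemory` and its members are hypothetical names, memory is a flat byte vector, and only the method name `applyToModule` is taken from the commit title; the real tool's data structures are not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical model: writes made while evalling go to `pending` and only
// reach the module's memory when applyToModule() is called.
struct EvalMemory {
  std::vector<std::uint8_t> module;            // bytes owned by the module
  std::map<std::size_t, std::uint8_t> pending; // side buffer of evalled writes

  void store(std::size_t addr, std::uint8_t value) { pending[addr] = value; }

  // Reads see pending writes first, then fall back to the module's bytes.
  std::uint8_t load(std::size_t addr) const {
    auto it = pending.find(addr);
    return it != pending.end() ? it->second : module.at(addr);
  }

  // Commit the evalled writes to the module.
  void applyToModule() {
    for (const auto& [addr, value] : pending) {
      module.at(addr) = value;
    }
    pending.clear();
  }

  // Drop the evalled writes; the module was never touched, so there is
  // nothing to undo.
  void discard() { pending.clear(); }
};
```

The point of the design is that a failed eval needs no saved copy of the old state: the module's bytes only change at the one explicit `applyToModule()` call.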
* [ctor-eval] Remove stack hacks (#4429) | Alon Zakai | 2022-01-06 | 1 | -55/+2
|
| Remove some hackish code for fastcomp's stack handling. The stack pointer arrives in an imported global there. Upstream does not do this, so this code is completely unneeded these days (and, frankly, kind of scary as I read it now... it modeled the stack as separate memory from the heap...).
|
| Remove the tests for this as well. I verified that there was nothing else in those tests that we need to keep.
* Add categories to --help text (#4421) | Alon Zakai | 2022-01-05 | 14 | -1/+169
|
| The general shape of the --help output is now:
|
|   ========================
|   wasm-foo
|
|   Does the foo operation
|   ========================
|
|   wasm-foo opts:
|   --------------
|     --foo-bar ..
|
|   Tool opts:
|   ----------
|     ..
|
| The options are now in categories, with the more specific ones - most likely to be wanted by the user - first. I think this makes the list a lot less confusing.
|
| In particular, in wasm-opt all the opt passes are now in their own category.
|
| Also add a script to make it easy to update the help tests.
* [EH] Enable fuzzer with initial contents (#4409) | Heejin Ahn | 2022-01-04 | 2 | -3/+12
|
| This enables fuzzing EH with initial contents. fuzzing.cpp/h does not yet support generation of EH instructions, but with this we can still fuzz EH based on initial contents.
|
| The fuzzer ran successfully for more than 1,900,000 iterations, with my local modification that always enables EH and lets the fuzzer select only EH tests for its initial contents.
* Compare traps in ExecutionResults (#4405) | Heejin Ahn | 2021-12-29 | 2 | -23/+39
|
| We used to only compare return values, and in #4369 we started comparing whether an uncaught exception was thrown. This also adds whether a trap occurred to `ExecutionResults`. So in `--fuzz-exec`, if a program with a trap loses the trap or vice versa, it will error out saying the result has changed, unless either of `--ignore-implicit-traps` or `--traps-never-happen` is set.
* [Fuzzer] Allow empty data in --translate-to-fuzz (#4406) | Heejin Ahn | 2021-12-28 | 1 | -2/+2
|
| When a parameter and a member variable have the same name within a constructor, to access (and change) the member variable, we need to either use `this->` or change the name of the parameter. The current code ended up changing the parameter and didn't affect the status of the member variable, which remained empty.
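The shadowing pitfall described here is easy to reproduce in isolation. The sketch below uses hypothetical class and member names (`Buggy`, `Fixed`, `bytes`), not the fuzzer's actual code, and assumes the intent is "if the input is empty, store a single zero byte".

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Inside the constructor body, the unqualified name `bytes` refers to the
// parameter, which shadows the member of the same name.
struct Buggy {
  std::vector<char> bytes;
  explicit Buggy(std::vector<char> bytes) : bytes(bytes) {
    if (bytes.empty()) {
      bytes.push_back(0); // modifies the parameter; the member stays empty
    }
  }
};

// Using `this->` names the member explicitly, so the change is kept.
struct Fixed {
  std::vector<char> bytes;
  explicit Fixed(std::vector<char> bytes) : bytes(std::move(bytes)) {
    if (this->bytes.empty()) {
      this->bytes.push_back(0);
    }
  }
};
```

Constructing both from an empty vector leaves `Buggy::bytes` empty while `Fixed::bytes` holds one byte. Renaming the parameter would work equally well, and compiling with a shadowing warning such as `-Wshadow` can flag this pattern.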
* Validate LUBs in the type fuzzer (#4396) | Thomas Lively | 2021-12-15 | 1 | -0/+53
|
| Update the LUB calculation code to use std::optional rather than out params and validate LUBs in the fuzzer to ensure that the change is NFC as intended. Also add HeapType::getLeastUpperBound to the public API as a convenience.
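To show the shape of the `std::optional` style, here is a toy least-upper-bound function over a hypothetical single-inheritance hierarchy. It is an assumption-laden sketch: types are plain indices, `parent` and `leastUpperBound` are made-up names, the hierarchy is assumed acyclic, and this is not the algorithm Binaryen uses for its real type lattice.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <vector>

// parent[i] is the supertype of type i, or -1 if i is a root.
// Returns the nearest common supertype of a and b, or std::nullopt if they
// share none. An out-param version would instead look like
//   bool leastUpperBound(..., int& out);
// and leave `out` meaningless on failure.
std::optional<int> leastUpperBound(const std::vector<int>& parent, int a, int b) {
  std::set<int> ancestorsOfA;
  for (int t = a; t != -1; t = parent[t]) {
    ancestorsOfA.insert(t); // includes a itself
  }
  for (int t = b; t != -1; t = parent[t]) {
    if (ancestorsOfA.count(t)) {
      return t; // first ancestor of b that is also an ancestor of a
    }
  }
  return std::nullopt;
}
```

A fuzzer-style validation can then check that any returned value really is a supertype of both inputs, and that a missing value means the two types are in unrelated hierarchies.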
* [EH] Make interpreter handle uncaught exceptions (#4369) | Heejin Ahn | 2021-12-06 | 1 | -24/+28
|
| When a wasm exception is thrown and uncaught in the interpreter, it caused the whole interpreter to crash, rather than gracefully reporting it. This fixes the problem, and also compares whether an uncaught exception happened when comparing the results before and after optimizations in `--fuzz-exec`. To do that, when `--fuzz-exec` is given, we now compare results even when the function does not have return values. Logs for some existing test have changed because of this.
* Modernize code to C++17 (#3104) | Max Graey | 2021-11-22 | 3 | -14/+7
|
* Change from storing Signature to HeapType on CallIndirect (#4352) | Thomas Lively | 2021-11-22 | 2 | -3/+3
|
| With nominal function types, this change makes it so that we preserve the identity of the function type used with call_indirect instructions rather than recreating a function heap type, which may or may not be the same as the originally parsed heap type, from the function signature during module writing.
|
| This will simplify the type system implementation by removing the need to store a "canonical" nominal heap type for each unique signature. We previously depended on those canonical types to avoid creating multiple duplicate function types during module writing, but now we aren't creating any new function types at all.
* Add fixup function for nested pops in catch (#4348) | Heejin Ahn | 2021-11-22 | 1 | -22/+23
|
| This adds `EHUtils::handleBlockNestedPops`, which can be called at the end of passes that have a possibility to put `pop`s inside `block`s. This method assumes there exists a `pop` in a first-descendant line, even though it can be nested within a block. This allows a `pop` to be nested within a `block` or a `try`, but not a `loop`, since that means the `pop` can run multiple times. In case of `if`, `pop` can exist only in its condition; if a `pop` is in its true or false body, that's not in the first-descendant line.
|
| This can be useful when optimization passes create blocks to do transformations. Wrapping expressions with a block does not change semantics most of the time, but if pops happen to be inside a block generated by those passes, they can result in invalid binaries.
|
| To test this, this adds `passes/test_passes.cpp`, which is intended to contain multiple test passes that test a single (or more) utility functions separately. Without this kind of pass, it is hard to test various cases in which nested `pop`s can be generated in existing passes. This PR also adds `PassRegistry::registerTestPass`, which registers a pass that's intended only for internal testing and does not show up in `wasm-opt --help`.
|
| Fixes #4237.
* Check for correct subtyping in the type fuzzer (#4350) | Thomas Lively | 2021-11-20 | 3 | -90/+120
|
| Check that types that were meant to have a subtype relationship actually do. To expose the intended subtyping to the fuzzer, expose `subtypeIndices` in the return value of the type generation function.
* Allow building basic HeapTypes in nominal mode (#4346) | Thomas Lively | 2021-11-19 | 1 | -19/+11
|
| As we work toward allowing nominal and structural types to coexist, any difference in how they can be built or used will be an inconvenient footgun that we will have to work around. In the spirit of reducing the differences between the type systems, allow TypeBuilder to construct basic HeapTypes in nominal mode just as it can in equirecursive mode.
|
| Although this change is a net increase in code complexity for not much benefit (wasm-opt never needs to build basic HeapTypes), it is also an incremental step toward getting rid of separate type system modes, so I expect it to simplify other PRs in the near future.
|
| This change also uncovered a bug in how the type fuzzer generated subtypes of basic HeapTypes. The generated subtypes did not necessarily have the intended `Kind`, which caused failures in nominal subtype validation in the fuzzer.
* Small cleanups in type fuzzer (#4337) | Thomas Lively | 2021-11-17 | 2 | -20/+14
|
| - Do not require defaultable types in function returns
| - Increase likelihood of `none` function return types
| - Correctly generate subtypes of basic types
| - Actually check output in tests
| - Print to cout instead of cerr
* Add a fuzzer specifically for types (#4328) | Thomas Lively | 2021-11-15 | 9 | -120/+851
|
| Add a new fuzzer binary that repeatedly generates random types to find bugs in the type system implementation. Each iteration creates some number of root types followed by some number of subtypes thereof. Each built type can contain arbitrary references to other built types, regardless of their order of construction.
|
| Right now the fuzzer only finds fatal errors in type building (and in its own implementation), but it is meant to be extended to check other properties in the future, such as that LUB calculations work as expected. The logic for creating types is also intended to be integrated into the main fuzzer in a follow-on PR so that the main fuzzer can fuzz with arbitrarily more interesting GC types.
* Fuzz more basic GC types (#4303) | Thomas Lively | 2021-11-04 | 2 | -116/+248
|
| Generate both nullable and non-nullable references to basic HeapTypes and introduce `i31` and `data` HeapTypes. Generate subtypes rather than exact types for all concrete-typed children.
* [NFC] Factor fuzzer randomness into a separate utility (#4304) | Thomas Lively | 2021-11-04 | 5 | -85/+164
|
| In preparation for using it from a separate file specifically for generating random HeapTypes that has no need to depend on all of fuzzing.h.
* [NFC] Create a .cpp file for fuzzer implementation (#4279) | Thomas Lively | 2021-10-26 | 3 | -3081/+3170
|
| Having a monolithic header file containing all the implementation meant there was no good way to split up the code or introduce new files. The new implementation file and source directory will make it much easier to add new fuzzing functionality in new files.
* Reducer: Apply --debug to all commands (#4275) | Alon Zakai | 2021-10-25 | 1 | -3/+4
|
| Do so by applying --debug to extraFlags right at the start. That global is used everywhere already. In particular, this PR removes manually adding -g in the first diff chunk here, and you can see extraFlags appears there already on the previous line.
* Add table.grow operation (#4245) | Max Graey | 2021-10-18 | 1 | -11/+13
|
* Add a --structural flag (#4252) | Thomas Lively | 2021-10-16 | 1 | -2/+9
|
| Just as the --nominal flag forces all types to be parsed as nominal, the --structural flag forces all types to be parsed as equirecursive. This is the current default behavior, but a future PR will change the default to parse types as either structural or nominal according to their syntax or encoding. This new flag will then be necessary to get the current behavior.
|
| Also take this opportunity to deduplicate more flags in the help tests.
* [wasm-metadce] Add support for tags (#4250) | Heejin Ahn | 2021-10-14 | 1 | -0/+17
|
| This adds support for tag-using instructions (`throw` and `catch`) to wasm-metadce. We had to use a hacky workaround in emscripten-core/emscripten#15266 because of the lack of this support; after this lands we can remove it.
* [wasm-metadce] Don't add null names to roots (#4246) | Heejin Ahn | 2021-10-14 | 1 | -7/+5
|
| Not sure why the current code tries to add the name even when it is null, but it causes `dump()` to behave strangely and pollute stdout when it tries to print `root.str`.
|
| Also this changes code printing `Name.str` to printing just `Name`; when `Name.str` is null, it prints `(null Name)` instead of polluting stdout, and it is the recommended way of printing `Name` anyway.
* Add table.size operation (#4224) | Max Graey | 2021-10-08 | 1 | -0/+4
|
* Add table.set operation (#4215) | Max Graey | 2021-10-07 | 1 | -0/+4
|
* Implement table.get (#4195) | Alon Zakai | 2021-09-30 | 1 | -0/+4
|
| Adds the part of the spec test suite that this passes (without table.set we can't do it all).
* Disable partial inlining by default and add a flag for it. (#4191) | Alon Zakai | 2021-09-27 | 1 | -0/+10
|
| Locally I saw a 10% speedup on j2cl but reports of regressions have arrived, so let's disable it for now pending investigation. The option added here should make it easy to experiment.
* [wasm-split] Disallow mixing --profile, --keep-funcs, and --split-funcs (#4187) | Thomas Lively | 2021-09-24 | 2 | -41/+40
|
| Previously the set of functions to keep was initially empty, then the profile added new functions to keep, then the --keep-funcs functions were added, then the --split-funcs functions were removed. This method of composing these different options was arbitrary and not necessarily intuitive, and it prevented reasonable workflows from working. For example, providing only a --split-funcs list would result in all functions being split out no matter which functions were listed.
|
| To make the behavior of these options, and --split-funcs in particular, more intuitive, disallow mixing them and when --split-funcs is used, split out only the listed functions.
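The new rule can be sketched as a small validation step plus a per-function decision. This is a hypothetical model for illustration: `SplitOptions`, `validate` and `shouldSplit` are made-up names, the error text is invented, and profile-driven splitting is not modeled at all (the sketch only covers the two explicit lists).

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <string>

// Hypothetical model of the three options.
struct SplitOptions {
  std::optional<std::string> profile; // --profile
  std::set<std::string> keepFuncs;    // --keep-funcs
  std::set<std::string> splitFuncs;   // --split-funcs
};

// Returns an error message if more than one of the options is used.
std::optional<std::string> validate(const SplitOptions& options) {
  int used = 0;
  if (options.profile.has_value()) ++used;
  if (!options.keepFuncs.empty()) ++used;
  if (!options.splitFuncs.empty()) ++used;
  if (used > 1) {
    return std::string(
      "cannot mix --profile, --keep-funcs, and --split-funcs");
  }
  return std::nullopt;
}

// With --split-funcs, split out only the listed functions. Otherwise
// (assumption in this sketch) split everything not listed in --keep-funcs.
bool shouldSplit(const std::string& func, const SplitOptions& options) {
  if (!options.splitFuncs.empty()) {
    return options.splitFuncs.count(func) > 0;
  }
  return options.keepFuncs.count(func) == 0;
}
```

Under this model, giving only a `--split-funcs` list of `a` splits out `a` and nothing else, which is the workflow the commit message says was previously impossible.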
* Add feature flag for relaxed-simd (#4183) | Ng Zhi An | 2021-09-23 | 1 | -0/+1
|
* Do not use a library for wasm-split files (#4132) | Thomas Lively | 2021-09-08 | 1 | -4/+2
|