path: root/src/tools
Commit message  Author  Age  Files  Lines
* Add CMake option to only build tools needed for Emscripten (#5319)  Derek Schuff  2022-12-02  1  -3/+5
| | | This helps cut the size and build time of the emsdk package.
* Support `array` and `struct` types in the type fuzzer (#5308)  Thomas Lively  2022-12-02  1  -40/+54
| | | | | | | Since `data` has been removed from the upstream proposal and `struct` has been added in its place, update the type fuzzer to be structured around `struct` and `array` (which it had not previously been updated to support) rather than `data`. A follow-on PR will make the broader change of removing `data` and adding `struct`.
* Use C++17's [[maybe_unused]]. NFC (#5309)  Sam Clegg  2022-12-02  3  -7/+3
|
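As a minimal sketch of what this attribute is typically used for (the helper `checkedHalf` is hypothetical and not taken from Binaryen): a variable that is only read inside an `assert` becomes unused when `NDEBUG` is defined, and `[[maybe_unused]]` suppresses the resulting warning without extra casts or macros.

```cpp
#include <cassert>

// Hypothetical helper for illustration only; not Binaryen code.
inline int checkedHalf(int value) {
  // In builds with NDEBUG defined the assert below compiles away, which
  // would leave `isEven` unused. The attribute silences that warning.
  [[maybe_unused]] bool isEven = (value % 2 == 0);
  assert(isEven);
  return value / 2;
}
```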
* [Wasm GC] Implement closed-world flag (#5303)  Alon Zakai  2022-11-30  1  -2/+2
| | | | | | | | | | | | | With this change we default to an open world, that is, we do the safe thing by default: we no longer assume a closed world. Users that want a closed world must pass --closed-world. Atm we just do not run passes that assume a closed world. (We might later refine them to find which types don't escape and only optimize those.) The RemoveUnusedModuleElements is an exception in that the closed-world flag influences one part of its operation, but not the rest. Fixes #5292
* Add a placeholder closed-world flag (#5298)  Alon Zakai  2022-11-29  1  -0/+12
| | | The flag does nothing so far.
* Remove equirecursive typing (#5240)  Thomas Lively  2022-11-23  3  -25/+7
| | | | Equirecursive is no longer standards track and its implementation is extremely complex. Remove it.
* Do not compare reference values across executions (#5276)  Thomas Lively  2022-11-17  1  -18/+10
| | | | | | | Since we optimize assuming a closed world, optimizations can change the types and structure of GC data even in externally-visible ways. Because differences are expected, the fuzzer already did not compare reference-typed values from before and after optimizations when running with nominal typing. Update it to not compare these values under any type system.
* [wasm-split] Improve the error message for bad checksums (#5268)  Thomas Lively  2022-11-16  1  -2/+2
| | | | The previous error message was ambiguous and could easily be interpreted to mean the opposite of what it meant.
* Switch from `typedef` to `using` in C++ code. NFC (#5258)  Sam Clegg  2022-11-15  2  -2/+2
| | | | This is more modern and (IMHO) easier to read than that old C typedef syntax.
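For readers unfamiliar with the two spellings, here is a small sketch of the equivalence. All alias names below are made up for illustration and do not come from the Binaryen sources.

```cpp
#include <map>
#include <string>
#include <type_traits>

// Old C-style spelling: the new name is buried inside the declarator.
typedef std::map<std::string, int> CountsOld;
typedef int (*BinaryOpOld)(int, int);

// Equivalent alias declarations: the new name always comes first.
using Counts = std::map<std::string, int>;
using BinaryOp = int (*)(int, int);

// Alias templates can only be written with `using`.
template<typename T> using NameMap = std::map<std::string, T>;

// Both spellings name exactly the same types.
static_assert(std::is_same_v<CountsOld, Counts>);
static_assert(std::is_same_v<BinaryOpOld, BinaryOp>);
static_assert(std::is_same_v<NameMap<int>, Counts>);
```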
* Update default features to match new llvm defaults (#5212)  Sam Clegg  2022-11-03  1  -2/+2
| | | See: https://reviews.llvm.org/D125728
* Implement `array` basic heap type (#5148)  Thomas Lively  2022-10-18  2  -12/+25
| | | | | | | | | `array` is the supertype of all defined array types and for now is a subtype of `data`. (Once `data` becomes `struct` this will no longer be true.) Update the binary and text parsing of `array.len` to ignore the obsolete type annotation and update the binary emitting to emit a zero in place of the old type annotation and the text printing to print an arbitrary heap type for the annotation. A follow-on PR will add support for the newer unannotated version of `array.len`.
* Make `Name` a pointer, length pair (#5122)  Thomas Lively  2022-10-11  14  -46/+44
| | | | | | | | | | | | | | | | | | | | | | | With the goal of supporting null characters (i.e. zero bytes) in strings. Rewrite the underlying interned `IString` to store a `std::string_view` rather than a `const char*`, reduce the number of map lookups necessary to intern a string, and present a more immutable interface. Most importantly, replace the `c_str()` method that returned a `const char*` with a `toString()` method that returns a `std::string`. This new method can correctly handle strings containing null characters. A `const char*` can still be had by calling `data()` on the `std::string_view`, although this usage should be discouraged. This change is NFC in spirit, although not in practice. It does not intend to support any particular new functionality, but it is probably now possible to use strings containing null characters in at least some cases. At least one parser bug is also incidentally fixed. Follow-on PRs will explicitly support and test strings containing nulls for particular use cases. The C API still uses `const char*` to represent strings. As strings containing nulls become better supported by the rest of Binaryen, this will no longer be sufficient. Updating the C and JS APIs to use pointer, length pairs is left as future work.
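The difference between a `const char*` and a pointer, length pair can be illustrated with the standard library alone. This sketch does not use Binaryen's `IString` or `Name`; the helper names `viaCString` and `viaView` are hypothetical.

```cpp
#include <string>
#include <string_view>

// Round-trips through a C string: the length is rediscovered by scanning
// for the first zero byte, so anything after an embedded null is lost.
inline std::string viaCString(const std::string& s) {
  return std::string(s.c_str());
}

// Round-trips through a pointer, length pair: all s.size() bytes are kept,
// including zero bytes.
inline std::string viaView(std::string_view s) {
  return std::string(s);
}
```

With a five-byte input such as `std::string("ab\0cd", 5)`, the first helper yields a two-byte string and the second preserves all five bytes.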
* Implement bottom heap types (#5115)  Thomas Lively  2022-10-07  4  -24/+77
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | These types, `none`, `nofunc`, and `noextern` are uninhabited, so references to them can only possibly be null. To simplify the IR and increase type precision, introduce new invariants that all `ref.null` instructions must be typed with one of these new bottom types and that `Literals` have a bottom type iff they represent null values. These new invariants require several additional changes. First, it is now possible that the `ref` or `target` child of a `StructGet`, `StructSet`, `ArrayGet`, `ArraySet`, or `CallRef` instruction has a bottom reference type, so it is not possible to determine what heap type annotation to emit in the binary or text formats. (The bottom types are not valid type annotations since they do not have indices in the type section.) To fix that problem, update the printer and binary emitter to emit unreachables instead of the instruction with undetermined type annotation. This is a valid transformation because the only possible value that could flow into those instructions in that case is null, and all of those instructions trap on nulls. That fix uncovered a latent bug in the binary parser in which new unreachables within unreachable code were handled incorrectly. This bug was not previously found by the fuzzer because we generally stop emitting code once we encounter an instruction with type `unreachable`. Now, however, it is possible to emit an `unreachable` for instructions that do not have type `unreachable` (but are known to trap at runtime), so we will continue emitting code. See the new test/lit/parse-double-unreachable.wast for details. Update other miscellaneous code that creates `RefNull` expressions and null `Literals` to maintain the new invariants as well.
* [Fuzzing] Allow recombine() to replace with a subtype (#5101)  Alon Zakai  2022-10-03  1  -4/+43
| | | | Previously it would randomly replace an expression with another one with the exact same type. Allowing a subtype may give us more coverage.
* Refactor interaction between Pass and PassRunner (#5093)  Thomas Lively  2022-09-30  5  -8/+12
| | | | | | | | | | | | | | Previously only WalkerPasses had access to the `getPassRunner` and `getPassOptions` methods. Move those methods to `Pass` so all passes can use them. As a result, the `PassRunner` passed to `Pass::run` and `Pass::runOnFunction` is no longer necessary, so remove it. Also update `Pass::create` to return a unique_ptr, which is more efficient than having it return a raw pointer only to have the `PassRunner` wrap that raw pointer in a `unique_ptr`. Delete the unused template `PassRunner::getLast()`, which looks like it was intended to enable retrieving previous analyses and has been in the code base since 2015 but is not implemented anywhere.
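The `create` change can be pictured with a heavily simplified stand-in. The classes below are assumptions made for illustration (Binaryen's real `Pass` has a much larger interface, and the `const` qualifiers and `name()` method here are invented); the point is only that the factory hands out ownership directly instead of a raw pointer that the caller must wrap.

```cpp
#include <memory>
#include <string>

// Simplified, hypothetical stand-in for a pass base class.
struct Pass {
  virtual ~Pass() = default;
  virtual std::string name() const = 0;
  // Returning unique_ptr makes ownership explicit in the signature, so the
  // runner does not have to wrap a raw pointer after the fact.
  virtual std::unique_ptr<Pass> create() const = 0;
};

struct ExamplePass : Pass {
  std::string name() const override { return "example"; }
  std::unique_ptr<Pass> create() const override {
    return std::make_unique<ExamplePass>();
  }
};
```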
* Temporarily restore the typed-function-references flags as no-ops (#5050)  Thomas Lively  2022-09-16  1  -0/+20
| | | | | This allows a three-step upgrade process where binaryen is updated with this change, then users remove their use of these flags, then binaryen can remove the flags permanently.
* Multi-Memories wasm-split (#4977)  Ashley Nelson  2022-09-15  5  -28/+118
| | | Adds an --in-secondary-memory switch to the wasm-split tool that allows profile data to be stored in a separate memory from module main memory. With this option, users do not need to reserve the initial memory region for profile data and the data can be shared between multiple threads.
* Remove typed-function-references feature (#5030)  Thomas Lively  2022-09-09  2  -8/+3
| | | | | | | | | | | | | | | | In practice typed function references will not ship before GC and are not independently useful, so it's not necessary to have a separate feature for them. Roll the functionality previously enabled by --enable-typed-function-references into --enable-gc instead. This also avoids a problem with the ongoing implementation of the new GC bottom heap types. That change will make all ref.null instructions in Binaryen IR refer to one of the bottom heap types. But since those bottom types are introduced in GC, it's not valid to emit them in binaries unless GC is enabled. When only reference types are enabled, the fix is to emit (ref.null func) instead of (ref.null nofunc), but that doesn't always work if typed function references are enabled because a function type more specific than func may be required. Getting rid of typed function references as a separate feature makes this a nonissue.
* Changing Fatal() to assert() (#4982)  Ashley Nelson  2022-09-09  1  -3/+1
| | | Replacing Fatal() call sites in src/shell-interface.h & src/tools/wasm-ctor-eval.cpp that were added in the Multi-Memories PR with assert()
* [NFC] Remove unused code in type fuzzer (#5023)  Thomas Lively  2022-09-07  1  -67/+0
| | | | | The only call to `generateSubBasic` was removed as part of a bug fix in #4346, but the function itself was not removed. Remove it and other unused functions it depends on now.
* Update fuzzer to newer GC spec regarding JS interop (#4965)  Alon Zakai  2022-08-31  1  -7/+24
| | | | Do not export functions that have types not allowed in the rules for JS interop. Only very few GC types can be on the JS boundary atm.
* [Wasm GC] Support non-nullable locals in the "1a" form (#4959)  Alon Zakai  2022-08-31  2  -1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An overview of this is in the README in the diff here (conveniently, it is near the top of the diff). Basically, we fix up nn locals after each pass, by default. This keeps things easy to reason about - what validates is what is valid wasm - but there are some minor nuances as mentioned there, in particular, we ignore nameless blocks (which are commonly added by various passes; ignoring them means we can keep more locals non-nullable). The key addition here is LocalStructuralDominance which checks which local indexes have the "structural dominance" property of 1a, that is, that each get has a set in its block or an outer block that precedes it. I optimized that function quite a lot to reduce the overhead of running that logic after each pass. The overhead is something like 2% on J2Wasm and 0% on Dart (0%, because in this mode we shrink code size, so there is less work actually, and it balances out). Since we run fixups after each pass, this PR removes logic to manually call the fixup code from various places we used to call it (like eh-utils and various passes). Various passes are now marked as requiresNonNullableLocalFixups => false. That lets us skip running the fixups after them, which we normally do automatically. This helps avoid overhead. Most passes still need the fixups, though - any pass that adds a local, or a named block, or moves code around, likely does. This removes a hack in SimplifyLocals that is no longer needed. Before we worked to avoid moving a set into a try, as it might not validate. Now, we just do it and let fixups happen automatically if they need to: in the common case they probably don't, so the extra complexity seems not worth it. Also removes a hack from StackIR. That hack tried to avoid roundtrip adding a nondefaultable local. But we have the logic to fix that up now, and opts will likely keep it non-nullable as well. 
Various tests end up updated here because now a local can be non-nullable - previous fixups are no longer needed. Note that this doesn't remove the gc-nn-locals feature. That has been useful for testing, and may still be useful in the future - it basically just allows nn locals in all positions (that can't read the null default value at the entry). We can consider removing it separately. Fixes #4824
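The "structural dominance" property described above (each get has a set that precedes it in its own block or in an outer block) can be sketched on a toy IR. Everything below is an assumption made for illustration: the `Node` type, `scan`, and `nonDominatedLocals` are hypothetical, and the real `LocalStructuralDominance` works on Binaryen IR, is heavily optimized, and handles nuances such as nameless blocks that this sketch ignores.

```cpp
#include <set>
#include <vector>

// Toy IR (hypothetical): a node is a local.set, a local.get, or a block.
struct Node {
  enum Kind { Set, Get, Block } kind;
  int local = 0;              // local index, used by Set and Get
  std::vector<Node> children; // nested contents, used by Block
};

// Walks `node` in order. `initialized` holds the locals set so far in the
// current block or an enclosing one; `failing` collects locals that have a
// get with no such preceding set.
inline void scan(const Node& node,
                 std::set<int>& initialized,
                 std::set<int>& failing) {
  switch (node.kind) {
    case Node::Set:
      initialized.insert(node.local);
      break;
    case Node::Get:
      if (!initialized.count(node.local)) {
        failing.insert(node.local);
      }
      break;
    case Node::Block: {
      // Work on a copy: sets made inside the block stop counting once the
      // block ends, while sets from outer blocks remain visible inside it.
      std::set<int> inner = initialized;
      for (const Node& child : node.children) {
        scan(child, inner, failing);
      }
      break;
    }
  }
}

// Returns the locals that lack structural dominance in `body`.
inline std::set<int> nonDominatedLocals(const Node& body) {
  std::set<int> initialized, failing;
  scan(body, initialized, failing);
  return failing;
}
```

In this model a local reported by `nonDominatedLocals` is one that a fixup would have to treat as nullable; a set that happens only inside an inner block does not help a get that follows that block.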
* Adding Multi-Memories Wasm Feature (#4968)  Ashley Nelson  2022-08-25  1  -0/+1
| | | Adding multi-memories to the list of wasm-features.
* Fuzzer simplification: Remove trap-ignoring logic (#4958)  Alon Zakai  2022-08-24  2  -16/+3
| | | | | | | | | | | | | The "ignore trap" logic there is not close to enough for what we'd need to actually fuzz in a way that ignores traps, so this removes it. Atm that logic just allows a trap to happen without causing an error (that is, when comparing two results, one might trap and the other not, but they'd still be considered "equal"). But due to how we optimize traps in TrapsNeverHappens mode, the optimizer is free to assume the trap never occurs, which might remove side effects that are noticeable later. To actually handle that, we'd need to refactor the code to retain results per function (including the Loggings) and then to ignore everything from the very first trapping function. That is somewhat complicated to do, and a simpler thing is done in #4936, so we won't need it here.
* Separate `func` into a separate type hierarchy (#4955)  Thomas Lively  2022-08-22  2  -33/+12
| | | | | Just like `extern` is no longer a subtype of `any` in the new GC type system, `func` is no longer a subtype of `any`, either. Make that change in our type system implementation and update tests and fuzzers accordingly.
* Materialize non-null externrefs in the fuzzer (#4952)  Thomas Lively  2022-08-22  1  -2/+7
| | | | | Some fuzzer initial contents contain non-nullable externrefs that cause the fuzzer to try to materialize non-nullable externref values. Previously the fuzzer did not support this and crashed with an assertion failure. Fix the assertion failure by instead returning a null cast to non-null, which will trap at runtime but at least produce a valid module.
* Restore the `extern` heap type (#4898)  Thomas Lively  2022-08-17  3  -46/+96
| | | | | | | The GC proposal has split `any` and `extern` back into two separate types, so reintroduce `HeapType::ext` to represent `extern`. Before it was originally removed in #4633, externref was a subtype of anyref, but now it is not. Now that we have separate heaptype type hierarchies, make `HeapType::getLeastUpperBound` fallible as well.
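Why a least upper bound becomes fallible once there is more than one hierarchy can be shown with a three-type toy model. The `Heap` enum and both functions are hypothetical and do not reflect the real `HeapType::getLeastUpperBound` signature; `std::optional` is simply one way to express "no common supertype".

```cpp
#include <optional>

// Toy subset of heap types: `Eq` is a subtype of `Any`, while `Ext` sits
// in its own hierarchy. Names are hypothetical.
enum class Heap { Any, Eq, Ext };

inline bool inAnyHierarchy(Heap t) { return t != Heap::Ext; }

// Returns the least upper bound, or nullopt when the two types live in
// different hierarchies and therefore share no supertype.
inline std::optional<Heap> leastUpperBound(Heap a, Heap b) {
  if (a == b) {
    return a;
  }
  if (inAnyHierarchy(a) && inAnyHierarchy(b)) {
    // The only distinct pair left in this toy model is {Any, Eq}.
    return Heap::Any;
  }
  return std::nullopt;
}
```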
* Mutli-Memories Support in IR (#4811)  Ashley Nelson  2022-08-17  4  -123/+216
| | | | | This PR removes the single memory restriction in IR, adding support for a single module to reference multiple memories. To support this change, a new memory name field was added to 13 memory instructions in order to identify the memory for the instruction. It is a goal of this PR to maintain backwards compatibility with existing text and binary wasm modules, so memory indexes remain optional for memory instructions. Similarly, the JS API makes assumptions about which memory is intended when only one memory is present in the module. Another goal of this PR is that existing test behavior be unaffected. That said, tests must now explicitly define a memory before invoking memory instructions or exporting a memory, and memory names are now printed for each memory instruction in the text format. There remain quite a few places where hardcoded references to the first memory persist (memory flattening, for example, will return early if more than one memory is present in the module). Many of these call-sites, particularly within passes, will require us to rethink how the optimization works in a multi-memories world. Other call-sites may necessitate more invasive code restructuring to fully convert away from relying on a globally available, single memory pointer.
* Remove metadata generation from wasm-emscripten-finalize (#4863)  Sam Clegg  2022-08-07  1  -31/+3
| | | | This is no longer needed by emscripten as of: https://github.com/emscripten-core/emscripten/pull/16529
* Remove RTTs (#4848)  Thomas Lively  2022-08-05  6  -72/+14
| | | | | RTTs were removed from the GC spec and if they are added back in the future, they will be heap types rather than value types as in our implementation. Updating our implementation to have RTTs be heap types would have been more work than deleting them for questionable benefit since we don't know how long it will be before they are specced again.
* [NFC] wasm-reduce: Avoid wasted work on drops (#4850)  Alon Zakai  2022-07-29  1  -0/+7
| | | | | | It was wasted work to see a drop and then check if we can replace it with a drop of its child, which is identical to the original state. This didn't cause any harm (we'd not reduce code size, and stop eventually) but it did slow us down.
* wasm-reduce: Apply commandline features (#4833)  Alon Zakai  2022-07-26  1  -3/+11
| | | | | This lets wasm-reduce --enable-FOO work. Usually this is not needed as we do enable all features by default, but sometimes it is nice to disable features (e.g. to avoid reducing into a testcase that uses something the original wasm did not use).
* [wasm-split] Add --print-profile option (#4771)  sps-gold  2022-07-25  3  -19/+118
| There are several reasons why a function may not be trained in deterministically. To validate this quickly we need to inspect profile.data (the alternative requires performing the split). However, profile.data is a binary file and is not self-sufficient, so it cannot currently be used for such validation. To allow a quick check of whether a particular function has been trained in, we need to dump profile.data in a more readable format.
|
| This PR outputs to the console, in a readable format, the list of functions to be kept (in the main wasm) and the functions to be split out (moved to deferred.wasm).
|
| Added a new option `--print-profile`:
| - input path to orig.wasm (the original wasm file that will be used later during the split)
| - input path to the profile.data that we need to output
|
| Optionally pass `--unescape` to unescape the function names.
|
| Usage:
| ```
| binaryen\build>bin\wasm-split.exe test\profile_data\MY.orig.wasm --print-profile=test\profile_data\profile.data > test\profile_data\out.log
| ```
| Note: meaning of prefixes:
| `+` => function to be kept in the main wasm
| `-` => function to be split and moved to the deferred wasm
* Remove basic reference types (#4802)  Thomas Lively  2022-07-20  5  -112/+28
| | | | | | | | | Basic reference types like `Type::funcref`, `Type::anyref`, etc. made it easy to accidentally forget to handle reference types with the same basic HeapTypes but the opposite nullability. In principle there is nothing special about the types with shorthands except in the binary and text formats. Removing these shorthands from the internal type representation by removing all basic reference types makes some code more complicated locally, but simplifies code globally and encourages properly handling both nullable and non-nullable reference types.
* [Strings] Add feature flag for Strings proposal (#4766)  Alon Zakai  2022-06-30  1  -0/+1
|
* Fix more no-assertions warnings (#4765)  Alon Zakai  2022-06-30  2  -1/+3
|
* [Strings] Add string proposal types (#4755)  Alon Zakai  2022-06-29  3  -2/+21
| | | | | | | | This starts to implement the Wasm Strings proposal https://github.com/WebAssembly/stringref/blob/main/proposals/stringref/Overview.md This just adds the types.
* Disallow --nominal with GC (#4758)  Thomas Lively  2022-06-28  1  -0/+6
| | | | | | | | | | | Nominal types don't make much sense without GC, and in particular trying to emit them with typed function references but not GC enabled can result in invalid binaries because nominal types do not respect the type ordering constraints required by the typed function references proposal. Making this change was mostly straightforward, but required fixing the fuzzer to use --nominal only when GC is enabled and required exiting early from nominal-only optimizations when GC was not enabled. Fixes #4756.
* First class Data Segments (#4733)  Ashley Nelson  2022-06-21  4  -40/+36
| * Updating wasm.h/cpp for DataSegments
| * Updating wasm-binary.h/cpp for DataSegments
| * Removed link from Memory to DataSegments and updated module-utils, Metrics and wasm-traversal
| * checking isPassive when copying data segments to know whether to construct the data segment with an offset or not
| * Removing memory member var from DataSegment class as there is only one memory rn. Updated wasm-validator.cpp
| * Updated wasm-interpreter
| * First look at updating Passes
| * Updated wasm-s-parser
| * Updated files in src/ir
| * Updating tools files
| * Last pass on src files before building
| * added visitDataSegment
| * Fixing build errors
| * Data segments need a name
| * fixing var name
| * ran clang-format
| * Ensuring a name on DataSegment
| * Ensuring more datasegments have names
| * Adding explicit name support
| * Fix fuzzing name
| * Outputting data name in wasm binary only if explicit
| * Checking temp dataSegments vector to validateBinary because it's the one with the segments before we processNames
| * Pass on when data segment names are explicitly set
| * Ran auto_update_tests.py and check.py, success all around
| * Removed an errant semi-colon and corrected a counter. Everything still passes
| * Linting
| * Fixing processing memory names after parsed from binary
| * Updating the test from the last fix
| * Correcting error comment
| * Impl kripken@ comments
| * Impl tlively@ comments
| * Updated tests that remove data print when == 0
| * Ran clang format
| * Impl tlively@ comments
| * Ran clang-format
* Reducer: Support --hybrid (#4726)  Alon Zakai  2022-06-14  1  -0/+3
|
* [Parser] Begin parsing modules (#4716)  Thomas Lively  2022-06-10  1  -0/+6
| | | | | | | | | | | Implement the basic infrastructure for the full WAT parser with just enough detail to parse basic modules that contain only imported globals. Parsing functions correspond to elements of the grammar in the text specification and are templatized over context types that correspond to each phase of parsing. Errors are explicitly propagated via `Result<T>` and `MaybeResult<T>` types. Follow-on PRs will implement additional phases of parsing and parsing for new elements in the grammar.
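Explicit error propagation through a result type might look roughly like the sketch below. This is not the parser's actual `Result<T>`: the `Err` struct, the `getErr` accessor, and the two digit-parsing functions are assumptions for illustration, built on `std::variant`, and `MaybeResult<T>` is not modeled at all.

```cpp
#include <string>
#include <utility>
#include <variant>

// Hypothetical error payload.
struct Err {
  std::string msg;
};

// Hypothetical result type: holds either a value or an error.
template<typename T> struct Result {
  std::variant<T, Err> val;

  Result(T value) : val(std::move(value)) {}
  Result(Err err) : val(std::move(err)) {}

  // Returns a pointer to the error, or nullptr on success.
  const Err* getErr() const { return std::get_if<Err>(&val); }
  // Accesses the value; only valid when getErr() is null.
  const T& operator*() const { return std::get<T>(val); }
};

inline Result<int> parseDigit(char c) {
  if (c < '0' || c > '9') {
    return Err{std::string("expected digit, got '") + c + "'"};
  }
  return c - '0';
}

// Each step checks for an error and forwards it to the caller explicitly,
// rather than throwing.
inline Result<int> parseTwoDigits(const std::string& s) {
  if (s.size() != 2) {
    return Err{"expected two characters"};
  }
  auto hi = parseDigit(s[0]);
  if (auto* err = hi.getErr()) {
    return *err;
  }
  auto lo = parseDigit(s[1]);
  if (auto* err = lo.getErr()) {
    return *err;
  }
  return *hi * 10 + *lo;
}
```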
* Fuzzer: Add support for creating structs and arrays in makeConst (#4707)  Alon Zakai  2022-06-01  1  -8/+20
| | | | | | | | #4659 adds a testcase with an import of (ref $struct). This could cause an error in the fuzzer, since it wants to remove imports (because the various fuzzers cannot pass in custom imports - they want to just run the wasm). When it tries to remove that import it tries to create a constant for a struct reference, and fails. To fix that, add enough support to create structs and arrays at least in the simple case where all their fields are defaultable.
* Fuzzer: Refactor makeConst into separate functions [NFC] (#4709)  Alon Zakai  2022-06-01  2  -85/+118
| | | This just moves code around + adds assertions.
* Remove renameMainArgcArgv from wasm-emscripten-finalize (#4700)  Sam Clegg  2022-05-31  1  -6/+0
| | | | | | | | This part of finalize is currently not used and was added in preparation for https://reviews.llvm.org/D75277. However, the better solution to dealing with this alternative name for main is on the emscripten side. The main reason for this is that doing the rename here in binaryen would require finalize to always re-write the binary, which is expensive.
* Validator: Check features for ref.null's type (#4677)  Alon Zakai  2022-05-18  1  -0/+2
|
* [GC Fuzzing] Avoid non-nullable eqref without GC (#4675)  Alon Zakai  2022-05-18  1  -2/+22
| | | | | | With only reference types but not GC, we cannot easily create a constant for eqref for example. Only GC adds i31.new etc. To avoid assertions in the fuzzer, avoid randomly picking (ref eq) etc., that is, keep it nullable so that we can emit a (ref.null eq) if we need a constant value of that type.
* wasm-reduce: Fix order in shrinkByReduction call (#4673)  Alon Zakai  2022-05-17  1  -1/+4
| | | | | | The old code would short-circuit and not do anything after we managed any reduction in the loop here. That would end up doing entire iterations of the whole pipeline before removing another element segment, which could be slow.
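The kind of ordering problem described here can be reproduced in isolation. The sketch below is not the wasm-reduce code: `tryReduce` and both loop functions are hypothetical, and they only demonstrate how operand order in `||` decides whether later attempts still run after one has succeeded.

```cpp
#include <vector>

// Hypothetical reduction attempt: counts the call and reports success when
// the item is reducible.
inline bool tryReduce(bool canReduce, int& attempts) {
  ++attempts;
  return canReduce;
}

// Flag first: once `reduced` is true, `||` short-circuits and tryReduce is
// never called again during this loop.
inline int attemptsWithFlagFirst(const std::vector<bool>& items) {
  int attempts = 0;
  bool reduced = false;
  for (bool item : items) {
    reduced = reduced || tryReduce(item, attempts);
  }
  return attempts;
}

// Call first: every item is attempted, and the flag only accumulates.
inline int attemptsWithCallFirst(const std::vector<bool>& items) {
  int attempts = 0;
  bool reduced = false;
  for (bool item : items) {
    reduced = tryReduce(item, attempts) || reduced;
  }
  return attempts;
}
```

With three reducible items, the flag-first loop makes one attempt and the call-first loop makes three, which matches the slowdown the commit message describes: the remaining work is deferred to another full iteration of the pipeline.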
* [Fuzzer] Reduce trap probability in function ref fallback code (#4653)  Alon Zakai  2022-05-16  1  -10/+15
| | | | | | Also improve comments. As suggested in #4647
* [Fuzzer] Fix another reference types vs gc types issue (#4647)  Alon Zakai  2022-05-06  1  -36/+37
| | | | | | | | | | Diff without whitespace is smaller. We can't emit HeapType::data without GC. Fixing that by switching to func, another problem was uncovered: makeRefFuncConst had a TODO to handle the case where we need a function to refer to but have created none yet. In fact that TODO was done at the end of the function. Fix up the logic in between to actually get there.
* Fix fuzzer's choosing of reference types (#4642)  Alon Zakai  2022-05-05  1  -7/+18
| | | | | | * Don't emit "i31" or "data" if GC is not enabled, as only the GC feature adds those. * Don't emit "any" without GC either. While it is allowed, fuzzer limitations prevent this atm (see details in comment - it's fixable).