path: root/src/wasm
Commit message / Author / Age / Files / Lines
* [NFC] Improve debug printing for type canonicalization (#5465)Thomas Lively2023-01-301-2/+3
| | | | | | Use an `IndexedTypeNameGenerator` to give types stable names for the entire `dump` method rather than generating fresh type names every time a single type is printed. This makes it possible to understand the relationships between the types in the debug output.
* [Strings] Add experimental StringNew variants (#5459)Alon Zakai2023-01-265-35/+69
| | | | | | string.from_code_point makes a string from an int code point. string.new_utf8*_try makes a UTF-8 string and returns null on a UTF-8 encoding error rather than trapping.
* [Strings] Add string.compare (#5453)Alon Zakai2023-01-254-8/+24
| | | See WebAssembly/stringref#58
* Fix segment fault in API BinaryenModuleParse (#5440) (#5441)Changqing Jing2023-01-201-10/+12
| | | | | | We cannot modify the input string safely. To avoid that, copy where needed. Fixes #5440
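The copy-before-mutating idea can be sketched as follows. This is a minimal illustration, not the actual Binaryen change: `countTokensInPlace` and `countTokens` are hypothetical names, and `std::strtok` merely stands in for any parser that overwrites its input while scanning.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Stand-in for a parser that tokenizes in place by overwriting delimiters
// in the buffer it is given (std::strtok writes '\0' over each separator).
std::size_t countTokensInPlace(char* buffer) {
  std::size_t count = 0;
  for (char* tok = std::strtok(buffer, " "); tok != nullptr;
       tok = std::strtok(nullptr, " ")) {
    ++count;
  }
  return count;
}

// Hypothetical API entry point. The caller's text may live in read-only
// memory (for example a string literal), so mutate a private copy instead.
std::size_t countTokens(const char* text) {
  std::string copy(text);
  return countTokensInPlace(copy.data());
}
```

Calling `countTokens("(module (func))")` is safe even though the argument is a literal, because only the local `std::string` is written to.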
* Add a TypeNameGenerator that uses names from a Module (#5437)Thomas Lively2023-01-181-1/+5
| | | | | | | | | If the module does not have a name for a particular type, the new utility falls back to using a different user-configurable type name generator, just like the existing IndexedTypeNameGenerator does. Also change how heap types are printed by this printing machinery (which is currently only used for debugging) so that their names are printed in addition to their contents. This makes the printer much more useful for debugging.
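The fallback scheme described above might look roughly like this. It is a toy sketch under stated assumptions: types are plain integers, and `ToyIndexedNames` and `ToyModuleNames` are hypothetical stand-ins rather than Binaryen's real generator classes.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Toy indexed generator: the first time a type is seen it gets the next
// index, and every later request for the same type returns the same name.
struct ToyIndexedNames {
  std::unordered_map<int, std::string> names;

  const std::string& getName(int type) {
    auto [it, inserted] = names.try_emplace(type);
    if (inserted) {
      it->second = "type$" + std::to_string(names.size() - 1);
    }
    return it->second;
  }
};

// Toy module-aware generator: prefer the name recorded in the module and
// fall back to the indexed generator for unnamed types.
struct ToyModuleNames {
  std::unordered_map<int, std::string> moduleNames;
  ToyIndexedNames fallback;

  std::string getName(int type) {
    auto it = moduleNames.find(type);
    return it != moduleNames.end() ? it->second : fallback.getName(type);
  }
};
```

Because the fallback keeps its map for as long as the generator lives, printing the same unnamed type twice yields the same name, which is the property that makes debug output readable.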
* [Wasm GC] Support and fuzz function subtyping (#5420)Thomas Lively2023-01-121-9/+1
| | | | | | | | | | Support function subtyping with contravariant parameters and covariant results. The actual change is a single line in wasm-type.cpp, so most of the patch is updating the type fuzzer to generate interesting function subtypes. Since function parameters are contravariant, generating a function subtype requires generating supertypes of its parameter types, which required new functionality in the fuzzer. Also update the fuzzer to choose to reuse types at a finer grain, so for example individual function parameters or results might be reused unmodified while other parameters or results are still modified.
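The variance rule can be illustrated with a toy checker. Everything here is an assumption made for illustration: value types are integers with a single declared supertype, and `ToyTypes`, `ToySignature`, and `isSubSignature` are hypothetical names, not the code in wasm-type.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy value types: super[t] is the declared supertype of t, or -1 for none.
struct ToyTypes {
  std::vector<int> super;

  bool isSubType(int a, int b) const {
    for (int t = a; t != -1; t = super[t]) {
      if (t == b) {
        return true;
      }
    }
    return false;
  }
};

struct ToySignature {
  std::vector<int> params;
  std::vector<int> results;
};

// sub <: sup when parameters are contravariant and results are covariant.
bool isSubSignature(const ToyTypes& types,
                    const ToySignature& sub,
                    const ToySignature& sup) {
  if (sub.params.size() != sup.params.size() ||
      sub.results.size() != sup.results.size()) {
    return false;
  }
  for (std::size_t i = 0; i < sub.params.size(); ++i) {
    // Contravariant: the subtype must accept at least what the supertype does.
    if (!types.isSubType(sup.params[i], sub.params[i])) {
      return false;
    }
  }
  for (std::size_t i = 0; i < sub.results.size(); ++i) {
    // Covariant: the subtype may return something more specific.
    if (!types.isSubType(sub.results[i], sup.results[i])) {
      return false;
    }
  }
  return true;
}
```

This also shows why a fuzzer that wants to produce a function subtype needs supertypes of the parameter types: the parameter check runs in the opposite direction from the result check.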
* [Wasm GC] Handle an unreachable br_on_cast_fail in DCE (#5418)Alon Zakai2023-01-111-1/+4
| | | Without this we hit an assertion on unreachable not being a heap type.
* [Wasm GC] Replace `HeapType::data` with `HeapType::struct_` (#5416)Thomas Lively2023-01-106-78/+50
| | | | | | `struct` has replaced `data` in the upstream spec, so update Binaryen's types to match. We had already supported `struct` as an alias for data, but now remove support for `data` entirely. Also remove instructions like `ref.is_data` that are deprecated and do not make sense without a `data` type.
* Represent ref.as_{func,data,i31} with RefCast (#5413)Thomas Lively2023-01-105-42/+59
| | | | | | | | | | | | | These operations are deprecated and directly representable as casts, so remove their opcodes in the internal IR and parse them as casts instead. For now, add logic to the printing and binary writing of RefCast to continue emitting the legacy instructions to minimize test changes. The few test changes necessary are because it is no longer valid to perform a ref.as_func on values outside the func type hierarchy now that ref.as_func is subject to the ref.cast validation rules. RefAsExternInternalize, RefAsExternExternalize, and RefAsNonNull are left unmodified. A future PR may remove RefAsNonNull as well, since it is also expressible with casts.
* Replace `RefIs` with `RefIsNull` (#5401)Thomas Lively2023-01-096-54/+65
| | | | | | | | | | | | | | | * Replace `RefIs` with `RefIsNull` The other `ref.is*` instructions are deprecated and expressible in terms of `ref.test`. Update binary and text parsing to parse those instructions as `RefTest` expressions. Also update the printing and emitting of `RefTest` expressions to emit the legacy instructions for now to minimize test changes and make this a mostly non-functional change. Since `ref.is_null` is the only `RefIs` instruction left, remove the `RefIsOp` field and rename the expression class to `RefIsNull`. The few test changes are due to the fact that `ref.is*` instructions are now subject to `ref.test` validation, and in particular it is no longer valid to perform a `ref.is_func` on a value outside of the `func` type hierarchy.
* Consolidate br_on* operations (#5399)Thomas Lively2023-01-065-76/+94
| | | | | | | | | | | | | | | | | | The `br_on{_non}_{data,i31,func}` operations are deprecated and directly representable in terms of the new `br_on_cast` and `br_on_cast_fail` instructions, so remove their dedicated IR opcodes in favor of representing them as casts. `br_on_null` and `br_on_non_null` cannot be consolidated the same way because their behavior is not directly representable in terms of `br_on_cast` and `br_on_cast_fail`; when the cast to null bottom type succeeds, the null check instructions implicitly drop the null value whereas the cast instructions would propagate it. Add special logic to the binary writer and printer to continue emitting the deprecated instructions for now. This will allow us to update the test suite in a separate future PR with no additional functional changes. Some tests are updated because the validator no longer allows passing non-func data to `br_on_func`. Doing so has not made sense since we separated the three reference type hierarchies.
* Support br_on_cast null (#5397)Thomas Lively2023-01-055-24/+75
| | | | | | | | | As well as br_on_cast_fail null. Unlike the existing br_on_cast* instructions, these new instructions treat the cast as succeeding when the input is a null. Update the internal representation of the cast type in `BrOn` expressions to be a `Type` rather than a `HeapType` so it will include nullability information. Also update and improve `RemoveUnusedBrs` to handle the new instructions correctly and optimize in more cases.
* [Parser] Parse blocks (#5393)Thomas Lively2023-01-051-17/+231
| | | | | | | | | | | | | Parse both the folded and unfolded forms of blocks and structure the code to make supporting additional block instructions like if-else and try-catch relatively simple. Parsing block types is extra fun because they may implicitly define new signature heap types via a typeuse, but only if their types are not given by a single result type. To figure out whether a new type may be introduced in all the relevant parsing stages, always track at least the arity of parsed results. The parser parses block labels, but more work will be required to support branch instructions that use them.
* Allow non-nullable ref.cast of nullable references (#5386)Thomas Lively2023-01-044-22/+6
| | | | | | | This new cast configuration was not expressible with the legacy cast instructions. Although it is valid in Wasm, do not allow nullable casts of non-nullable references, since those would unnecessarily lose type information. Convert such casts to be non-nullable during expression finalization.
* [Parser] Parse array access instructions (#5375)Thomas Lively2023-01-031-5/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | * [NFC][Parser] Track definition indices For each definition in a module, record that definition's index in the relevant index space. Previously the index was inferred from its position in a list of module definitions, but that scheme does not scale to data segments defined inline inside memory definitions because these data segments occupy a slot in the data segment index space but do not have their own independent definitions. * clarify comment * [Parser] Parse data segments Parse active and passive data segments, including all their variations and abbreviations as well as data segments declared inline in memory declarations. Switch to parsing data strings, memory limits, and memory types during the ParseDecls phase so that the inline data segments can be completely parsed during that phase and never revisited. Parsing the inline data segments in a later phase would not work because they would be incorrectly inserted at the end of the data segment index space. Also update the printer to print a memory use on active data segments that are initialized in a non-default memory. * [Parser] Parse array creation and data segment instructions * [Parser] Parse array access instructions
* [Parser] Parse array creation and data segment instructions (#5374)Thomas Lively2023-01-031-8/+145
| | | | | | | | | | | | | | | | | | | | | | | | | | * [NFC][Parser] Track definition indices For each definition in a module, record that definition's index in the relevant index space. Previously the index was inferred from its position in a list of module definitions, but that scheme does not scale to data segments defined inline inside memory definitions because these data segments occupy a slot in the data segment index space but do not have their own independent definitions. * clarify comment * [Parser] Parse data segments Parse active and passive data segments, including all their variations and abbreviations as well as data segments declared inline in memory declarations. Switch to parsing data strings, memory limits, and memory types during the ParseDecls phase so that the inline data segments can be completely parsed during that phase and never revisited. Parsing the inline data segments in a later phase would not work because they would be incorrectly inserted at the end of the data segment index space. Also update the printer to print a memory use on active data segments that are initialized in a non-default memory. * [Parser] Parse array creation and data segment instructions
* [Parser] Parse data segments (#5373)Thomas Lively2023-01-031-35/+188
| | | | | | | | | | | | | | | | | | | | | | | | * [NFC][Parser] Track definition indices For each definition in a module, record that definition's index in the relevant index space. Previously the index was inferred from its position in a list of module definitions, but that scheme does not scale to data segments defined inline inside memory definitions because these data segments occupy a slot in the data segment index space but do not have their own independent definitions. * clarify comment * [Parser] Parse data segments Parse active and passive data segments, including all their variations and abbreviations as well as data segments declared inline in memory declarations. Switch to parsing data strings, memory limits, and memory types during the ParseDecls phase so that the inline data segments can be completely parsed during that phase and never revisited. Parsing the inline data segments in a later phase would not work because they would be incorrectly inserted at the end of the data segment index space. Also update the printer to print a memory use on active data segments that are initialized in a non-default memory.
* [NFC][Parser] Track definition indices (#5372)Thomas Lively2023-01-031-13/+17
| | | | | | | | | | | * [NFC][Parser] Track definition indices For each definition in a module, record that definition's index in the relevant index space. Previously the index was inferred from its position in a list of module definitions, but that scheme does not scale to data segments defined inline inside memory definitions because these data segments occupy a slot in the data segment index space but do not have their own independent definitions. * clarify comment
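A toy model of why indices are recorded per definition rather than inferred from list position. The types and the `recordIndices` function are hypothetical and greatly simplified; they only assume that a memory with inline data consumes one slot in the data segment index space without having a standalone data definition.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

enum class ToyDefKind { Memory, MemoryWithInlineData, Data };

struct ToyDef {
  ToyDefKind kind;
  std::optional<std::size_t> memoryIndex; // set for both memory kinds
  std::optional<std::size_t> dataIndex;   // set whenever a data slot is used
};

// Walk the definitions in source order and record each one's index in the
// relevant index space as it is encountered.
std::vector<ToyDef> recordIndices(const std::vector<ToyDefKind>& kinds) {
  std::vector<ToyDef> defs;
  std::size_t nextMemory = 0;
  std::size_t nextData = 0;
  for (ToyDefKind kind : kinds) {
    ToyDef def{kind, std::nullopt, std::nullopt};
    if (kind != ToyDefKind::Data) {
      def.memoryIndex = nextMemory++;
    }
    if (kind != ToyDefKind::Memory) {
      // Inline data segments occupy a data index too.
      def.dataIndex = nextData++;
    }
    defs.push_back(def);
  }
  return defs;
}
```

With the input `{MemoryWithInlineData, Data}`, the standalone data segment is the first entry in a list of data definitions, yet its recorded index is 1, which is what position-based inference would get wrong.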
* Support `ref.test null` (#5368)Thomas Lively2022-12-214-9/+22
| | | This new variant of ref.test returns 1 if the input is null.
* Update RefCast representation to drop extra HeapType (#5350)Thomas Lively2022-12-205-23/+44
| | | | | | | | | The latest upstream version of ref.cast is parameterized with a target reference type, not just a heap type, because the nullability of the result is parameterizable. As a first step toward implementing these new, more flexible ref.cast instructions, change the internal representation of ref.cast to use the expression type as the cast target rather than storing a separate heap type field. For now require that the encoded semantics match the previously allowed semantics, though, so that none of the optimization passes need to be updated.
* [Wasm GC] Do not cache signature types in nominal mode if they have a super (#5364)Alon Zakai2022-12-191-1/+5
| | | | | | | This reduces the number of public types, since if there is a super then using the type in a public place would make the super also public. It is safer for closed-world mode to reuse types without supers.
* Do not optimize public types (#5347)Thomas Lively2022-12-161-0/+39
| | | | | | | | | | | | | | | | | Do not optimize or modify public heap types in any way. Public heap types include the types of imported or exported functions, tables, globals, etc. This is important to maintain the public interface of a module and ensure it can still link and interact as intended with the outside world. Also add a validation error if we find any nontrivial public types that are not the types of imported or exported functions. This error is meant to help the user ensure that type optimizations are not silently inhibited. In the future, we may want to add options to silence this error or downgrade it to a warning. This commit only updates the type updating machinery to avoid updating public types. It does not update any optimization passes accordingly. Since we avoid modifying public signature types already, this is not expected to break anything, but in the future once we have function subtyping or if we make the error optional, we may have to update some of our optimization passes.
* Use non-nullable ref.cast for non-nullable input (#5335)Thomas Lively2022-12-093-5/+33
| | | | | | | | | | | | We switched from emitting the legacy `ref.cast_static` instruction to emitting `ref.cast null` in #5331, but that wasn't quite correct. The legacy instruction had polymorphic typing so that its output type was nullable if and only if its input type was nullable. In contrast, `ref.cast null` always has a nullable output type. Fix our output by instead emitting non-nullable `ref.cast` if the output should be non-nullable. Parse `ref.cast` in binary and text forms as well. Since the IR can only represent the legacy polymorphic semantics, disallow unsupported casts from nullable to non-nullable references or vice versa for now.
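The nullability rule can be written down as a small table in code. This is a toy model of the typing rules stated in the message, with hypothetical names (`ToyCast`, `resultIsNullable`, `chooseStandardCast`); it is not Binaryen's IR or emitter.

```cpp
#include <cassert>

// The three cast flavors discussed above (toy model only).
enum class ToyCast { LegacyStatic, RefCastNull, RefCast };

// Whether the result of the cast is a nullable reference.
bool resultIsNullable(ToyCast cast, bool inputNullable) {
  switch (cast) {
    case ToyCast::LegacyStatic:
      return inputNullable; // polymorphic: follows the input
    case ToyCast::RefCastNull:
      return true; // always nullable
    case ToyCast::RefCast:
      return false; // never nullable
  }
  return false; // not reached for valid enum values
}

// Pick the standard instruction whose result type matches what the legacy
// polymorphic instruction would have produced for this input.
ToyCast chooseStandardCast(bool inputNullable) {
  ToyCast chosen = inputNullable ? ToyCast::RefCastNull : ToyCast::RefCast;
  assert(resultIsNullable(chosen, inputNullable) ==
         resultIsNullable(ToyCast::LegacyStatic, inputNullable));
  return chosen;
}
```

Always choosing `RefCastNull` would fail the internal assertion for a non-nullable input, which mirrors the mismatch this commit corrects.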
* Allow casting to basic heap types (#5332)Thomas Lively2022-12-083-34/+44
| | | | | | | The standard casting instructions now allow casting to basic heap types, not just user-defined types, but they also require that the intended type and argument type have a common supertype. Update the validator to use the standard rules, update the binary parser and printer to allow basic types, and update the tests to remove or modify newly invalid test cases.
* Validate ref.as_* argument is a reference (#5330)Alon Zakai2022-12-081-1/+5
| | | | | | | | | | | | | | Without this we hit an assert with no line number info (or in a no-asserts build, bad things can happen). With this: $ bin/wasm-opt -all ~/Downloads/crash.wat --nominal [parse exception: Invalid ref for ref.as (at 155065:119)] Fatal: error parsing wasm (That can only happen for ref.as_non_null, as all the others do not have that assert - their types do not depend on the child's type, so their finalize does not error. Still, it is nice to validate earlier for them as well, so this PR handles them all.)
* Add standard versions of WasmGC casts (#5331)Thomas Lively2022-12-074-63/+42
| | | | | | | We previously supported only the non-standard cast instructions introduced when we were experimenting with nominal types. Parse the names and opcodes of their standard counterparts and switch to emitting the standard names and opcodes. Port all of the tests to use the standard instructions, but add additional tests showing that the non-standard versions are still parsed correctly.
* [Parser][NFC] Add `Idx` to type aliases representing indices (#5326)Thomas Lively2022-12-061-38/+42
| | | | | | | Previously we had types like `LocalT` and `MemoryT` to represent references to locals and memories, but when we added field indices in #5255, we had to use `FieldIdxT` instead of `FieldT` because `FieldT` was already in use as the type representing a field itself. Update `LocalT`, `MemoryT` and `GlobalT` to have `Idx` in their names to be consistent with `FieldIdxT`.
* [NFC] Do not read past the end of a string_view (#5317)Thomas Lively2022-12-021-5/+5
| | | | | | | | wasm-s-parser.cpp was detecting the end of type strings by looking for null characters, but those null characters would be past the end of the relevant string_view. Bring that code in line with similar code by checking the length of the string_view instead. Fixes an assertion failure in MSVC debug mode. Fixes #5312.
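A minimal sketch of the safe pattern, assuming a hypothetical scanner named `tokenLength`: the loop is bounded by `size()` instead of probing for a terminating null character, because a `std::string_view` need not be null-terminated.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Returns the number of leading characters before the first ')' or before
// the end of the view, whichever comes first.
std::size_t tokenLength(std::string_view str) {
  std::size_t i = 0;
  // Check the bound first: str[str.size()] is outside the view and is not
  // guaranteed to be '\0'.
  while (i < str.size() && str[i] != ')') {
    ++i;
  }
  return i;
}
```

For a view such as `std::string_view("i32i64", 3)`, the character just past the end is `'i'`, not a null character, so a null-probing loop would keep reading beyond the view.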
* [Parser] Avoid calling `strtod` on NaNs entirely (#5316)Thomas Lively2022-12-021-5/+6
| | | | | | | | MSVC's implementation of `strtod` doesn't return a negative NaN for "-nan", so we already had a workaround to explicitly handle that case without calling `strtod`. Unfortunately the workaround was not used for negative NaNs with payloads, so there were still bugs. Fix the problem and make the code even more portable by avoiding `strtod` completely for any kind of NaN, positive or negative, with or without payload.
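One way to handle NaN literals without `strtod` is to assemble the IEEE 754 bit pattern directly. The sketch below is an assumption-laden illustration, not the parser's actual code: `parseNaN` is a hypothetical helper, it accepts only `nan`, `+nan`, `-nan`, and `nan:0x<hex>` forms, and it ignores the `_` digit separators that the text format allows.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string_view>

// Returns a double for a NaN literal, or nullopt if the text is not one.
std::optional<double> parseNaN(std::string_view text) {
  bool negative = false;
  if (!text.empty() && (text[0] == '+' || text[0] == '-')) {
    negative = text[0] == '-';
    text.remove_prefix(1);
  }
  if (text.substr(0, 3) != "nan") {
    return std::nullopt;
  }
  text.remove_prefix(3);
  std::uint64_t payload = 1ull << 51; // default: canonical quiet NaN
  if (!text.empty()) {
    if (text.size() <= 3 || text.substr(0, 3) != ":0x") {
      return std::nullopt;
    }
    text.remove_prefix(3);
    payload = 0;
    for (char c : text) {
      int digit;
      if (c >= '0' && c <= '9') {
        digit = c - '0';
      } else if (c >= 'a' && c <= 'f') {
        digit = c - 'a' + 10;
      } else if (c >= 'A' && c <= 'F') {
        digit = c - 'A' + 10;
      } else {
        return std::nullopt;
      }
      if (payload >> 48) {
        return std::nullopt; // another digit would overflow the 52-bit field
      }
      payload = (payload << 4) | static_cast<std::uint64_t>(digit);
    }
    if (payload == 0) {
      return std::nullopt; // a zero payload would encode infinity, not NaN
    }
  }
  // Sign bit, all-ones exponent, then the payload in the significand.
  std::uint64_t bits =
    (negative ? 1ull << 63 : 0ull) | (0x7ffull << 52) | payload;
  double result;
  std::memcpy(&result, &bits, sizeof(result));
  return result;
}
```

Since the sign and payload never pass through the C library, the result does not depend on how a particular `strtod` treats `"-nan"`.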
* Remove more uses of NAN (#5310)Thomas Lively2022-12-021-0/+2
| | | | | In favor of the more portable code snippet using `std::copysign`. Also reintroduce assertions that the NaNs have the expected signs. This continues work started in #5302.
* Use C++17's [[maybe_unused]]. NFC (#5309)Sam Clegg2022-12-021-2/+1
|
* [Parser] Do not assume `NAN` is positive (#5302)Thomas Lively2022-11-291-3/+4
| | | | | | | It turns out that this assumption does not necessarily hold on Windows with Visual Studio 2019. Instead of using `NAN` and `-NAN`, explicitly construct positive and negative NaN values with `std::copysign`, which should be portable. Fixes #5291.
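The `std::copysign` approach can be sketched in a few lines. The helper names `positiveNaN` and `negativeNaN` are hypothetical; the assertions correspond to the sign checks that the later follow-up (#5310) mentions reintroducing.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Build NaNs with an explicit sign instead of relying on the sign that the
// NAN macro happens to have on a given toolchain.
inline double positiveNaN() {
  double value = std::copysign(std::numeric_limits<double>::quiet_NaN(), 1.0);
  assert(std::isnan(value) && !std::signbit(value));
  return value;
}

inline double negativeNaN() {
  double value = std::copysign(std::numeric_limits<double>::quiet_NaN(), -1.0);
  assert(std::isnan(value) && std::signbit(value));
  return value;
}
```

`std::copysign` takes the magnitude from its first argument and the sign from its second, so the result no longer depends on whether `NAN` or `-NAN` carries the sign bit.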
* Fix validation and inlining bugs (#5301)Thomas Lively2022-11-291-2/+5
| | | | | | | | | | | | | | Inlining had a bug where it gave return_calls in inlined callees concrete types even when they should have remained unreachable. This bug flew under the radar because validation had a bug where it allowed expressions to have concrete types when they should have been unreachable. The fuzzer found this bug by adding another pass after inlining where the unexpected types caused an assertion failure. Fix the bugs and add a test that would have triggered the inlining bug. Unfortunately the test would have also passed before this change due to the validation bug, but it's better than nothing. Fixes #5294.
* Validator: Print the field number on subtyping errors (#5297)Alon Zakai2022-11-291-4/+6
|
* Remove equirecursive typing (#5240)Thomas Lively2022-11-232-1062/+19
| | | | Equirecursive is no longer standards track and its implementation is extremely complex. Remove it.
* Change the default type system to isorecursive (#5239)Thomas Lively2022-11-231-1/+1
| | | | | | | | | | This makes Binaryen's default type system match the WasmGC spec. Update the way type definitions without supertypes are printed to reduce the output diff for MVP tests that do not involve WasmGC. Also port some type-builder.cpp tests from test/example to test/gtest since they needed to be rewritten to work with isorecursive types anyway. A follow-on PR will remove equirecursive types completely.
* Rename UserSection -> CustomSection. NFC (#5288)Sam Clegg2022-11-223-96/+96
| | | This reflects the naming used in the spec.
* [NFC] Expand comment about validating function type features (#5286)Thomas Lively2022-11-221-1/+3
| | | This addresses feedback missed in #5279.
* Validate that GC is enabled for rec groups and supertypes (#5279)Thomas Lively2022-11-222-8/+13
| | | | | | | | | Update `HeapType::getFeatures` to report that GC is used for heap types that have nontrivial recursion groups or supertypes. Update validation to check the features on function heap types, not just their individual params and results. This fixes a fuzz bug in #5239 where initial contents included a rec group but the fuzzer disabled GC. Since the resulting module passed validation, the rec groups made it into the binary output, making the type section malformed.
* Fix isorecursive canonicalization (#5269)Thomas Lively2022-11-171-5/+4
| | | | | | | | | | | | | | Fixes a longstanding problem with isorecursive canonicalization that only showed up in MacOS and occasionally Windows builds. The problem was that `RecGroupEquator` was not quite correct in the presence of self-references in rec groups. Specifically, `RecGroupEquator` did not differentiate between instances of the same type appearing across two rec groups where the type was a self-reference in one group but not in the other. The reason this only showed up occasionally on some platforms was that this bug could only cause incorrect behavior if two groups that would incorrectly be compared as equal were hashed into the same bucket of a hash map. Apparently the hash map used on Linux never hashes the two problematic groups into the same bucket.
* Revert "Revert "Make `call_ref` type annotations mandatory (#5246)" (#5265)" (#5266)Thomas Lively2022-11-162-50/+16
| | | | | This reverts commit 570007dbecf86db5ddba8d303896d841fc2b2d27.
* Revert "Make `call_ref` type annotations mandatory (#5246)" (#5265)Thomas Lively2022-11-162-16/+50
| | | | | This reverts commit b2054b72b7daa89b7ad161c0693befad06a20c90. It looks like the necessary V8 change has not rolled out everywhere yet.
* Fix an unused var warning in some compilers (#5260)Alon Zakai2022-11-151-2/+1
|
* Switch from `typedef` to `using` in C++ code. NFC (#5258)Sam Clegg2022-11-151-1/+1
| | | | This is more modern and (IMHO) easier to read than that old C typedef syntax.
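For comparison, here are the two spellings side by side. The alias names are made up for illustration and do not come from the Binaryen sources.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>
#include <vector>

// Old style: the new name is buried after (or inside) the type.
typedef std::map<std::string, std::vector<int>> OldIndexMap;
typedef int (*OldCallback)(int);

// New style: the name comes first and the type reads left to right.
using NewIndexMap = std::map<std::string, std::vector<int>>;
using NewCallback = int (*)(int);

// Alias templates are only expressible with `using`.
template<typename T> using NameMap = std::map<std::string, T>;

// Both spellings name exactly the same types.
static_assert(std::is_same_v<OldIndexMap, NewIndexMap>);
static_assert(std::is_same_v<OldCallback, NewCallback>);
```

The function pointer case shows the readability difference most clearly, and the alias template is the one thing `typedef` cannot do at all.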
* [Parser] Parse struct allocation and accessor instructions (#5255)Thomas Lively2022-11-151-7/+128
| | | | | Including support for parsing field indices. Although only numeric field indices are supported at the moment, set up the code to make it straightforward to implement type-dependent symbolic field names in the future.
* Make `call_ref` type annotations mandatory (#5246)Thomas Lively2022-11-152-50/+16
| | | | They were optional for a while to allow users to gracefully transition to using them, but now make them mandatory to match the upstream WasmGC spec.
* [Parser] Parse `ref.is*`, `ref.eq`, `i31.new`, and `i31.get*` (#5247)Thomas Lively2022-11-141-4/+36
|
* Implement `array.new_data` and `array.new_elem` (#5214)Thomas Lively2022-11-077-10/+184
| | | | | | | | | In order to test them, fix the binary and text parsers to accept passive data segments even if a module has no memory. In addition to parsing and emitting the new instructions, also implement their validation and interpretation. Test the interpretation directly with wasm-shell tests adapted from the upstream spec tests. Running the upstream spec tests directly would require fixing too many bugs in the legacy text parser, so it will have to wait for the new text parser to be ready.
* Fix binary parsing of data segment memory (#5208)Thomas Lively2022-11-032-5/+6
| | | | | | | | | | | | The binary parser was eagerly getting the name of memories to set the `memory` field of data segments, but that meant that when the memory names were updated later while parsing the names section, the data segment memory fields would become out of date. Fix the issue by deferring setting the `memory` fields like we do for other parts of IR that reference memories. Also fix a segfault in the validator that was triggered by the reproducer for this bug before the bug was fixed. Fixes #5204.
* [NFC] Mention relevant flags in validator errors (#5203)Alon Zakai2022-11-011-93/+116
| | | | | | | | | | E.g. Atomic operation (atomics are disabled) => Atomic operations require threads [--enable-threads]