summaryrefslogtreecommitdiff
path: root/src/wasm/wasm-binary.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [typed-cont] Add feature flag (#5996)Frank Emrich2023-10-051-0/+5
| | | | | | | This PR is part of a series that adds basic support for the [typed continuations proposal](https://github.com/wasmfx/specfx). This particular PR simply extends `FeatureSet` with a corresponding entry for this proposal.
* Error on multivalue inputs that we do not handle (#5962)Alon Zakai2023-09-201-2/+6
| | | | | | Before in getType() we silently dropped the params of a signature type. Now we verify that it is none, or we error. Helps #5950
* Implement table.fill (#5949)Thomas Lively2023-09-181-0/+20
| | | | | | | | This instruction was standardized as part of the bulk memory proposal, but we never implemented it until now. Leave similar instructions like table.copy as future work. Fixes #5939.
* Replace I31New with RefI31 everywhere (#5930)Thomas Lively2023-09-131-4/+4
| | | | | | | | Globally replace the source string "I31New" with "RefI31" in preparation for renaming the instruction from "i31.new" to "ref.i31", as implemented in the spec in https://github.com/WebAssembly/gc/pull/422. This would be NFC, except that it also changes the string in the external-facing C APIs. A follow-up PR will make the corresponding behavioral change.
* Make final types the default (#5918)Thomas Lively2023-09-091-9/+6
| | | | | | | | | | | | | Match the spec and parse the shorthand binary and text formats as final and emit final types without supertypes using the shorthands as well. This is a potentially-breaking change, since the text and binary shorthands can no longer be used to define types that have subtypes. Also make TypeBuilder entries final by default to better match the spec and update the internal APIs to use the "open" terminology rather than "final" terminology. Future changes will update the text format to use the standard "sub open" rather than the current "sub final" keywords. The exception is the new wat parser, which supporst "sub open" as of this change, since it didn't support final types at all previously.
* [NFC] Source maps: Simplify the code and add comments (#5912)Alon Zakai2023-09-061-21/+16
| | | | | | | | | | | | | | | Almost entirely trivial, except for this part: - if (nextDebugLocation.availablePos == 0 && - nextDebugLocation.previousPos <= pos) { + if (nextDebugLocation.availablePos == 0) { return; I believe removing the extra check has no effect. Removing it does not change anything in the test suite, and logically, if we set availablePos to 0 then we definitely want to return here - we set it to 0 to indicate there is nothing left to read, which is what the code after it does. As a result, we can remove the previousPos field entirely.
* Remove the GCNNLocals feature (#5080)Thomas Lively2023-08-311-3/+1
| | | | | Now that the WasmGC spec has settled on a way of validating non-nullable locals, we no longer need this experimental feature that allowed nonstandard uses of non-nullable locals.
* Parse non-nullable tuple elements without special handling (#5910)Thomas Lively2023-08-301-23/+5
| | | | | | | In the binary parser, when creating a scratch local to hold multivalue results as tuples, we previously ensured that the scratch local did not contain any non-nullable by modifying its type and inserting ref.as_non_null as necessary. Now that we properly support non-nullable elements in tuple locals, however, this parser behavior is no longer necessary. Remove it.
* Source maps: Fix locations without debug info between ones that do (#5906)Alon Zakai2023-08-301-10/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously we did nothing for instructions without debug info. So if we had one that did, followed by one that didn't, the one that didn't could get "smeared" with the debug info of the first. Source map locations are the start of segments, apparently, and so if we say a location has info then all others after it will as well, until the next segment. To fix that, add support for source map entries with just one field, the binary location. Such entries have no debug info (no file:line:col), and though the source maps spec is not very clear on this, this seems like the right way to prevent this problem: to stop a segment with debug info by starting a new one without, when we know we don't want that info any more. That is, before this PR we could have this: ;; file.cpp:10:1 (nop) ;; binary offset 5 in wasm (nop) ;; binary offset 6 in wasm and the second nop would get the same debug annotation, since we just have one source map segment, [5, file.cpp, 10, 1] // start at offset 5 in wasm With this PR, we emit: [5, file.cpp, 10, 1] // start at offset 5 in wasm; file.cpp:10:1 [6] // start at offset 6 in wasm; no debug info This does add 7% to source map sizes, however, since we add those 1-length segments now, but that seems unavoidable to fix this bug. To implement this, add a new field that says if the next location in the source map has debug info or not, and use that.
* Rename multimemory flag (#5890)Ashley Nelson2023-08-211-4/+4
| | | Renaming the multimemory flag in Binaryen to match its naming in LLVM.
* Fix finalization of call_ref to handle refined target types (#5883)Thomas Lively2023-08-211-1/+4
| | | | | | | | | | Previously CallRef::finalize() would never update the type of the CallRef, even if the type of the call target had been refined to give a more precise result type. Besides unnecessarily losing type information, this could also lead to validation errors, since the validator checks that the type of CallRef matches the result type of the target signature. Fix the bug by updating CallRef's type based on its target signature in CallRef::finalize() and add a test that depends on this refinalization.
* Ensure br_on_cast* target type is subtype of input type (#5881)Thomas Lively2023-08-171-0/+3
| | | | | | | | | | | | | | | | The WasmGC spec will require that the target cast type of br_on_cast and br_on_cast_fail be a subtype of the input type, but so far Binaryen has not enforced this constraint, so it could produce invalid modules when optimizations refined the input to a br_on_cast* such that it was no longer a supertype of the cast target type. Fix this problem by setting the cast target type to be the greatest lower bound of the original cast target type and the current input type in `BrOn::finalize()`. This maintains the invariant that the cast target type should be a subtype of the input type and it also does not change cast behavior; any value that could make the original cast succeed at runtime necessarily inhabits both the original cast target type and the input type, so it also must inhabit their greatest lower bound and will make the updated cast succeed as well.
* Remove legacy WasmGC instructions (#5861)Thomas Lively2023-08-091-125/+22
| | | | | Remove old, experimental instructions and type encodings that will not be shipped as part of WasmGC. Updating the encodings and text format to match the final spec is left as future work.
* Fix binary writing of strings without GC enabled (#5836)Alon Zakai2023-07-311-4/+10
|
* Initial support for `final` types (#5803)Thomas Lively2023-07-061-5/+34
| | | | | | | | | | | | | | | | | | | | Implement support in the type system for final types, which are not allowed to have any subtypes. Final types are syntactically different from similar non-final types, so type canonicalization is made aware of finality. Similarly, TypeMerging and TypeSSA are updated to work correctly in the presence of final types as well. Implement binary and text parsing and emitting of final types. Use the standard text format to represent final types and interpret the non-standard "struct_subtype" and friends as non-final. This allows a graceful upgrade path for users currently using the non-standard text format, where they can update their code to use final types correctly at the point when they update to use the standard format. Once users have migrated to using the fully expanded standard text format, we can update update Binaryen's parsers to interpret the MVP shorthands as final types to match the spec without breaking those users. To make it safe for V8 to independently start interpreting types declared without `sub` as final, also reserve that shorthand encoding only for types that have no strict subtypes.
* [NFC] Fix the use of "strict" in subtypes.h (#5804)Thomas Lively2023-07-061-2/+1
| | | | | | | | Previously we incorrectly used "strict" to mean the immediate subtypes of a type, when in fact a strict subtype of a type is any subtype excluding the type itself. Rename the incorrect `getStrictSubTypes` to `getImmediateSubTypes`, rename the redundant `getAllStrictSubTypes` to `getStrictSubTypes`, and rename the redundant `getAllSubTypes` to `getSubTypes`. Fixing the capitalization of "SubType" to "Subtype" is left as future work.
* Rename WasmBinaryBuilder to WasmBinaryReader (NFC) (#5767)Heejin Ahn2023-06-131-203/+195
| | | | | | We have `WasmBinaryBuilder` that read binary into Binaryen IR and `WasmBinaryWriter` that writes Binaryen IR to binary. To me `WasmBinaryBuilder` sounds similar to `WasmBinaryWriter`, which builds binary. How about renaming it to `WasmBinaryReader`?
* Update br_on_cast binary and text format (#5762)Thomas Lively2023-06-121-14/+33
| | | | | | | | | | | | The final versions of the br_on_cast and br_on_cast_fail instructions have two reference type annotations: one for the input type and one for the cast target type. In the binary format, this is represented as a flags byte followed by two encoded heap types. Upgrade all of the tests at once to use the new versions of the instructions and drop support for the old instructions from the text parser. Keep support in the binary parser to avoid breaking users, though. Drop some binary tests of deprecated instruction encodings that would be more effort to update than they're worth. Re-land with fixes of #5734
* [Strings] Fix non-nullable string emitting in the binary format (#5756)Alon Zakai2023-06-071-3/+9
| | | Related to #5737 which did something similar for other types.
* Fix emitting of function reference types without GC (#5737)Thomas Lively2023-06-051-14/+17
| | | | | | | | | We previously had logic to emit GC types used in the IR as their corresponding top types when GC was not enabled (so e.g. nullfuncref would be emitted as funcref), but the logic was not robust enough and non-null function references were not properly emitted as funcref. Refactor the relevant code to be more robust and future-proof, and add a test demonstrating that the lowering works as intended.
* Revert "Update br_on_cast binary and text format (#5734)" (#5740)Alon Zakai2023-05-231-16/+14
| | | | | | | This reverts commit b7b1d0df29df14634d2c680d1d2c351b624b4fbb. See comment at the end of #5734: It turns out that dropping the old opcodes causes problems for current users, so let's revert this for now, and later we can figure out how best to do the update.
* Update br_on_cast binary and text format (#5734)Thomas Lively2023-05-191-14/+16
| | | | | | | | | | The final versions of the br_on_cast and br_on_cast_fail instructions have two reference type annotations: one for the input type and one for the cast target type. In the binary format, this is represented as a flags byte followed by two encoded heap types. Since these instructions have been in flux for a while, do not attempt to maintain backward compatibility with older versions of the instructions. Instead, upgrade all of the tests at once to use the new versions of the instructions. Drop some binary tests of deprecated instruction encodings that would be more effort to update than they're worth.
* [Strings] Adopt new instruction binary encoding (#5714)Jérôme Vouillon2023-05-121-70/+59
| | | | | | | | | | | See WebAssembly/stringref#46. This format is already adopted by V8: https://chromium-review.googlesource.com/c/v8/v8/+/3892695. The text format is left unchanged (see #5607 for a discussion on the subject). I have also added support for string.encode_lossy_utf8 and string.encode_lossy_utf8 array (by allowing the replace policy for Binaryen's string.encode_wtf8 instruction).
* [NFC] Refactor each of ArrayNewSeg and ArrayInit into subclasses for ↵Alon Zakai2023-05-041-20/+27
| | | | | | | | | | | Data/Elem (#5692) ArrayNewSeg => ArrayNewSegData, ArrayNewSegElem ArrayInit => ArrayInitData, ArrayInitElem Basically we remove the opcode and use the class type to differentiate them. This adds some code but it makes the representation simpler and more compact in memory, and it will help with #5690
* Emit memory segment index for data segments (#5699)Alon Zakai2023-05-031-0/+9
| | | Before this fix we would flip all data segments to use the first memory.
* Fix name deduplication with partial names sections (#5689)Alon Zakai2023-04-281-0/+43
| | | | | | | | | | We already deduplicated names in the names section (to defend against a weird binary), but we also need to deduplicate the names of items not in the names section, so they don't overlap with the names that are. See example in the testcase. Normally wasm files use names for all items in each group. This only became noticeable in some wasm-ctor-eval work where new temp globals were added that were not given names.
* Remove the nominal type system (#5672)Thomas Lively2023-04-171-25/+7
| | | | | And since the only type system left is the standard isorecursive type system, remove `TypeSystem` and its associated APIs entirely. Delete a few tests that only made sense under the isorecursive type system.
* Implement array.fill, array.init_data, and array.init_elem (#5637)Thomas Lively2023-04-061-0/+50
| | | | | These complement array.copy, which we already supported, as an initial complete set of bulk array operations. Replace the WIP spec tests with the upstream spec tests, lightly edited for compatibility with Binaryen.
* Use Names instead of indices to identify segments (#5618)Thomas Lively2023-04-041-5/+50
| | | | | | | | | | All top-level Module elements are identified and referred to by Name, but for historical reasons element and data segments were referred to by index instead. Fix this inconsistency by using Names to refer to segments from expressions that use them. Also parse and print segment names like we do for other elements. The C API is partially converted to use names instead of indices, but there are still many functions that refer to data segments by index. Finishing the conversion can be done in the future once it becomes necessary.
* [NFC] Remove our bespoke `make_unique` implementation (#5613)Thomas Lively2023-03-311-3/+3
| | | | This code predates our adoption of C++14 and can now be removed in favor of `std::make_unique`, which should be more efficient.
* Ensure a deterministic order in the type names section (#5590)Alon Zakai2023-03-201-1/+1
| | | | | | | | | Before this PR we iterated over an unordered set. Replace that with an iteration on a vector. (Also, the value in the set was not even used, so this should even be faster.) Add random names in the fuzzer to types, the lack of which is I believe the reason this was not detected before.
* [Exceptions] Fix error on bad delegate index (#5587)Alon Zakai2023-03-171-1/+6
| | | | Fixes #5584
* Add bulk-array.wast spec test outline (#5568)Thomas Lively2023-03-161-3/+0
| | | | | | | | | Add spec/bulk-array.wast, which contains an outline of the tests that will be necessary for the upcoming bulk array instructions: array.copy (already implemented), array.fill, array.init_data, and array.init_elem. Although the test file does not actually contain any tests yet, it contains some setup code defining types, globals, and element segments that the tests will use. Fix miscellaneous bugs in parsing, validation, and printing to allow this setup code to run without issues.
* [Wasm GC] Remove RefIsFunc and RefIsI31 from the binary format (#5574)Alon Zakai2023-03-151-22/+0
| | | | We still support ref.is_func/i31 in the text format for now. After we verify that no users depend on that we can remove it as well.
* Parse and print `array.new_fixed` (#5527)Thomas Lively2023-02-281-1/+1
| | | | | | | | | This is a (more) standard name for `array.init_static`. (The full upstream name in the spec repo is `array.new_canon_fixed`, but I'm still hoping we can drop `canon` from all the instruction names and it doesn't appear elsewhere in Binaryen). Update all the existing tests to use the new name and add a test specifically to ensure the old name continues parsing.
* [NFC] Internally rename `ArrayInit` to `ArrayNewFixed` (#5526)Thomas Lively2023-02-281-4/+5
| | | | | | | | To match the standard instruction name, rename the expression class without changing any parsing or printing behavior. A follow-on PR will take care of the functional side of this change while keeping support for parsing the old name. This change will allow `ArrayInit` to be used as the expression class for the upcoming `array.init_data` and `array.init_elem` instructions.
* Fix sourcemap nesting in reading and writing (#5504)JesseCodeBones2023-02-241-12/+24
| | | | The stack logic was incorrect, and led to source locations being emitted on parents instead of children.
* [Strings] Add experimental string.hash instruction (#5480)Alon Zakai2023-02-031-0/+2
| | | See WebAssembly/stringref#60
* Fix issues with ref.cast_nop (#5473)Alon Zakai2023-02-031-2/+1
| | | | It did not have proper annotation for the safety field, and also it could not handle basic heap types.
* [Strings] Add experimental StringNew variants (#5459)Alon Zakai2023-01-261-4/+16
| | | | | | string.from_code_point makes a string from an int code point. string.new_utf8*_try makes a utf8 string and returns null on a UTF8 encoding error rather than trap.
* [Strings] Add string.compare (#5453)Alon Zakai2023-01-251-2/+7
| | | See WebAssembly/stringref#58
* [Wasm GC] Replace `HeapType::data` with `HeapType::struct_` (#5416)Thomas Lively2023-01-101-25/+15
| | | | | | `struct` has replaced `data` in the upstream spec, so update Binaryen's types to match. We had already supported `struct` as an alias for data, but now remove support for `data` entirely. Also remove instructions like `ref.is_data` that are deprecated and do not make sense without a `data` type.
* Represent ref.as_{func,data,i31} with RefCast (#5413)Thomas Lively2023-01-101-11/+26
| | | | | | | | | | | | | These operations are deprecated and directly representable as casts, so remove their opcodes in the internal IR and parse them as casts instead. For now, add logic to the printing and binary writing of RefCast to continue emitting the legacy instructions to minimize test changes. The few test changes necessary are because it is no longer valid to perform a ref.as_func on values outside the func type hierarchy now that ref.as_func is subject to the ref.cast validation rules. RefAsExternInternalize, RefAsExternExternalize, and RefAsNonNull are left unmodified. A future PR may remove RefAsNonNull as well, since it is also expressible with casts.
* Replace `RefIs` with `RefIsNull` (#5401)Thomas Lively2023-01-091-10/+14
| | | | | | | | | | | | | | | * Replace `RefIs` with `RefIsNull` The other `ref.is*` instructions are deprecated and expressible in terms of `ref.test`. Update binary and text parsing to parse those instructions as `RefTest` expressions. Also update the printing and emitting of `RefTest` expressions to emit the legacy instructions for now to minimize test changes and make this a mostly non-functional change. Since `ref.is_null` is the only `RefIs` instruction left, remove the `RefIsOp` field and rename the expression class to `RefIsNull`. The few test changes are due to the fact that `ref.is*` instructions are now subject to `ref.test` validation, and in particular it is no longer valid to perform a `ref.is_func` on a value outside of the `func` type hierarchy.
* Consolidate br_on* operations (#5399)Thomas Lively2023-01-061-16/+19
| | | | | | | | | | | | | | | | | | The `br_on{_non}_{data,i31,func}` operations are deprecated and directly representable in terms of the new `br_on_cast` and `br_on_cast_fail` instructions, so remove their dedicated IR opcodes in favor of representing them as casts. `br_on_null` and `br_on_non_null` cannot be consolidated the same way because their behavior is not directly representable in terms of `br_on_cast` and `br_on_cast_fail`; when the cast to null bottom type succeeds, the null check instructions implicitly drop the null value whereas the cast instructions would propagate it. Add special logic to the binary writer and printer to continue emitting the deprecated instructions for now. This will allow us to update the test suite in a separate future PR with no additional functional changes. Some tests are updated because the validator no longer allows passing non-func data to `br_on_func`. Doing so has not made sense since we separated the three reference type hierarchies.
* Support br_on_cast null (#5397)Thomas Lively2023-01-051-3/+13
| | | | | | | | | As well as br_on_cast_fail null. Unlike the existing br_on_cast* instructions, these new instructions treat the cast as succeeding when the input is a null. Update the internal representation of the cast type in `BrOn` expressions to be a `Type` rather than a `HeapType` so it will include nullability information. Also update and improve `RemoveUnusedBrs` to handle the new instructions correctly and optimize in more cases.
* Allow non-nullable ref.cast of nullable references (#5386)Thomas Lively2023-01-041-8/+0
| | | | | | | This new cast configuration was not expressible with the legacy cast instructions. Although it is valid in Wasm, do not allow nullable casts of non-nullable references, since those would unnecessarily lose type information. Convert such casts to be non-nullable during expression finalization.
* Support `ref.test null` (#5368)Thomas Lively2022-12-211-3/+6
| | | This new variant of ref.test returns 1 if the input is null.
* Update RefCast representation to drop extra HeapType (#5350)Thomas Lively2022-12-201-7/+13
| | | | | | | | | The latest upstream version of ref.cast is parameterized with a target reference type, not just a heap type, because the nullability of the result is parameterizable. As a first step toward implementing these new, more flexible ref.cast instructions, change the internal representation of ref.cast to use the expression type as the cast target rather than storing a separate heap type field. For now require that the encoded semantics match the previously allowed semantics, though, so that none of the optimization passes need to be updated.
* Use non-nullable ref.cast for non-nullable input (#5335)Thomas Lively2022-12-091-1/+11
| | | | | | | | | | | | We switched from emitting the legacy `ref.cast_static` instruction to emitting `ref.cast null` in #5331, but that wasn't quite correct. The legacy instruction had polymorphic typing so that its output type was nullable if and only if its input type was nullable. In contrast, `ref.cast null` always has a a nullable output type. Fix our output by instead emitting non-nullable `ref.cast` if the output should be non-nullable. Parse `ref.cast` in binary and text forms as well. Since the IR can only represent the legacy polymorphic semantics, disallow unsupported casts from nullable to non-nullable references or vice versa for now.