path: root/src/wasm-interpreter.h
Commit message | Author | Age | Files | Lines
...
* Implement dropping of active Element Segments (#6343)Alon Zakai2024-02-231-10/+17
| | | | Also rename the existing droppedSegments to droppedDataSegments for clarity.
* Typed continuations: cont.new instructions (#6308)Frank Emrich2024-02-221-0/+2
| | | | | | | | | | | | | | | | | This PR is part of a series that adds basic support for the [typed continuations/wasmfx proposal](https://github.com/wasmfx/specfx). This particular PR adds support for the `cont.new` instruction for creating continuations, documented [here](https://github.com/wasmfx/specfx/blob/main/proposals/continuations/Overview.md#instructions). In short, these instructions are of the form `(cont.new $ct)` where `$ct` must be a continuation type. The instruction takes a single (nullable) function reference as its argument, which means that the folded representation of the instruction is of the form `(cont.new $ct (foo ...))`. Support for the instruction is implemented in both the old and the new wat parser. Note that this PR does not implement validation of the new instruction.
* Strings: Add some interpreter support (#6304)Alon Zakai2024-02-141-3/+55
| | | | | | | This adds just enough support to be able to --fuzz-exec a small but realistic fuzz testcase from Java. To that end, just implement the minimal ops we need, which are all related to JS-style strings.
* Typed continuations: resume instructions (#6083)Frank Emrich2024-01-111-0/+2
| | | | | This PR is part of a series that adds basic support for the [typed continuations proposal](https://github.com/wasmfx/specfx). This particular PR adds support for the `resume` instruction. The most notable missing feature is validation, which is not yet implemented.
* [NFC] Add some const annotations (#6203)Alon Zakai2024-01-051-1/+1
|
* [EH] Add instructions for new proposal (#6181)Heejin Ahn2023-12-191-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | This adds basic support for the new instructions in the new EH proposal passed at the Oct hybrid CG meeting: https://github.com/WebAssembly/meetings/blob/main/main/2023/CG-10.md https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md This mainly adds two instructions: `try_table` and `throw_ref`. This is the bare minimum required to read and write text and binary format, and does not include analyses or optimizations. (It includes some analysis required for validation of existing instructions.) Validation for the new instructions is not yet included. `try_table` faces the same problem as the `resume` instruction in #6083: without the module-level tag info, we are unable to know the 'sent types' of `try_table`. This solves it with an approach similar to the one taken in #6083: this adds a `Module*` parameter to `finalize` methods, which defaults to `nullptr` when not given. The `Module*` parameter is given when called from the binary and text parser, and we cache those tag types in the `sentTypes` array within the `TryTable` class. In later optimization passes, as long as they don't touch tags, it is fine to call `finalize` without the `Module*`. Refer to https://github.com/WebAssembly/binaryen/pull/6083#issuecomment-1854634679 and #6096 for related discussions when `resume` was added.
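The caching scheme described above can be sketched roughly as follows. This is a toy model, not Binaryen's code: `ToyModule`, `ToyTryTable`, `tagParams`, and `catchTags` are hypothetical names, and types are plain strings. It only illustrates an optional `Module*` argument that refreshes a cached `sentTypes` array when present and leaves it alone when absent.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a module: maps a tag name to its parameter types.
struct ToyModule {
  std::map<std::string, std::vector<std::string>> tagParams;
};

// Hypothetical stand-in for the TryTable expression class.
struct ToyTryTable {
  std::vector<std::string> catchTags;
  // Cached per-catch types, filled in when module info is available.
  std::vector<std::vector<std::string>> sentTypes;

  // With a module, recompute the cache from tag info.
  // Without one (the default), keep whatever was cached earlier.
  void finalize(const ToyModule* module = nullptr) {
    if (module) {
      sentTypes.clear();
      for (const auto& tag : catchTags) {
        sentTypes.push_back(module->tagParams.at(tag));
      }
    }
  }
};
```

A parser would call `finalize(&module)` once; a later pass that does not touch tags can call `finalize()` and still see the cached types.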
* Implement table.copy (#6078)Alon Zakai2023-11-061-0/+56
| | | Helps #5951
* Fix handling of exported imported functions (#6044)Alon Zakai2023-10-241-1/+7
| | | | | | | | Two trivial places did not handle that case, and assumed an exported function was actually defined (and not imported). Also add some const stuff to fix compilation after this change. This was discovered by #6026
* Implement table.fill (#5949)Thomas Lively2023-09-181-0/+37
| | | | | | | | This instruction was standardized as part of the bulk memory proposal, but we never implemented it until now. Leave similar instructions like table.copy as future work. Fixes #5939.
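For readers unfamiliar with the instruction, here is a minimal sketch of `table.fill` semantics: bounds are checked up front, and an out-of-bounds range traps without writing anything. The function name `tableFill`, the use of `int` for table entries, and the `bool` return modelling a trap are all assumptions for illustration, not the interpreter's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fills table[dest, dest + size) with value.
// Returns false to model a trap when the range is out of bounds.
bool tableFill(std::vector<int>& table, uint64_t dest, int value, uint64_t size) {
  // Written to avoid overflow in dest + size.
  if (dest > table.size() || size > table.size() - dest) {
    return false;
  }
  std::fill_n(table.begin() + static_cast<std::ptrdiff_t>(dest), size, value);
  return true;
}
```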
* Replace I31New with RefI31 everywhere (#5930)Thomas Lively2023-09-131-2/+2
| | | | | | | | Globally replace the source string "I31New" with "RefI31" in preparation for renaming the instruction from "i31.new" to "ref.i31", as implemented in the spec in https://github.com/WebAssembly/gc/pull/422. This would be NFC, except that it also changes the string in the external-facing C APIs. A follow-up PR will make the corresponding behavioral change.
* Fix pop assertion (#5777)Alon Zakai2023-06-201-1/+1
| | | Subtypes are allowed as well, not just exact matches, in the pop value's type.
* [NFC] Optimize ArrayNew zero construction (#5722)Alon Zakai2023-05-151-1/+2
| | | | | | | | | | All array elements have the same type, so we can construct a single zero and just copy it. This makes ArrayNew of large arrays 2x faster. I also experimented with putting Literal::makeZero in a header, in hopes of inlining leading to licm helping here, but that did not help at all unfortunately, at least not in gcc.
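A minimal sketch of the idea, using a hypothetical `Value` struct in place of Binaryen's `Literal`: since every element has the same type, one zero is built and then copied into each slot. The `constructions` counter exists only to make the saving visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for Literal; not the real class.
struct Value {
  int64_t bits = 0;
  static int constructions; // counts makeZero calls, for illustration only
  static Value makeZero() {
    ++constructions;
    return Value{};
  }
};
int Value::constructions = 0;

// Build the zero once and let std::vector copy it into every element,
// instead of calling makeZero once per element.
std::vector<Value> makeZeroArray(std::size_t size) {
  Value zero = Value::makeZero();
  return std::vector<Value>(size, zero);
}
```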
* [NFC] Refactor each of ArrayNewSeg and ArrayInit into subclasses for Data/Elem (#5692)Alon Zakai2023-05-041-80/+121
| | | | | | | | | | | ArrayNewSeg => ArrayNewSegData, ArrayNewSegElem; ArrayInit => ArrayInitData, ArrayInitElem. Basically we remove the opcode and use the class type to differentiate them. This adds some code but it makes the representation simpler and more compact in memory, and it will help with #5690
* [Wasm GC] Ignore GC cycle leaks in LSan (#5686)Alon Zakai2023-04-241-9/+26
| | | | | Leaks happen since we use std::shared_ptr which does not handle cycles. But since Binaryen isn't used in long-running code it's probably fine to just let them leak, and ignore them in LSan, for now.
* Implement array.fill, array.init_data, and array.init_elem (#5637)Thomas Lively2023-04-061-8/+128
| | | | | These complement array.copy, which we already supported, as an initial complete set of bulk array operations. Replace the WIP spec tests with the upstream spec tests, lightly edited for compatibility with Binaryen.
* Use Names instead of indices to identify segments (#5618)Thomas Lively2023-04-041-9/+6
| | | | | | | | | | All top-level Module elements are identified and referred to by Name, but for historical reasons element and data segments were referred to by index instead. Fix this inconsistency by using Names to refer to segments from expressions that use them. Also parse and print segment names like we do for other elements. The C API is partially converted to use names instead of indices, but there are still many functions that refer to data segments by index. Finishing the conversion can be done in the future once it becomes necessary.
* Support interpretation of extern.externalize and extern.internalize (#5576)Thomas Lively2023-03-161-5/+5
| | | | | | | To allow the external and internal reference values to be differentiated yet round-trippable, set the `Literal` type to externref on external references, but keep the gcData the same for both. The only exception is for i31 references, for which the externalized version gets a `gcData` that contains a copy of the original i31 literal.
* [Wasm GC] Properly handle packed field truncation in StructNew (#5570)Alon Zakai2023-03-131-2/+3
|
* [NFC] Internally rename `ArrayInit` to `ArrayNewFixed` (#5526)Thomas Lively2023-02-281-2/+2
| | | | | | | | To match the standard instruction name, rename the expression class without changing any parsing or printing behavior. A follow-on PR will take care of the functional side of this change while keeping support for parsing the old name. This change will allow `ArrayInit` to be used as the expression class for the upcoming `array.init_data` and `array.init_elem` instructions.
* [Strings] Interpret string.eq and string.compare (#5501)Alon Zakai2023-02-171-1/+66
|
* [Strings] Add support for strings in getLiteral and Literal() (#5500)Alon Zakai2023-02-171-6/+1
| | | This is enough for DAE and other opts to run on string consts.
* [Strings] Initial string execution support (#5491)Alon Zakai2023-02-151-4/+48
| | | | | | | | | | Store string data as GC data. Inefficient (one Const per char), but ok for now. Implement string.new_wtf16 and string.const, enough for basic testing. Create strings in makeConstantExpression, which enables ctor-eval support. Print strings in fuzz-exec which makes testing easier.
* [Wasm GC] Fix array.new order of operand execution (#5487)Alon Zakai2023-02-141-4/+7
|
* Represent ref.as_{func,data,i31} with RefCast (#5413)Thomas Lively2023-01-101-19/+2
| | | | | | | | | | | | | These operations are deprecated and directly representable as casts, so remove their opcodes in the internal IR and parse them as casts instead. For now, add logic to the printing and binary writing of RefCast to continue emitting the legacy instructions to minimize test changes. The few test changes necessary are because it is no longer valid to perform a ref.as_func on values outside the func type hierarchy now that ref.as_func is subject to the ref.cast validation rules. RefAsExternInternalize, RefAsExternExternalize, and RefAsNonNull are left unmodified. A future PR may remove RefAsNonNull as well, since it is also expressible with casts.
* Replace `RefIs` with `RefIsNull` (#5401)Thomas Lively2023-01-091-14/+3
| | | | | | | | | | | | | | | * Replace `RefIs` with `RefIsNull` The other `ref.is*` instructions are deprecated and expressible in terms of `ref.test`. Update binary and text parsing to parse those instructions as `RefTest` expressions. Also update the printing and emitting of `RefTest` expressions to emit the legacy instructions for now to minimize test changes and make this a mostly non-functional change. Since `ref.is_null` is the only `RefIs` instruction left, remove the `RefIsOp` field and rename the expression class to `RefIsNull`. The few test changes are due to the fact that `ref.is*` instructions are now subject to `ref.test` validation, and in particular it is no longer valid to perform a `ref.is_func` on a value outside of the `func` type hierarchy.
* Consolidate br_on* operations (#5399)Thomas Lively2023-01-061-46/+4
| | | | | | | | | | | | | | | | | | The `br_on{_non}_{data,i31,func}` operations are deprecated and directly representable in terms of the new `br_on_cast` and `br_on_cast_fail` instructions, so remove their dedicated IR opcodes in favor of representing them as casts. `br_on_null` and `br_on_non_null` cannot be consolidated the same way because their behavior is not directly representable in terms of `br_on_cast` and `br_on_cast_fail`; when the cast to null bottom type succeeds, the null check instructions implicitly drop the null value whereas the cast instructions would propagate it. Add special logic to the binary writer and printer to continue emitting the deprecated instructions for now. This will allow us to update the test suite in a separate future PR with no additional functional changes. Some tests are updated because the validator no longer allows passing non-func data to `br_on_func`. Doing so has not made sense since we separated the three reference type hierarchies.
* Update RefCast representation to drop extra HeapType (#5350)Thomas Lively2022-12-201-29/+13
| | | | | | | | | The latest upstream version of ref.cast is parameterized with a target reference type, not just a heap type, because the nullability of the result is parameterizable. As a first step toward implementing these new, more flexible ref.cast instructions, change the internal representation of ref.cast to use the expression type as the cast target rather than storing a separate heap type field. For now require that the encoded semantics match the previously allowed semantics, though, so that none of the optimization passes need to be updated.
* Use C++17's [[maybe_unused]]. NFC (#5309)Sam Clegg2022-12-021-2/+1
|
* Switch from `typedef` to `using` in C++ code. NFC (#5258)Sam Clegg2022-11-151-1/+1
| | | | This is more modern and (IMHO) easier to read than that old C typedef syntax.
* Fix arithmetic in interpretation of ArrayNewSeg (#5251)Thomas Lively2022-11-141-4/+4
| | | | | | | | | | | The offset and size were previously being sign extended from 32 to 64 bits, which meant that negative sizes could make the bounds check pass and cause an exception to be thrown by an overly large allocation. Switch to using uint64_t from the start rather than mixing sizes and signs, and update the tests to reproduce the error more robustly in the absence of the fix. Also fix a bug in RemoveUnusedModuleElements triggered by the new test. Fixes #5249.
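The class of bug described here can be reproduced in isolation. The sketch below is not the interpreter's code; `inBoundsSigned` and `inBoundsUnsigned` are hypothetical helpers showing how sign extension lets a huge size slip past a bounds check, and how using `uint64_t` from the start avoids it.

```cpp
#include <cassert>
#include <cstdint>

// Buggy variant: the 32-bit operands are sign extended to 64 bits, so a
// size with the top bit set becomes negative and the sum looks in bounds.
bool inBoundsSigned(int32_t offset, int32_t size, uint64_t segmentSize) {
  int64_t end = static_cast<int64_t>(offset) + static_cast<int64_t>(size);
  return end <= static_cast<int64_t>(segmentSize);
}

// Fixed variant: zero extend both operands and compare as unsigned.
bool inBoundsUnsigned(uint32_t offset, uint32_t size, uint64_t segmentSize) {
  uint64_t end = static_cast<uint64_t>(offset) + static_cast<uint64_t>(size);
  return end <= segmentSize;
}
```

With a 16-byte segment, a size of `0xFFFFFFFF` passes the signed check (it is read as -1) but fails the unsigned one.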
* Fix two fuzz bugs with ArrayNewSeg (#5242)Thomas Lively2022-11-111-1/+2
| | | | | | | | | | | | First, we forgot to note the type annotation on `ArrayNewSeg` instructions, so in small modules where these are the only annotated instructions, the type section would be incomplete. Second, in the interpreter we were reserving space for the array before checking that the segment access was valid. This could cause huge allocations that threw bad_alloc exceptions before the interpreter could get around to trapping. Fix the problem by reserving the array after validating the arguments. Fixes #5236.
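The second fix amounts to ordering: validate, then allocate. A rough sketch under stated assumptions (a byte segment, a hypothetical `newArrayFromSegment` helper, and `std::nullopt` standing in for a trap):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Copies segment[offset, offset + size) into a new array.
// Returns std::nullopt to model a trap on an invalid access.
std::optional<std::vector<uint8_t>>
newArrayFromSegment(const std::vector<uint8_t>& segment, uint64_t offset, uint64_t size) {
  // Validate first, so an absurd size traps instead of reaching reserve().
  if (offset > segment.size() || size > segment.size() - offset) {
    return std::nullopt;
  }
  std::vector<uint8_t> out;
  out.reserve(size); // safe: size is now known to fit inside the segment
  auto first = segment.begin() + static_cast<std::ptrdiff_t>(offset);
  out.insert(out.end(), first, first + static_cast<std::ptrdiff_t>(size));
  return out;
}
```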
* [NFC] Fix unused variable warning (#5231)walkingeyerobot2022-11-081-0/+1
|
* Implement `array.new_data` and `array.new_elem` (#5214)Thomas Lively2022-11-071-0/+62
| | | | | | | | | In order to test them, fix the binary and text parsers to accept passive data segments even if a module has no memory. In addition to parsing and emitting the new instructions, also implement their validation and interpretation. Test the interpretation directly with wasm-shell tests adapted from the upstream spec tests. Running the upstream spec tests directly would require fixing too many bugs in the legacy text parser, so it will have to wait for the new text parser to be ready.
* Make `Name` a pointer, length pair (#5122)Thomas Lively2022-10-111-5/+3
| | | | | | | | | | | | | | | | | | | | | | | With the goal of supporting null characters (i.e. zero bytes) in strings. Rewrite the underlying interned `IString` to store a `std::string_view` rather than a `const char*`, reduce the number of map lookups necessary to intern a string, and present a more immutable interface. Most importantly, replace the `c_str()` method that returned a `const char*` with a `toString()` method that returns a `std::string`. This new method can correctly handle strings containing null characters. A `const char*` can still be had by calling `data()` on the `std::string_view`, although this usage should be discouraged. This change is NFC in spirit, although not in practice. It does not intend to support any particular new functionality, but it is probably now possible to use strings containing null characters in at least some cases. At least one parser bug is also incidentally fixed. Follow-on PRs will explicitly support and test strings containing nulls for particular use cases. The C API still uses `const char*` to represent strings. As strings containing nulls become better supported by the rest of Binaryen, this will no longer be sufficient. Updating the C and JS APIs to use pointer, length pairs is left as future work.
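To see why a pointer, length pair matters for null characters, consider this small sketch. `MiniName` and `makeNameWithNull` are hypothetical and far simpler than the interned `IString`; only the `std::string_view` behavior is real.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical miniature of a name type holding a pointer, length pair.
struct MiniName {
  std::string_view str;
  // Copies all bytes, including any embedded zero bytes.
  std::string toString() const { return std::string(str); }
};

// A three-byte name whose middle byte is a null character.
inline MiniName makeNameWithNull() {
  static const char raw[] = {'a', '\0', 'b'};
  return MiniName{std::string_view(raw, sizeof(raw))};
}
```

`makeNameWithNull().toString()` has length 3, whereas treating `str.data()` as a C string (for example with `std::strlen`) stops at the first zero byte and sees length 1.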
* Implement bottom heap types (#5115)Thomas Lively2022-10-071-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | These types, `none`, `nofunc`, and `noextern` are uninhabited, so references to them can only possibly be null. To simplify the IR and increase type precision, introduce new invariants that all `ref.null` instructions must be typed with one of these new bottom types and that `Literals` have a bottom type iff they represent null values. These new invariants require several additional changes. First, it is now possible that the `ref` or `target` child of a `StructGet`, `StructSet`, `ArrayGet`, `ArraySet`, or `CallRef` instruction has a bottom reference type, so it is not possible to determine what heap type annotation to emit in the binary or text formats. (The bottom types are not valid type annotations since they do not have indices in the type section.) To fix that problem, update the printer and binary emitter to emit unreachables instead of the instruction with undetermined type annotation. This is a valid transformation because the only possible value that could flow into those instructions in that case is null, and all of those instructions trap on nulls. That fix uncovered a latent bug in the binary parser in which new unreachables within unreachable code were handled incorrectly. This bug was not previously found by the fuzzer because we generally stop emitting code once we encounter an instruction with type `unreachable`. Now, however, it is possible to emit an `unreachable` for instructions that do not have type `unreachable` (but are known to trap at runtime), so we will continue emitting code. See the new test/lit/parse-double-unreachable.wast for details. Update other miscellaneous code that creates `RefNull` expressions and null `Literals` to maintain the new invariants as well.
* Fix ordering of visit() in MemoryGrow interpretation (#5108)Alon Zakai2022-10-031-4/+4
| | | | | | | This is a pretty subtle point that was missed in #4811 - we need to first visit the child, then compute the size, as the child may alter that size. Found by the fuzzer.
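The ordering issue can be shown with a toy model. `ToyMemory` and `growAfterVisitingChild` are hypothetical, and maximum-size checks and failure results are omitted; the point is only that the old size must be read after the child has been evaluated, because the child may itself grow the memory.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical memory model: just a page count.
struct ToyMemory {
  uint32_t pages = 1;
};

// Models memory.grow, which returns the size before growing.
uint32_t growAfterVisitingChild(ToyMemory& memory,
                                const std::function<uint32_t()>& child) {
  uint32_t delta = child();        // visit the child first; it may change pages
  uint32_t oldSize = memory.pages; // only now read the current size
  memory.pages += delta;
  return oldSize;
}
```

Reading `memory.pages` before calling `child()` would return a stale size whenever the child contains its own grow.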
* Implement `extern.externalize` and `extern.internalize` (#4975)Thomas Lively2022-08-291-0/+10
| | | | These new GC instructions infallibly convert between `extern` and `any` references now that those types are not in the same hierarchy.
* Mutli-Memories Support in IR (#4811)Ashley Nelson2022-08-171-148/+265
| | | | | | | This PR removes the single memory restriction in IR, adding support for a single module to reference multiple memories. To support this change, a new memory name field was added to 13 memory instructions in order to identify the memory for the instruction. It is a goal of this PR to maintain backwards compatibility with existing text and binary wasm modules, so memory indexes remain optional for memory instructions. Similarly, the JS API makes assumptions about which memory is intended when only one memory is present in the module. Another goal of this PR is that existing test behavior be unaffected. That said, tests must now explicitly define a memory before invoking memory instructions or exporting a memory, and memory names are now printed for each memory instruction in the text format. There remain quite a few places where a hardcoded reference to the first memory persists (memory flattening, for example, will return early if more than one memory is present in the module). Many of these call-sites, particularly within passes, will require us to rethink how the optimization works in a multi-memories world. Other call-sites may necessitate more invasive code restructuring to fully convert away from relying on a globally available, single memory pointer.
* Remove RTTs (#4848)Thomas Lively2022-08-051-90/+7
| | | | | | | RTTs were removed from the GC spec and if they are added back in in the future, they will be heap types rather than value types as in our implementation. Updating our implementation to have RTTs be heap types would have been more work than deleting them for questionable benefit since we don't know how long it will be before they are specced again.
* Update reference type Literal constructors to use HeapType (#4857)Thomas Lively2022-08-011-5/+8
| | | | | | We already require non-null literals to have non-null types, but with this change we can enforce that constraint by construction. Also remove the default behavior of creating a function reference literal with heap type `func`, since there is always a more specific function type to use.
* Add interpreter support for intrinsics (#4851)Alon Zakai2022-08-011-1/+9
| | | This can give us some chance to catch bugs like #4839 in the fuzzer.
* [Strings] Add interpreter stubs for string instructions (#4835)Alon Zakai2022-07-261-35/+40
| | | | | | | | | The stubs let precompute skip over them without erroring. With this PR we can run the optimizer on strings code. We still can't run --fuzz-exec though, so we can't run the fuzzer. Also simplify the error strings in the earlier part of the file. All other code just has "unimp" so we might as well do the same and not mention full names there.
* [Wasm GC] Properly represent nulls in i31 (#4819)Alon Zakai2022-07-251-0/+3
| | | | | The encoding here is simple: we store i31 values in the literal.i32 field. The top bit says if a value exists, which means literal.i32 == 0 is the same as null.
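A sketch of the encoding as described, with hypothetical helper names (`encodeI31`, `isNullI31`, and the two decoders are not Binaryen functions): the 31-bit payload lives in the low bits, and the top bit marks that a value exists, so a stored word of 0 can only mean null.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kExistsBit = 0x80000000u;  // top bit: a value is present
constexpr uint32_t kPayloadMask = 0x7fffffffu; // low 31 bits: the i31 payload

// Pack a 31-bit payload into the stored 32-bit word.
constexpr uint32_t encodeI31(uint32_t payload) {
  return (payload & kPayloadMask) | kExistsBit;
}

// A stored word of zero has no exists bit, so it represents null.
constexpr bool isNullI31(uint32_t stored) { return stored == 0; }

// Unsigned read: drop the exists bit.
constexpr uint32_t decodeI31Unsigned(uint32_t stored) {
  return stored & kPayloadMask;
}

// Signed read: sign extend from bit 30.
constexpr int32_t decodeI31Signed(uint32_t stored) {
  return (stored & 0x40000000u)
           ? static_cast<int32_t>(static_cast<int64_t>(stored & kPayloadMask) -
                                  (int64_t(1) << 31))
           : static_cast<int32_t>(stored & kPayloadMask);
}
```

Note that an i31 holding the value 0 is stored as `0x80000000`, which is what keeps it distinct from null.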
* Remove basic reference types (#4802)Thomas Lively2022-07-201-10/+0
| | | | | | | | | Basic reference types like `Type::funcref`, `Type::anyref`, etc. made it easy to accidentally forget to handle reference types with the same basic HeapTypes but the opposite nullability. In principle there is nothing special about the types with shorthands except in the binary and text formats. Removing these shorthands from the internal type representation by removing all basic reference types makes some code more complicated locally, but simplifies code globally and encourages properly handling both nullable and non-nullable reference types.
* [Strings] stringview_*.slice (#4805)Alon Zakai2022-07-151-0/+6
| | | | | | | Unfortunately one slice is the same as python [start:end], using 2 params, and the other slice is one param, [CURR:CURR+num] (where CURR is implied by the current state in the iter). So we can't use a single class here. Perhaps a different name would be good, like slice vs substring (like JS does), but I picked names to match the current spec.
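The difference between the two slice shapes can be sketched as follows. This is a simplification under several assumptions: it works on `char` units of a `std::string_view` rather than real string views, it clamps out-of-range bounds, and `sliceRange`, `ToyIter`, and `sliceFromIter` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>

// Two-parameter form, like Python's s[start:end].
std::string_view sliceRange(std::string_view s, std::size_t start, std::size_t end) {
  start = std::min(start, s.size());
  end = std::clamp(end, start, s.size());
  return s.substr(start, end - start);
}

// Hypothetical iterator view: a string plus an implied current position.
struct ToyIter {
  std::string_view s;
  std::size_t pos = 0;
};

// One-parameter form: [CURR:CURR+num], where CURR comes from the iterator.
std::string_view sliceFromIter(const ToyIter& it, std::size_t num) {
  return sliceRange(it.s, it.pos, it.pos + num);
}
```

The second form needs state that the first does not, which is why a single expression class with a fixed operand count does not fit both.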
* [Strings] stringview access operations (#4798)Alon Zakai2022-07-131-0/+12
|
* [Strings] string.as (#4797)Alon Zakai2022-07-121-0/+3
|
* [Strings] string.eq (#4781)Alon Zakai2022-07-081-0/+3
|
* [Strings] string.concat (#4777)Alon Zakai2022-07-081-0/+3
|
* [Strings] string.encode (#4776)Alon Zakai2022-07-071-0/+3
|