summaryrefslogtreecommitdiff
path: root/src/wasm/literal.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Limit printing of Literal[s] in a general way (#5792)Alon Zakai2023-06-281-16/+49
| | | | | | | | Previously we limited printing in a single Literals. But we can have infinitely recursive GC literals, or just huge graphs even without infinite recursion where no single Literals is that big (but we still get exponential blowup). This PR adds a general limit on how much we print once we start to print a Literal or Literals.
* Limit literal printing to a reasonable limit (#5779)Alon Zakai2023-06-211-0/+8
| | | | | | | | | Without this, a massive GC array for example can end up being printed as a huge amount of output, which is a problem in the fuzzer. I noticed this because a testcase was taking minutes to run. Perhaps we should add an option (BINARYEN_PRINT_FULL=1?) to disable this limit, but let's see if we ever need that.
* [Wasm GC] wasm-ctor-eval: Handle cycles of data (#5685)Alon Zakai2023-05-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | A cycle of data is something we can't just naively emit as wasm globals. If at runtime we end up, for example, with an object A that refers to itself, then we can't just emit (global $A (struct.new $A (global.get $A))) The struct.get is of this very global, and such a self-reference is invalid. So we need to break such cycles as we emit them. The simple idea used here is to find paths in the cycle that are nullable and mutable, and replace the initial value with a null that is fixed up later in the start function: (global $A (struct.new $A (ref.null $A))) (func $start (struct.set (global.get $A) (global.get $A))) ) This is not optimal in terms of breaking cycles, but it is fast (linear time) and simple, and does well in practice on j2wasm (where cycles in fact occur).
* Support interpretation of extern.externalize and extern.internalize (#5576)Thomas Lively2023-03-161-10/+50
| | | | | | | To allow the external and internal reference values to be differentiated yet round-trippable, set the `Literal` type to externref on external references, but keep the gcData the same for both. The only exception is for i31 references, for which the externalized version gets a `gcData` that contains a copy of the original i31 literal.
* [Strings] Add hashing and equality support for strings (#5507)Alon Zakai2023-02-211-0/+3
| | | This is enough to test RSE.
* [Strings] Add support for strings in getLiteral and Literal() (#5500)Alon Zakai2023-02-171-0/+10
| | | This is enough for DAE and other opts to run on string consts.
* [Strings] Initial string execution support (#5491)Alon Zakai2023-02-151-2/+16
| | | | | | | | | | Store string data as GC data. Inefficient (one Const per char), but ok for now. Implement string.new_wtf16 and string.const, enough for basic testing. Create strings in makeConstantExpression, which enables ctor-eval support. Print strings in fuzz-exec which makes testing easier.
* [Wasm GC] Replace `HeapType::data` with `HeapType::struct_` (#5416)Thomas Lively2023-01-101-2/+2
| | | | | | `struct` has replaced `data` in the upstream spec, so update Binaryen's types to match. We had already supported `struct` as an alias for data, but now remove support for `data` entirely. Also remove instructions like `ref.is_data` that are deprecated and do not make sense without a `data` type.
* Implement `array.new_data` and `array.new_elem` (#5214)Thomas Lively2022-11-071-0/+51
| | | | | | | | | In order to test them, fix the binary and text parsers to accept passive data segments even if a module has no memory. In addition to parsing and emitting the new instructions, also implement their validation and interpretation. Test the interpretation directly with wasm-shell tests adapted from the upstream spec tests. Running the upstream spec tests directly would require fixing too many bugs in the legacy text parser, so it will have to wait for the new text parser to be ready.
* Implement `array` basic heap type (#5148)Thomas Lively2022-10-181-0/+2
| | | | | | | | | `array` is the supertype of all defined array types and for now is a subtype of `data`. (Once `data` becomes `struct` this will no longer be true.) Update the binary and text parsing of `array.len` to ignore the obsolete type annotation and update the binary emitting to emit a zero in place of the old type annotation and the text printing to print an arbitrary heap type for the annotation. A follow-on PR will add support for the newer unannotated version of `array.len`.
* Implement bottom heap types (#5115)Thomas Lively2022-10-071-70/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | These types, `none`, `nofunc`, and `noextern` are uninhabited, so references to them can only possibly be null. To simplify the IR and increase type precision, introduce new invariants that all `ref.null` instructions must be typed with one of these new bottom types and that `Literals` have a bottom type iff they represent null values. These new invariants requires several additional changes. First, it is now possible that the `ref` or `target` child of a `StructGet`, `StructSet`, `ArrayGet`, `ArraySet`, or `CallRef` instruction has a bottom reference type, so it is not possible to determine what heap type annotation to emit in the binary or text formats. (The bottom types are not valid type annotations since they do not have indices in the type section.) To fix that problem, update the printer and binary emitter to emit unreachables instead of the instruction with undetermined type annotation. This is a valid transformation because the only possible value that could flow into those instructions in that case is null, and all of those instructions trap on nulls. That fix uncovered a latent bug in the binary parser in which new unreachables within unreachable code were handled incorrectly. This bug was not previously found by the fuzzer because we generally stop emitting code once we encounter an instruction with type `unreachable`. Now, however, it is possible to emit an `unreachable` for instructions that do not have type `unreachable` (but are known to trap at runtime), so we will continue emitting code. See the new test/lit/parse-double-unreachable.wast for details. Update other miscellaneous code that creates `RefNull` expressions and null `Literals` to maintain the new invariants as well.
* Refactor standardizeNaN (#5064)Max Graey2022-10-041-34/+32
| | | Change `standardizeNaN` to take a `Literal` to reduce usage verbosity.
* [OptimizeInstructions] Simplify floating point ops with NaN on right side ↵Max Graey2022-09-121-27/+9
| | | | | | | | | | | | | | | | | | | (#4985) x + nan -> nan' x - nan -> nan' x * nan -> nan' x / nan -> nan' min(x, nan) -> nan' max(x, nan) -> nan' where nan' is canonicalized nan of rhs x != nan -> 1 x == nan -> 0 x >= nan -> 0 x <= nan -> 0 x > nan -> 0 x < nan -> 0
* Restore the `extern` heap type (#4898)Thomas Lively2022-08-171-0/+5
| | | | | | | The GC proposal has split `any` and `extern` back into two separate types, so reintroduce `HeapType::ext` to represent `extern`. Before it was originally removed in #4633, externref was a subtype of anyref, but now it is not. Now that we have separate heaptype type hierarchies, make `HeapType::getLeastUpperBound` fallible as well.
* Remove RTTs (#4848)Thomas Lively2022-08-051-73/+1
| | | | | | | RTTs were removed from the GC spec and if they are added back in in the future, they will be heap types rather than value types as in our implementation. Updating our implementation to have RTTs be heap types would have been more work than deleting them for questionable benefit since we don't know how long it will be before they are specced again.
* Update reference type Literal constructors to use HeapType (#4857)Thomas Lively2022-08-011-4/+2
| | | | | | We already require non-null literals to have non-null types, but with this change we can enforce that constraint by construction. Also remove the default behavior of creating a function reference literal with heap type `func`, since there is always a more specific function type to use.
* [Wasm GC] Properly represent nulls in i31 (#4819)Alon Zakai2022-07-251-6/+6
| | | | | The encoding here is simple: we store i31 values in the literal.i32 field. The top bit says if a value exists, which means literal.i32 == 0 is the same as null.
* Remove basic reference types (#4802)Thomas Lively2022-07-201-82/+16
| | | | | | | | | Basic reference types like `Type::funcref`, `Type::anyref`, etc. made it easy to accidentally forget to handle reference types with the same basic HeapTypes but the opposite nullability. In principle there is nothing special about the types with shorthands except in the binary and text formats. Removing these shorthands from the internal type representation by removing all basic reference types makes some code more complicated locally, but simplifies code globally and encourages properly handling both nullable and non-nullable reference types.
* [Strings] Add string proposal types (#4755)Alon Zakai2022-06-291-0/+8
| | | | | | | | This starts to implement the Wasm Strings proposal https://github.com/WebAssembly/stringref/blob/main/proposals/stringref/Overview.md This just adds the types.
* [NFC] Make Literal::makeNull take a HeapType (#4664)Alon Zakai2022-05-131-1/+1
| | | | | | | | Taking a Type is redundant as we only care about the heap type - the nullability must be Nullable. This avoids needing an assertion in the function, that is, it makes the API more type-safe.
* Remove externref (#4633)Thomas Lively2022-05-041-19/+1
| | | | | | Remove `Type::externref` and `HeapType::ext` and replace them with uses of anyref and any, respectively, now that we have unified these types in the GC proposal. For backwards compatibility, continue to parse `extern` and `externref` and maintain their relevant C API functions.
* Implement relaxed SIMD dot product instructions (#4586)Thomas Lively2022-04-111-7/+23
| | | As proposed in https://github.com/WebAssembly/relaxed-simd/issues/52.
* Optimize Literal constructors and destructor (#4456)Alon Zakai2022-01-141-47/+63
| | | | | | | Handle the isBasic() case first - that inlined function is very fast to call, and it is the common case. Also, do not do unnecessary work there: just write out what we need, instead of always doing a memcpy of 16 bytes. This makes us over 2x faster on the benchmark in #4452
* Add support for relaxed-simd instructions (#4320)Ng Zhi An2021-11-151-0/+62
| | | | | | | | | | | | | | | | | | | | | This adds relaxed-simd instructions based on the current status of the proposal https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md. Binary opcodes are based on what is listed in https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md#binary-format. Text names are not fixed yet, and some sort sort of names that maps to the non-relaxed versions are chosen for this prototype. Support for these instructions have been added to LLVM via builtins, adding support here will allow Emscripten to successfully compile files that use those builtins. Interpreter support has also been added, and they delegate to the non-relaxed versions of the instructions. Most instructions are implemented in the interpreter the same way as the non-relaxed simd128 instructions, except for fma/fms, which is always fused.
* Fix RTTs for RTT-less instructions (#4294)Thomas Lively2021-11-031-6/+15
| | | | | | | | | | | | Allocation and cast instructions without explicit RTTs should use the canonical RTTs for the given types. Furthermore, the RTTs for nominal types should reflect the static type hierarchy. Previously, however, we implemented allocations and casts without RTTs using an alternative system that only used static types rather than RTT values. This alternative system would work fine in a world without first-class RTTs, but it did not properly allow mixing instructions that use RTTs and instructions that do not use RTTs as intended by the M4 GC spec. This PR fixes the issue by using canonical RTTs where appropriate and cleans up the relevant casting code using std::variant.
* [NFC] Use std::variant in GCData (#4289)Thomas Lively2021-10-281-1/+3
| | | | This helps prevent bugs where we assume that the GCData has either a HeapType or Rtt without checking. Indeed, one such bug is found and fixed.
* [Wasm GC] Nulls compare equal regardless of type (#4094)Alon Zakai2021-08-191-3/+6
|
* [Simd] Implement extra convert, trunc, demote and promote ops for ↵Max Graey2021-07-281-0/+48
| | | | interpreter (#4023)
* [Simd] Refactoring. Remove middle *Vec* from some simd ops for consistency ↵Max Graey2021-07-271-17/+17
| | | | (#4027)
* [Simd] Add extending pairwise adds to interpreter (#4022)Max Graey2021-07-261-0/+25
|
* [SIMD] Add extend + mul simd operations to interpreter (#4021)Max Graey2021-07-261-12/+25
|
* [Simd] Add extension from i32x4 to i64x2 ops to interpreter (#4016)Max Graey2021-07-261-0/+12
|
* [SIMD] Refactor extend helper (#4018)Max Graey2021-07-231-13/+12
|
* Implement q15MulrSatSI16x8 for interpreter (#3984)Max Graey2021-07-141-1/+10
|
* Implement interpretation of i64x2.bitmask (#3982)Thomas Lively2021-07-131-0/+10
| | | | | | | Like a few other SIMD operations, this i64x2.bitmask had not been implemented in the interpreter yet. Unlike the others, i64x2.bitmask has type i32 rather than type v128, so Precompute was not skipping it, leading to a crash, as in https://github.com/emscripten-core/emscripten/issues/14629. Fix the problem by implementing i64x2.bitmask in the interpreter.
* [Wasm GC] Add a isNonNullable() convenience method. NFC (#3978)Alon Zakai2021-07-121-1/+1
|
* [Wasm GC] rtt.fresh_sub (#3936)Alon Zakai2021-06-171-3/+6
| | | | | | | | | | This is the same as rtt.sub, but creates a "new" rtt each time. See https://docs.google.com/document/d/1DklC3qVuOdLHSXB5UXghM_syCh-4cMinQ50ICiXnK3Q/edit# The old Literal implementation of rtts becomes a little more complex here, as it was designed for the original spec where only structure matters. It may be worth a complete redesign there, but for now as the spec is in flux I think the approach here is good enough.
* [Wasm GC] Fix precompute on GC data (#3810)Alon Zakai2021-04-151-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes precomputation on GC after #3803 was too optimistic. The issue is subtle. Precompute will repeatedly evaluate expressions and propagate their values, flowing them around, and it ignores side effects when doing so. For example: (block ..side effect.. (i32.const 1) ) When we evaluate that we see there are side effects, but regardless of them we know the value flowing out is 1. So we can propagate that value, if it is assigned to a local and read elsewhere. This is not valid for GC because struct.new and array.new have a "side effect" that is noticeable in the result. Each time we call struct.new we get a new struct with a new address, which ref.eq can distinguish. So when this pass evaluates the same thing multiple times it will get a different result. Also, we can't precompute a struct.get even if we know the struct, not unless we know the reference has not escaped (where a call could modify it). To avoid all that, do not precompute references, aside from the trivially safe ones like nulls and function references (simple constants that are the same each time we evaluate the expression emitting them). precomputeExpression() had a minor bug which this fixes. It checked the type of the expression to see if we can create a constant for it, but really it should check the value - since (separate from this PR) we have no way to emit a "constant" for a struct etc. Also that only matters if replaceExpression is true, that is, if we are replacing with a constant; if we just want the value internally, we have no limit on that. Also add Literal support for comparing GC refs, which is used by ref.eq. Without that tiny fix the tests here crash. This adds a bunch of tests, many for corner cases that we don't handle (since the PR makes us not propagate GC references). But they should be helpful if/when we do, to avoid the mistakes in #3803
* Update SIMD names and opcodes (#3771)Thomas Lively2021-04-051-46/+47
| | | | Also removes experimental SIMD instructions that were not included in the final spec proposal.
* [GC] isGCData => isData (#3534)Alon Zakai2021-02-011-6/+6
| | | | | | | | | We added isGCData() before we had dataref. But now there is a clear parallel of Function vs Data. This PR makes us more consistent there, renaming isGCData to isData and using that throughout. This also fixes a bug where the old isGCData just checked if the input was an Array or a Struct, and ignored the data heap type itself. It is not possible to test that, however, due to other bugs, so that is deferred.
* Reorder i31ref and dataref (#3509)Heejin Ahn2021-01-231-13/+13
| | | | | | | | | | The binary spec (https://docs.google.com/document/d/1yAWU3dbs8kUa_wcnnirDxUu9nEBsNfq0Xo90OWx6yuo/edit#) lists `dataref` after `i31ref`, and `dataref` also comes after `i31ref` in its binary code in the value-increasing order. This reorders these two in wasm-type.h and other places, although in most of those places the order is irrelevant. This also adds C and JS API for `dataref`.
* Remove exnref and br_on_exn (#3505)Heejin Ahn2021-01-221-48/+2
| | | This removes `exnref` type and `br_on_exn` instruction.
* [GC] Add dataref type (#3500)Alon Zakai2021-01-211-1/+15
| | | | | This is not 100% of everything, but is enough to get tests passing, which includes full binary and text format support, getting all switches to compile without error, and some additions to InstrumentLocals.
* [GC] More HeapType instead of Type (#3475)Alon Zakai2021-01-111-54/+91
| | | | | | | | To handle both nullable and non-nullable i31s and other heap types, we cannot just look at the isBasic case (which is just one of the two). This may fix this issue on the release builder: https://github.com/WebAssembly/binaryen/runs/1669944081?check_suite_focus=true but the issue does not reproduce locally, so I worry it is something else...
* Add more verbose logging of an assert that only happens on the release ↵Alon Zakai2021-01-081-1/+6
| | | | | builder for some inexplicable reason (#3477) See #3459
* Prototype SIMD extending pairwise add instructions (#3466)Thomas Lively2021-01-051-6/+0
| | | | | | As proposed in https://github.com/WebAssembly/simd/pull/380, using the opcodes used in LLVM and V8. Since these opcodes overlap with the opcodes of i64x2.all_true and i64x2.any_true, which have long since been removed from the SIMD proposal, this PR also removes those instructions.
* [GC] Fully implement RTT semantics (#3441)Alon Zakai2020-12-151-7/+79
| | | | | | | | | | | | | | This adds info to RTT literals so that they can represent the chain of rtt.canon/sub commands that generated them, and it adds an internal RTT for each GC allocation (array or struct). The approach taken is to simply store the full chain of rtt.sub types that led to each literal. This is not efficient, but it is simple and seems sufficient for the semantics described in the GC MVP doc - specifically, only the types matter, in that repeated executions of rtt.canon/sub on the same inputs yield equal outputs. This PR fixes a bunch of minor issues regarding that, enough to allow testing of the optimization and execution of ref.test/cast.
* Prototype SIMD instructions implemented in LLVM (#3440)Thomas Lively2020-12-111-0/+4
| | | | | | - i64x2.eq (https://github.com/WebAssembly/simd/pull/381) - i64x2 widens (https://github.com/WebAssembly/simd/pull/290) - i64x2.bitmask (https://github.com/WebAssembly/simd/pull/368) - signselect ops (https://github.com/WebAssembly/simd/pull/124)
* [GC] Add basic RTT support (#3432)Alon Zakai2020-12-081-0/+3
| | | | | | | | | | | | | | | | This adds rtt.canon and rtt.sub together with RTT type support that is necessary for them. Together this lets us test roundtripping the instructions and types. Also fixes a missing traversal over globals in collectHeapTypes, which the example from the GC docs requires, as the RTTs are in globals there. This does not yet add full interpreter support and other things. It disables initial contents on GC in the fuzzer, to avoid the fuzzer breaking. Renames the binary ID for exnref, which is being removed from the spec, and which overlaps with the binary ID for rtt.
* [GC] Add struct.get instruction parsing and execution (#3429)Alon Zakai2020-12-071-0/+15
| | | | | | | | | | | | | | | | | | | | This is the first instruction that uses a GC Struct or Array, so it's where we start to actually need support in the interpreter for those values, which is added here. GC data is modeled as a gcData field on a Literal, which is just a Literals. That is, both a struct and an array are represented as an array of values. The type which is alongside would indicate if it's a struct or an array. Note that the data is referred to using a shared_ptr so it should "just work", but we'll only be able to really test that once we add struct.new and so can verify that references are by reference and not value, etc. As the first instruction to care about i8/16 types (which are only possible in a Struct or Array) this adds support for parsing and emitting them. This PR includes fuzz fixes for some minor things the fuzzer found, including some bad printing of not having ResultTypeName in necessary places (found by the text format roundtripping fuzzer).