summaryrefslogtreecommitdiff
path: root/src/wasm/wasm-validator.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Remove incorrect validation of segment sizes (#6228)Alon Zakai2024-01-221-9/+0
| | | | This should be a runtime error, not a validator error. It caused a fuzzer failure on wasm-ctor-eval.
* [NFC] Fix "initialised" => "initialized" (#6222)Thomas Lively2024-01-111-1/+1
|
* Typed continuations: resume instructions (#6083)Frank Emrich2024-01-111-0/+19
| | | | | This PR is part of a series that adds basic support for the [typed continuations proposal](https://github.com/wasmfx/specfx). This particular PR adds support for the `resume` instruction. The most notable missing feature is validation, which is not implemented, yet.
* [EH] Add validation for new instructions (#6185)Heejin Ahn2023-12-201-5/+78
| | | | | | | | | | This adds validation for the new EH instructions (`try_table` and `throw_ref`): https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md This also adds a spec test for checking invalid modules. We cannot check the executions yet because we don't have the interpreter implementation. The new test file also contains tests for the existing `throw`, because this is meant to replace the old spec test someday.
* Add tuple.drop validation (#6186)Alon Zakai2023-12-191-0/+5
| | | | | Without this fuzzer testcases fail if the initial content has a tuple.drop but multivalue is disabled (then the initial content validates erroneously, and that content is remixed into more content using multivalue which fails to validate).
* [EH] Add instructions for new proposal (#6181)Heejin Ahn2023-12-191-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | This adds basic support for the new instructions in the new EH proposal passed at the Oct CG hybrid CG meeting: https://github.com/WebAssembly/meetings/blob/main/main/2023/CG-10.md https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md This mainly adds two instructions: `try_table` and `throw_ref`. This is the bare minimum required to read and write text and binary format, and does not include analyses or optimizations. (It includes some analysis required for validation of existing instructions.) Validation for the new instructions is not yet included. `try_table` faces the same problem with the `resume` instruction in #6083 that without the module-level tag info, we are unable to know the 'sent types' of `try_table`. This solves it with a similar approach taken in #6083: this adds `Module*` parameter to `finalize` methods, which defaults to `nullptr` when not given. The `Module*` parameter is given when called from the binary and text parser, and we cache those tag types in `sentTypes` array within `TryTable` class. In later optimization passes, as long as they don't touch tags, it is fine to call `finalize` without the `Module*`. Refer to https://github.com/WebAssembly/binaryen/pull/6083#issuecomment-1854634679 and #6096 for related discussions when `resume` was added.
* Implement more TypeGeneralizing transfer functions (#6118)Thomas Lively2023-11-151-12/+14
| | | | | | | Finish the transfer functions for all expressions except for string instructions, exception handling instructions, tuple instructions, and branch instructions that carry values. The latter require more work in the CFG builder because dropping the extra stack values happens after the branch but before the target block.
* Implement table.copy (#6078)Alon Zakai2023-11-061-0/+22
| | | Helps #5951
* Allow rec groups of public function types in closed world (#6053)Alon Zakai2023-10-261-6/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Closed-world mode allows function types to escape if they are on exported functions, because that has been possible since wasm MVP and cannot be avoided. But we need to also allow all types in those type's rec groups as well. Consider this case: (module (rec (type $0 (func)) (type $1 (func)) ) (func "0" (type $0) (nop) ) (func "1" (type $1) (nop) ) ) The two exported functions make the two types public, so this module validates in closed world mode. Now imagine that metadce removes one export: (module (rec (type $0 (func)) (type $1 (func)) ) (func "0" (type $0) (nop) ) ;; The export "1" is gone. ) Before this PR that no longer validates, because it only marks the type $0 as public. But when a type is public that makes its entire rec group public, so $1 is errored on. To fix that, this PR allows all types in a rec group of an exported function's type, which makes that last module validate.
* Fix segfault in catch validator (#6032)Philip Blair2023-10-231-26/+24
| | | | | The problem was if you construct a try expression which references a nonexistent tag in one of its catch blocks, the validation code successfully identified the null pointer but then proceeded to try to read from it.
* [typed-cont] Allow result types on tags (#5997)Frank Emrich2023-10-051-4/+17
| | | | | | | | | | | This PR is part of a series that adds basic support for the typed continuations proposal. This PR relaxes the restriction that tags must not have results , only params. Tags with results must not be used for exception handling and are only allowed if the typed continuations feature is enabled. As a minor point, this PR also changes the printing of tags without params: To make the presentation consistent, (param) is omitted when printing a tag.
* [Parser] Parse labels and br (#5970)Thomas Lively2023-10-021-2/+5
| | | | | | The parser previously parsed labels and could attach them to control flow structures, but did not maintain the context necessary to correctly parse branches. Support parsing labels as both names and indices in IRBuilder, handling shadowing correctly, and use that support to implement parsing of br.
* Refine ref.test's castType during refinalization (#5985)Thomas Lively2023-10-021-0/+4
| | | | | | Just like we do with other casts, refine the cast type to be the greatest lower bound of its previous cast type and its input type. The difference is that the output type of ref.test remains i32, but it's still useful to retain more precise type information.
* Support i8/i16 mutable arrays as public types for string interop (#5814)Alon Zakai2023-09-211-1/+5
| | | | | Probably any array of non-reference data can be allowed to be public and sent out of the module, as it is just data. For now, however, just special case the i8 and i16 array types which are useful already for string interop.
* Fix validation error message for table.fill (#5953)Thomas Lively2023-09-181-4/+3
| | | table.fill requires bulk memory to be enabled, not reference types.
* Implement table.fill (#5949)Thomas Lively2023-09-181-0/+19
| | | | | | | | This instruction was standardized as part of the bulk memory proposal, but we never implemented it until now. Leave similar instructions like table.copy as future work. Fixes #5939.
* Replace i31.new with ref.i31 everywhere (#5931)Thomas Lively2023-09-131-2/+2
| | | | | Replace i31.new with ref.i31 in the printer, tests, and source code. Continue parsing i31.new for the time being to allow a graceful transition. Also update the JS API to reflect the new instruction name.
* Replace I31New with RefI31 everywhere (#5930)Thomas Lively2023-09-131-2/+2
| | | | | | | | Globally replace the source string "I31New" with "RefI31" in preparation for renaming the instruction from "i31.new" to "ref.i31", as implemented in the spec in https://github.com/WebAssembly/gc/pull/422. This would be NFC, except that it also changes the string in the external-facing C APIs. A follow-up PR will make the corresponding behavioral change.
* Remove the GCNNLocals feature (#5080)Thomas Lively2023-08-311-39/+7
| | | | | Now that the WasmGC spec has settled on a way of validating non-nullable locals, we no longer need this experimental feature that allowed nonstandard uses of non-nullable locals.
* Validate and fix up tuples with non-nullable elements (#5909)Thomas Lively2023-08-301-3/+6
| | | | | | The code validating and fixing up non-nullable locals previously did not correctly handle tuples that contained non-nullable elements, which could have resulted in invalid modules going undetected. Update the code to handle tuples and add tests.
* Rename multimemory flag (#5890)Ashley Nelson2023-08-211-2/+2
| | | Renaming the multimemory flag in Binaryen to match its naming in LLVM.
* Ensure br_on_cast* target type is subtype of input type (#5881)Thomas Lively2023-08-171-0/+5
| | | | | | | | | | | | | | | | The WasmGC spec will require that the target cast type of br_on_cast and br_on_cast_fail be a subtype of the input type, but so far Binaryen has not enforced this constraint, so it could produce invalid modules when optimizations refined the input to a br_on_cast* such that it was no longer a supertype of the cast target type. Fix this problem by setting the cast target type to be the greatest lower bound of the original cast target type and the current input type in `BrOn::finalize()`. This maintains the invariant that the cast target type should be a subtype of the input type and it also does not change cast behavior; any value that could make the original cast succeed at runtime necessarily inhabits both the original cast target type and the input type, so it also must inhabit their greatest lower bound and will make the updated cast succeed as well.
* [NFC] Simplify `Tuple` by making it an alias of `TypeList` (#5775)Thomas Lively2023-06-201-1/+1
| | | | | | | Rather than wrap a `TypeList`, make `Tuple` an alias of `TypeList`. This means removing `Tuple::toString`, but that had no callers and was of limited use for debugging anyway. In return, the use of tuples becomes much less verbose. In the future, it may make sense to remove one of `Tuple` and `TypeList`.
* Validate tag param types (#5759)Alon Zakai2023-06-081-0/+5
| | | | | We already validated function params, but were missing tags. Without this the fuzzer can get confused if a type is only used in a tag.
* Strings: Add initial validation checks (#5758)Alon Zakai2023-06-081-0/+91
| | | | | | | | | This is far from comprehensive, but it checks strings being enabled for all the instructions. Without this, the fuzzer can get confused because it checks if code validates and then proceeds under that assumption, so any missing validation checks can cause problems (specifically, if we have a string.const without strings enabled then we error during writing of the string, since we don't do the initial pass to find all strings to deduplicate them).
* [NFC] Refactor each of ArrayNewSeg and ArrayInit into subclasses for ↵Alon Zakai2023-05-041-55/+86
| | | | | | | | | | | Data/Elem (#5692) ArrayNewSeg => ArrayNewSegData, ArrayNewSegElem ArrayInit => ArrayInitData, ArrayInitElem Basically we remove the opcode and use the class type to differentiate them. This adds some code but it makes the representation simpler and more compact in memory, and it will help with #5690
* Implement array.fill, array.init_data, and array.init_elem (#5637)Thomas Lively2023-04-061-12/+110
| | | | | These complement array.copy, which we already supported, as an initial complete set of bulk array operations. Replace the WIP spec tests with the upstream spec tests, lightly edited for compatibility with Binaryen.
* Support multiple memories in RemoveUnusedModuleElements (#5604)Thomas Lively2023-04-041-6/+1
| | | | | | | | Add support for memory and data segment module elements and treat them uniformly with other module elements rather than as special cases. There is a cyclic dependency between memories (or tables) and their active segments because exported or accessed memories (or tables) keep their active segments alive, but active segments for imported memories (or tables) keep their memories (or tables) alive as well.
* Use Names instead of indices to identify segments (#5618)Thomas Lively2023-04-041-9/+9
| | | | | | | | | | All top-level Module elements are identified and referred to by Name, but for historical reasons element and data segments were referred to by index instead. Fix this inconsistency by using Names to refer to segments from expressions that use them. Also parse and print segment names like we do for other elements. The C API is partially converted to use names instead of indices, but there are still many functions that refer to data segments by index. Finishing the conversion can be done in the future once it becomes necessary.
* [NFC] Remove our bespoke `make_unique` implementation (#5613)Thomas Lively2023-03-311-1/+1
| | | | This code predates our adoption of C++14 and can now be removed in favor of `std::make_unique`, which should be more efficient.
* Do not treat `atomic.fence` as using a memory (#5603)Thomas Lively2023-03-291-2/+0
| | | | | | | | | * Do not treat `atomic.fence` as using a memory Update RemoveUnusedModuleElements so that it no longer keeps the memory alive due to an `atomic.fence` instruction and update validation to allow modules to use `atomic.fence` without a memory. * update wasm2js tests
* Add bulk-array.wast spec test outline (#5568)Thomas Lively2023-03-161-18/+11
| | | | | | | | | Add spec/bulk-array.wast, which contains an outline of the tests that will be necessary for the upcoming bulk array instructions: array.copy (already implemented), array.fill, array.init_data, and array.init_elem. Although the test file does not actually contain any tests yet, it contains some setup code defining types, globals, and element segments that the tests will use. Fix miscellaneous bugs in parsing, validation, and printing to allow this setup code to run without issues.
* Make constant expression validation stricter (#5557)Thomas Lively2023-03-101-36/+29
| | | | | | | | | | Previously we treated global.get as a constant expression and only additionally verified that the target globals were immutable in some cases. But global.get of a mutable global is never a constant expression, and further, only imported globals are available in constant expressions unless GC is enabled. Fix constant expression validation to only allow global.get of immutable, imported globals, and fix all the invalid tests.
* [NFC] Internally rename `ArrayInit` to `ArrayNewFixed` (#5526)Thomas Lively2023-02-281-2/+2
| | | | | | | | To match the standard instruction name, rename the expression class without changing any parsing or printing behavior. A follow-on PR will take care of the functional side of this change while keeping support for parsing the old name. This change will allow `ArrayInit` to be used as the expression class for the upcoming `array.init_data` and `array.init_elem` instructions.
* Fix validation of DataDrop (#5517)Alon Zakai2023-02-231-3/+6
| | | Fixes #5511
* Replace `RefIs` with `RefIsNull` (#5401)Thomas Lively2023-01-091-6/+7
| | | | | | | | | | | | | | | * Replace `RefIs` with `RefIsNull` The other `ref.is*` instructions are deprecated and expressible in terms of `ref.test`. Update binary and text parsing to parse those instructions as `RefTest` expressions. Also update the printing and emitting of `RefTest` expressions to emit the legacy instructions for now to minimize test changes and make this a mostly non-functional change. Since `ref.is_null` is the only `RefIs` instruction left, remove the `RefIsOp` field and rename the expression class to `RefIsNull`. The few test changes are due to the fact that `ref.is*` instructions are now subject to `ref.test` validation, and in particular it is no longer valid to perform a `ref.is_func` on a value outside of the `func` type hierarchy.
* Support br_on_cast null (#5397)Thomas Lively2023-01-051-3/+8
| | | | | | | | | As well as br_on_cast_fail null. Unlike the existing br_on_cast* instructions, these new instructions treat the cast as succeeding when the input is a null. Update the internal representation of the cast type in `BrOn` expressions to be a `Type` rather than a `HeapType` so it will include nullability information. Also update and improve `RemoveUnusedBrs` to handle the new instructions correctly and optimize in more cases.
* Allow non-nullable ref.cast of nullable references (#5386)Thomas Lively2023-01-041-5/+5
| | | | | | | This new cast configuration was not expressible with the legacy cast instructions. Although it is valid in Wasm, do not allow nullable casts of non-nullable references, since those would unnecessarily lose type information. Convert such casts to be non-nullable during expression finalization.
* Support `ref.test null` (#5368)Thomas Lively2022-12-211-1/+1
| | | This new variant of ref.test returns 1 if the input is null.
* Update RefCast representation to drop extra HeapType (#5350)Thomas Lively2022-12-201-1/+7
| | | | | | | | | The latest upstream version of ref.cast is parameterized with a target reference type, not just a heap type, because the nullability of the result is parameterizable. As a first step toward implementing these new, more flexible ref.cast instructions, change the internal representation of ref.cast to use the expression type as the cast target rather than storing a separate heap type field. For now require that the encoded semantics match the previously allowed semantics, though, so that none of the optimization passes need to be updated.
* Do not optimize public types (#5347)Thomas Lively2022-12-161-0/+39
| | | | | | | | | | | | | | | | | Do not optimize or modify public heap types in any way. Public heap types include the types of imported or exported functions, tables, globals, etc. This is important to maintain the public interface of a module and ensure it can still link interact as intended with the outside world. Also add validation error if we find any nontrivial public types that are not the types of imported or exported functions. This error is meant to help the user ensure that type optimizations are not silently inhibited. In the future, we may want to add options to silence this error or downgrade it to a warning. This commit only updates the type updating machinery to avoid updating public types. It does not update any optimization passes accordingly. Since we avoid modifying public signature types already, this is not expected to break anything, but in the future once we have function subtyping or if we make the error optional, we may have to update some of our optimization passes.
* Allow casting to basic heap types (#5332)Thomas Lively2022-12-081-28/+33
| | | | | | | The standard casting instructions now allow casting to basic heap types, not just user-defined types, but they also require that the intended type and argument type have a common supertype. Update the validator to use the standard rules, update the binary parser and printer to allow basic types, and update the tests to remove or modify newly invalid test cases.
* Fix validation and inlining bugs (#5301)Thomas Lively2022-11-291-2/+5
| | | | | | | | | | | | | | Inlining had a bug where it gave return_calls in inlined callees concrete types even when they should have remained unreachable. This bug flew under the radar because validation had a bug where it allowed expressions to have concrete types when they should have been unreachable. The fuzzer found this bug by adding another pass after inlining where the unexpected types caused an assertion failure. Fix the bugs and add a test that would have triggered the inlining bug. Unfortunately the test would have also passed before this change due to the validation bug, but it's better than nothing. Fixes #5294.
* Validator: Print the field number on subtyping errors (#5297)Alon Zakai2022-11-291-4/+6
|
* [NFC] Expand comment about validating function type features (#5286)Thomas Lively2022-11-221-1/+3
| | | This addresses feedback missed in #5279.
* Validate that GC is enabled for rec groups and supertypes (#5279)Thomas Lively2022-11-221-0/+3
| | | | | | | | | Update `HeapType::getFeatures` to report that GC is used for heap types that have nontrivial recursion groups or supertypes. Update validation to check the features on function heap types, not just their individual params and results. This fixes a fuzz bug in #5239 where initial contents included a rec group but the fuzzer disabled GC. Since the resulting module passed validation, the rec groups made it into the binary output, making the type section malformed.
* Implement `array.new_data` and `array.new_elem` (#5214)Thomas Lively2022-11-071-0/+68
| | | | | | | | | In order to test them, fix the binary and text parsers to accept passive data segments even if a module has no memory. In addition to parsing and emitting the new instructions, also implement their validation and interpretation. Test the interpretation directly with wasm-shell tests adapted from the upstream spec tests. Running the upstream spec tests directly would require fixing too many bugs in the legacy text parser, so it will have to wait for the new text parser to be ready.
* Fix binary parsing of data segment memory (#5208)Thomas Lively2022-11-031-3/+5
| | | | | | | | | | | | The binary parser was eagerly getting the name of memories to set the `memory` field of data segments, but that meant that when the memory names were updated later while parsing the names section, the data segment memory fields would become out of date. Update the issue by deferring setting the `memory` fields like we do for other parts of IR that reference memories. Also fix a segfault in the validator that was triggered by the reproducer for this bug before the bug was fixed. Fixes #5204.
* [NFC] Mention relevant flags in validator errors (#5203)Alon Zakai2022-11-011-93/+116
| | | | | | | | | | E.g. Atomic operation (atomics are disabled) => Atomic operations require threads [--enable-threads]
* Remove excessive validation that is not in the wasm spec (#5167)Alon Zakai2022-10-201-28/+1
| | | | | | | | Specifically if a segment offset was a const, we checked that it made sense. But the wasm spec doesn't do that, and it actually causes some issues (#5163). In theory this extra validation might be useful - compile-time error rather than runtime - but if we want this it should probably be an optional thing, like an opt-in flag or a --lint pass or such.