path: root/test/lit/binary
Commit message | Author | Age | Files | Lines
* Support control flow inputs in IRBuilder (#7149) | Thomas Lively | 2024-12-13 | 4 | -38/+0
  Since multivalue was standardized, WebAssembly has supported not only multiple results but also an arbitrary number of inputs on control flow structures, but until now Binaryen did not support control flow input. Binaryen IR still has no way to represent control flow input, so lower it away using scratch locals in IRBuilder. Since both the text and binary parsers use IRBuilder, this gives us full support for parsing control flow inputs.
  The lowering scheme is mostly simple. A local.set writing the control flow inputs to a scratch local is inserted immediately before the control flow structure begins and a local.get retrieving those inputs is inserted inside the control flow structure before the rest of its body. The only complications come from ifs, in which the inputs must be retrieved at the beginning of both arms, and from loops, where branches to the beginning of the loop must be transformed so their values are written to the scratch local along the way.
  Resolves #6407.
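The lowering described above can be sketched on a toy instruction list. This is only an illustration under simplifying assumptions: instructions are plain tuples, there is a single input value, and only unconditional top-level branches in a loop body are rewritten. The function names `lower_block_input` and `lower_loop_input` are hypothetical and are not part of the IRBuilder API.

```python
# Toy model (hypothetical, not Binaryen's IRBuilder): an instruction is a
# tuple, and a control flow structure carries its body as a nested list.

def lower_block_input(body, scratch):
    """Lower a block that takes one input into an input-free block."""
    return [
        # Written immediately before the block begins.
        ("local.set", scratch),
        # Retrieved inside the block, before the rest of its body.
        ("block", [("local.get", scratch)] + body),
    ]

def lower_loop_input(label, body, scratch):
    """Lower a loop that takes one input.

    Assumption: only unconditional, top-level branches to the loop label
    are handled in this sketch.
    """
    new_body = [("local.get", scratch)]
    for instr in body:
        if instr == ("br", label):
            # The branch used to carry the next input on the stack; write it
            # to the scratch local on the way back to the loop start.
            new_body.append(("local.set", scratch))
        new_body.append(instr)
    return [("local.set", scratch), ("loop", label, new_body)]
```

The `if` case is omitted here because the condition sits above the input on the stack, which needs extra handling beyond this sketch.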
* Fixup block-nested pops even when EH is not enabled (#7130) | Thomas Lively | 2024-12-03 | 2 | -2/+2
  While parsing a binary file, there may be pops that need to be fixed up even if EH is not (yet) enabled because the target features section has not been parsed yet. Previously `EHUtils::handleBlockNestedPops` did not do anything if EH was not enabled, so the binary parser would fail to fix up pops in that case. Add an optional parameter to override this behavior so the parser can fix up pops unconditionally.
  Fixes #7127.
* Use IRBuilder in the binary parser (#6963) | Thomas Lively | 2024-11-26 | 7 | -146/+169
  IRBuilder is a utility for turning arbitrary valid streams of Wasm instructions into valid Binaryen IR. It is already used in the text parser, so now use it in the binary parser as well. Since the IRBuilder API for building each instruction requires only the information that the binary and text formats include as immediates to that instruction, the parser is now much simpler than before. In particular, it does not need to manage a stack of instructions to figure out what the children of each expression should be; IRBuilder handles this instead.
  There are some differences between the IR constructed by IRBuilder and the IR the binary parser constructed before this change. Most importantly, IRBuilder generates better multivalue code because it avoids eagerly breaking up multivalue results into individual components that might need to be immediately reassembled into a tuple. It also parses try-delegate more correctly, allowing the delegate to target arbitrary labels, not just other `try`s. There are also a couple of superficial differences in the generated label and scratch local names.
  As part of this change, add support for recording binary source locations in IRBuilder.
* Read the names section first (#7074) | Thomas Lively | 2024-11-13 | 1 | -4/+5
  Rather than back-patching names when we get to the names section in the binary reader, skip ahead to read the names section before anything else so we can use the final names right away. This is a prerequisite for using IRBuilder in the binary reader.
  The only functional change is that we now allow empty local names. Empty names are perfectly valid.
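As a rough illustration of "skipping ahead", the sketch below scans the section headers of a Wasm binary and returns the byte range of the `name` custom section without decoding anything else. It relies only on the standard module layout (8-byte preamble, then sections of the form id, LEB128 size, payload). The helper names `read_u32_leb` and `find_name_section` are hypothetical and do not correspond to Binaryen functions.

```python
def read_u32_leb(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7

def find_name_section(data):
    """Return (start, end) of the "name" section contents, or None."""
    if data[:4] != b"\x00asm":
        raise ValueError("not a wasm binary")
    pos = 8  # skip the 4-byte magic and 4-byte version
    while pos < len(data):
        section_id = data[pos]
        size, pos = read_u32_leb(data, pos + 1)
        end = pos + size
        if section_id == 0:  # custom section: payload starts with its name
            name_len, name_pos = read_u32_leb(data, pos)
            if data[name_pos:name_pos + name_len] == b"name":
                return name_pos + name_len, end
        pos = end  # jump over the payload without decoding it
    return None
```

A reader built this way could decode the returned range first and then restart from the beginning of the module with the final names already known.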
* [NFC] Eagerly create segments when parsing datacount (#6958) | Thomas Lively | 2024-09-19 | 2 | -0/+9
  The purpose of the datacount section is to pre-declare how many data segments there will be so that engines can allocate space for them and not have to back-patch subsequent instructions in the code section that refer to them. Once we use IRBuilder in the binary parser, we will have to have the data segments available by the time we parse instructions that use them, so eagerly construct the data segments when parsing the datacount section.
* Save build ID in a source map (#6799) | Marcin Kolny | 2024-08-15 | 2 | -0/+7
  This is based on these two proposals:
    * https://github.com/WebAssembly/tool-conventions/blob/main/BuildId.md
    * https://github.com/tc39/source-map/blob/main/proposals/debug-id.md
* Heap type `none` requires GC (#6840) | Thomas Lively | 2024-08-14 | 1 | -3/+0
  Since reference types only introduced function and extern references, all of the types in the `any` hierarchy require GC, including `none`.
  Fixes #6839.
* Use Names::getValidNameGivenExisting in binary reading (#6793) | Alon Zakai | 2024-07-31 | 5 | -2/+45
  We had a TODO to use it once Names was optimized, which it has been.
  The Names version is also far faster. When building https://github.com/JetBrains/kotlinconf-app it saves 70 seconds(!).
* Error more clearly on wasm components (#6751) | Alon Zakai | 2024-07-17 | 2 | -0/+6
  Component binary format: https://github.com/WebAssembly/component-model/blob/main/design/mvp/Binary.md#component-definitions
  Context: https://github.com/WebAssembly/binaryen/issues/6728#issuecomment-2231288924
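One way a reader could tell a component from a core module is sketched below. It assumes the preamble layout described in the linked Binary.md, where the four bytes after the magic are split into a 16-bit version and a 16-bit layer, with a nonzero layer marking a component. The function `classify_wasm` is hypothetical, and this is not necessarily how Binaryen performs the check.

```python
import struct

def classify_wasm(data):
    """Classify a binary by its 8-byte preamble (sketch, see assumptions)."""
    if len(data) < 8 or data[:4] != b"\x00asm":
        return "not wasm"
    # Assumption: little-endian u16 version followed by little-endian u16 layer.
    version, layer = struct.unpack_from("<HH", data, 4)
    if layer == 0 and version == 1:
        return "core module"
    if layer == 1:
        return "component"
    return "unknown"
```

A tool that only handles core modules could use such a check to report "this is a component" instead of failing later with a confusing version error.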
* Fix DataSegment name handling (#6673) | Alon Zakai | 2024-06-17 | 2 | -0/+25
  The code used i instead of index, as in this pseudocode:
      for i in range(num_names):
          index = readU32LEB()  # index of the data segment to name
          name = readName()     # name to give that segment
          data[i] = name        # XXX 'i' should be 'index'
  That (funnily enough) happened to always work before since we write names in order. That is, normally given segments A,B,C we'd write them in the names section as A,B,C. Then the reader, which had the bug, would always have i and index identical in value anyhow. But if a wasm producer used different indexes, a problem could happen.
  To test this, add a binary file that has a reversed name section.
  Fixes #6672
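The pseudocode above can be turned into a small runnable comparison. In this sketch the names section is modeled as a list of `(index, name)` pairs rather than bytes, and both function names are hypothetical.

```python
def apply_names_buggy(entries, num_segments):
    """Mimics the bug: uses the loop counter instead of the decoded index."""
    names = [None] * num_segments
    for i, (index, name) in enumerate(entries):
        names[i] = name  # wrong: ignores 'index'
    return names

def apply_names_fixed(entries, num_segments):
    """Uses the index that was read from the names section."""
    names = [None] * num_segments
    for index, name in entries:
        names[index] = name
    return names

in_order = [(0, "A"), (1, "B"), (2, "C")]
reversed_order = [(2, "C"), (1, "B"), (0, "A")]
```

With `in_order` both versions agree, which is why the bug went unnoticed; with `reversed_order` only the fixed version assigns each name to the segment it was written for.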
* Fix binary parser of declarative element segments (#6618) | Rikito Taniguchi | 2024-06-03 | 2 | -0/+28
  The parser was incorrectly handling declarative element segments whose `init` is a `vec(expr)`.
  https://webassembly.github.io/spec/core/binary/modules.html#element-section
  The binary parser was simply reading a single `u32LEB` value for `init` instead of parsing an expression, even when `usesExpressions = true`.
  This commit updates the `WasmBinaryReader::readElementSegments` function to correctly parse the expressions for declarative element segments by calling `readExpression` instead of `getU32LEB` when `usesExpressions = true`.
  Resolves the parsing exception: "[parse exception: bad section size, started at ... not being equal to new position ...]"
  Related discussion: https://github.com/tanishiking/scala-wasm/issues/136
* [EH] Rename old EH tests from -old to -legacy (#6627) | Heejin Ahn | 2024-05-28 | 2 | -0/+0
  This renames old EH tests in the form of `-eh-old.wast` to `-eh-legacy.wast`, to be clearer in names.
* Simplify scratch local calculation (#6583) | Thomas Lively | 2024-05-13 | 1 | -7/+4
  Change `countScratchLocals` to return the count and type of necessary scratch locals. It used to encode them as keys in the global map from scratch local types to local indices, which could not handle having more than one scratch local of a given type and was generally harder to reason about due to its use of global state.
  Take the opportunity to avoid emitting unnecessary scratch locals for `TupleExtract` expressions that will be optimized to not use them.
  Also simplify and better document the calculation of the mapping from IR indices to binary indices for all locals, scratch and non-scratch.
* Allow DWARF and multivalue together (#6570) | Heejin Ahn | 2024-05-06 | 2 | -0/+106
  This allows writing of binaries with DWARF info when multivalue is enabled. Currently we just crash when both are enabled together. This just assumes that, unless we have run DWARF-invalidating passes, all locals added for tuples or scratch locals would have been added at the end of the local list, so just printing all locals in order would preserve the DWARF info. Tuple locals are expanded in place and scratch locals are added at the end.
* [EH] Rename -eh lit test names to -eh-old (#6227) | Heejin Ahn | 2024-01-22 | 2 | -1/+1
  This renames all existing EH lit tests with filenames `*eh*` to `*eh-old*`. This is prep work so that we can add tests for the new EH spec using `*eh*`.
  The reason I'm trying to split old and new EH test files is that we don't support fuzzing for the new EH yet, and I wouldn't want to exclude old EH tests from fuzzing too because of that.
* Error on multivalue inputs that we do not handle (#5962) | Alon Zakai | 2023-09-20 | 4 | -0/+38
  Previously, getType() silently dropped the params of a signature type. Now we verify that it is none, or we error.
  Helps #5950
* Fix stacky-nn-tuple.test.wasm (#5934) | Thomas Lively | 2023-09-13 | 1 | -0/+0
  This file was updated when we switched to the standard GC opcodes, but the PR updating it did not include a fix to the encoding for nullable and non-nullable references, so this test was incorrect. Fix it.
* Use standard GC encodings by default (#5873) | Thomas Lively | 2023-09-12 | 1 | -0/+0
  The legacy encodings remain available for now by defining USE_LEGACY_GC_ENCODINGS at build time.
* Remove the GCNNLocals feature (#5080) | Thomas Lively | 2023-08-31 | 1 | -2/+2
  Now that the WasmGC spec has settled on a way of validating non-nullable locals, we no longer need this experimental feature that allowed nonstandard uses of non-nullable locals.
* Simplify and consolidate type printing (#5816) | Thomas Lively | 2023-08-24 | 1 | -2/+1
  When printing Binaryen IR, we previously generated names for unnamed heap types based on their structure. This was useful for seeing the structure of simple types at a glance without having to separately go look up their definitions, but it also had two problems:
    1. The same name could be generated for multiple types. The generated names did not take into account rec group structure or finality, so types that differed only in these properties would have the same name. Also, generated type names were limited in length, so very large types that shared only some structure could also end up with the same names. Using the same name for multiple types produces incorrect and unparsable output.
    2. The generated names were not useful beyond the most trivial examples. Even with length limits, names for nontrivial types were extremely long and visually noisy, which made reading disassembled real-world code more challenging.
  Fix these problems by emitting simple indexed names for unnamed heap types instead. This regresses readability for very simple examples, but the tradeoff is worth it.
  This change also reduces the number of type printing systems we have by one. Previously we had the system in Print.cpp, but we had another, more general and extensible system in wasm-type-printing.h and wasm-type.cpp as well. Remove the old type printing system from Print.cpp and replace it with a much smaller use of the new system. This requires significant refactoring of Print.cpp so that the PrintExpressionContents object now holds a reference to a parent PrintSExpression object that holds the type name state.
  This diff is very large because almost every test output changed slightly. To minimize the diff and ease review, change the type printer in wasm-type.cpp to behave the same as the old type printer in Print.cpp except for the differences in name generation. These changes will be reverted in much smaller PRs in the future to generally improve how types are printed.
* Remove legacy WasmGC instructions (#5861) | Thomas Lively | 2023-08-09 | 4 | -23/+0
  Remove old, experimental instructions and type encodings that will not be shipped as part of WasmGC. Updating the encodings and text format to match the final spec is left as future work.
* Fix binary writing of strings without GC enabled (#5836) | Alon Zakai | 2023-07-31 | 1 | -0/+12
* Update br_on_cast binary and text format (#5762) | Thomas Lively | 2023-06-12 | 2 | -41/+0
  The final versions of the br_on_cast and br_on_cast_fail instructions have two reference type annotations: one for the input type and one for the cast target type. In the binary format, this is represented as a flags byte followed by two encoded heap types. Upgrade all of the tests at once to use the new versions of the instructions and drop support for the old instructions from the text parser. Keep support in the binary parser to avoid breaking users, though. Drop some binary tests of deprecated instruction encodings that would be more effort to update than they're worth.
  Re-land with fixes of #5734
* Revert "Update br_on_cast binary and text format (#5734)" (#5740) | Alon Zakai | 2023-05-23 | 2 | -0/+41
  This reverts commit b7b1d0df29df14634d2c680d1d2c351b624b4fbb.
  See comment at the end of #5734: It turns out that dropping the old opcodes causes problems for current users, so let's revert this for now, and later we can figure out how best to do the update.
* Update br_on_cast binary and text format (#5734) | Thomas Lively | 2023-05-19 | 2 | -41/+0
  The final versions of the br_on_cast and br_on_cast_fail instructions have two reference type annotations: one for the input type and one for the cast target type. In the binary format, this is represented as a flags byte followed by two encoded heap types. Since these instructions have been in flux for a while, do not attempt to maintain backward compatibility with older versions of the instructions. Instead, upgrade all of the tests at once to use the new versions of the instructions.
  Drop some binary tests of deprecated instruction encodings that would be more effort to update than they're worth.
* [Wasm GC] Automatically make RefCast heap types more precise (#5704) | Alon Zakai | 2023-05-05 | 1 | -2/+2
  We already did this for nullability, and so for the same reasons we should do it for heap types as well. Also, I realized that doing so would solve #5703, which is the new test added for TypeRefining here.
  The fuzz bug solved here is that our analysis of struct gets/sets will skip copy operations - a read from a field that is written into it. And we skip fallthrough values while doing so, since it doesn't matter if the read goes through an if arm or a cast. An if would automatically get a more precise type during refinalize, so this PR does the same for a cast basically.
  Fixes #5703
* Fix name deduplication with partial names sections (#5689) | Alon Zakai | 2023-04-28 | 2 | -0/+22
  We already deduplicated names in the names section (to defend against a weird binary), but we also need to deduplicate the names of items not in the names section, so they don't overlap with the names that are. See example in the testcase.
  Normally wasm files use names for all items in each group. This only became noticeable in some wasm-ctor-eval work where new temp globals were added that were not given names.
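A minimal sketch of the two-step deduplication follows. The default-name prefix `global$` and the `_N` suffix scheme are assumptions chosen for illustration, and `assign_names` and `unique` are hypothetical helpers; Binaryen's actual naming scheme may differ.

```python
def unique(name, used):
    """Return `name`, or `name` with a numeric suffix, so it is not in `used`."""
    candidate = name
    suffix = 0
    while candidate in used:
        candidate = f"{name}_{suffix}"
        suffix += 1
    used.add(candidate)
    return candidate

def assign_names(count, named, prefix="global$"):
    """Name `count` items given a partial map of index -> name."""
    used = set()
    result = [None] * count
    # Names from the names section come first and are deduplicated
    # among themselves.
    for index in sorted(named):
        result[index] = unique(named[index], used)
    # Items missing from the names section get default names, which must
    # also avoid every name already taken.
    for index in range(count):
        if result[index] is None:
            result[index] = unique(f"{prefix}{index}", used)
    return result
```

For example, if item 1 is explicitly named `global$0` and item 0 has no name, the default name for item 0 would collide unless the second pass checks the used set.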
* Remove the --hybrid and --nominal command line options (#5669) | Thomas Lively | 2023-04-14 | 2 | -27/+0
  After this change, the only type system usable from the tools will be the standard isorecursive type system. The nominal type system is still usable via the API, but it will be removed entirely in a follow-on PR.
* Convert some tests off of --nominal (#5660) | Thomas Lively | 2023-04-13 | 1 | -1/+1
  In preparation to remove the nominal type system, which is nonstandard and not usable for modules with nontrivial external linkage requirements, port an initial batch of tests to use the standard isorecursive type system. The port involves reordering input types to ensure that supertypes precede their subtypes and inserting rec groups to ensure that structurally identical types maintain their separate identities.
  More tests will be ported in future PRs before the nominal type system is removed entirely.
* [Exceptions] Fix error on bad delegate index (#5587) | Alon Zakai | 2023-03-17 | 2 | -0/+17
  Fixes #5584
* Represent ref.as_{func,data,i31} with RefCast (#5413) | Thomas Lively | 2023-01-10 | 1 | -3/+2
  These operations are deprecated and directly representable as casts, so remove their opcodes in the internal IR and parse them as casts instead. For now, add logic to the printing and binary writing of RefCast to continue emitting the legacy instructions to minimize test changes. The few test changes necessary are because it is no longer valid to perform a ref.as_func on values outside the func type hierarchy now that ref.as_func is subject to the ref.cast validation rules.
  RefAsExternInternalize, RefAsExternExternalize, and RefAsNonNull are left unmodified. A future PR may remove RefAsNonNull as well, since it is also expressible with casts.
* Support `ref.test null` (#5368) | Thomas Lively | 2022-12-21 | 1 | -0/+0
  This new variant of ref.test returns 1 if the input is null.
* In --debug mode, print partial wasm data that was read (#5356) | Alon Zakai | 2022-12-15 | 3 | -63/+105
  If wasm-opt or wasm-dis are given an invalid binary, after the error message we can also print out the wasm we did manage to read. That includes global items like imports and also all the functions read up to the point of the error. This can help debugging in some situations. Only do this when --debug is passed, as it can be very verbose and in general users might not want it.
  This is technically easy to do, it turns out, since we already use a thrown exception on an error in parsing, and we fill up the wasm as we go, so it just contains what we've read so far, and we can just print it.
  Fixes #5344
  Also switch an existing test's comments to ;; from # which was noticed here.
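The mechanism described above (fill the module as parsing proceeds, throw on error, print what exists so far) can be sketched in a few lines. Everything here is a toy: the module is a plain list, a `None` entry stands in for a malformed function body, and `ParseError`, `read_functions`, and `read_module` are hypothetical names.

```python
class ParseError(Exception):
    """Raised when the toy reader hits invalid input."""

def read_functions(raw_functions, module):
    # Append each function as soon as it is parsed, so `module` always
    # holds everything that was read successfully so far.
    for i, raw in enumerate(raw_functions):
        if raw is None:  # stand-in for a malformed function body
            raise ParseError(f"bad function body at index {i}")
        module.append(raw)

def read_module(raw_functions, debug=False):
    """Return (module, report lines); the module may be partial on error."""
    module = []
    report = []
    try:
        read_functions(raw_functions, module)
    except ParseError as err:
        report.append(f"[parse exception: {err}]")
        if debug:
            # Only shown on request, since the dump can be very verbose.
            report.append(f"partial module: {module}")
    return module, report
```

Because the partially filled module outlives the exception, no extra bookkeeping is needed to recover what was read before the failure.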
* Allow casting to basic heap types (#5332) | Thomas Lively | 2022-12-08 | 1 | -0/+1
  The standard casting instructions now allow casting to basic heap types, not just user-defined types, but they also require that the intended type and argument type have a common supertype. Update the validator to use the standard rules, update the binary parser and printer to allow basic types, and update the tests to remove or modify newly invalid test cases.
* Add standard versions of WasmGC casts (#5331) | Thomas Lively | 2022-12-07 | 2 | -0/+40
  We previously supported only the non-standard cast instructions introduced when we were experimenting with nominal types. Parse the names and opcodes of their standard counterparts and switch to emitting the standard names and opcodes. Port all of the tests to use the standard instructions, but add additional tests showing that the non-standard versions are still parsed correctly.
* Change the default type system to isorecursive (#5239) | Thomas Lively | 2022-11-23 | 2 | -10/+10
  This makes Binaryen's default type system match the WasmGC spec.
  Update the way type definitions without supertypes are printed to reduce the output diff for MVP tests that do not involve WasmGC. Also port some type-builder.cpp tests from test/example to test/gtest since they needed to be rewritten to work with isorecursive types anyway.
  A follow-on PR will remove equirecursive types completely.
* Parse and emit `array.len` without a type annotation (#5151) | Thomas Lively | 2022-10-18 | 2 | -0/+18
  Test that we can still parse the old annotated form as well.
* Fix binary parsing of the prototype nominal format (#4679) | Thomas Lively | 2022-05-19 | 1 | -13/+13
  We were checking that nominal modules only had a single element in their type sections, but that's not correct for the prototype nominal binary format we still want to support. The test for this missed catching the bug because it wasn't actually parsing in nominal mode.
* Parse the prototype nominal binary format (#4644) | Thomas Lively | 2022-05-04 | 2 | -0/+27
  In f124a11ca3 we removed support for the prototype nominal binary format entirely, but that means that we can no longer parse older binary modules that used that format. Fix this regression by restoring the ability to parse the prototype binary format.
* [Wasm GC] Fix stacky non-nullable tuples (#4561) | Alon Zakai | 2022-03-31 | 2 | -0/+112
  #4555 fixed validation for such tuples, but we also did not handle them in "stacky" code using pops etc., due to a logic bug in the binary reading code.
* Warn about and ignore empty local/param names in name section (#4426) | Alon Zakai | 2022-01-07 | 2 | -0/+14
  Fixes the crash in #4418.
  Also replace the .at() there with better logic to handle imported functions.
  See WebAssembly/wabt#1799 for details on why wabt sometimes emits this.
* [GC] Move heap-types.wast out of lit/test/binary/ (#4424) | Heejin Ahn | 2022-01-04 | 1 | -150/+0
  Apparently it is not a binary test?
* [EH] Fixup nested pops after reading stacky binary (#4420) | Heejin Ahn | 2022-01-04 | 2 | -0/+66
  When reading stacky code in the binary reader, we create `block`s to make it fit into Binaryen AST, within which `pop`s can be nested, making the resulting AST invalid. This PR runs the fixup function after reading each `Try` to fix this.
* Add binary format parse checking for ref.as input type (#4389) | Alon Zakai | 2021-12-16 | 2 | -0/+6
  If that type is not valid then we cannot even create and finalize the node, which means we'd hit an assertion inside finalize(), before we reach the validator.
  Fixes #4383
* Print heap types in text format in nominal mode (#4316) | Alon Zakai | 2021-11-08 | 1 | -5/+5
  Without this, roundtripping may not work in nominal mode, as we might not assign the expected heap types in the right places. Specifically, when the signature matches but the nominal types are distinct, we need to keep them that way (and the sugar in the text format parsing will merge them).
* Switch from "extends" to M4 nominal syntax (#4248) | Thomas Lively | 2021-10-14 | 1 | -7/+7
  Change all test inputs from using the old (extends $super) syntax to using the new *_subtype syntax for their inputs and also update the printer to emit the new syntax. Add a new test explicitly testing the old notation to make sure it keeps working until we remove support for it.
* [Wasm GC] Implement static (rtt-free) StructNew, ArrayNew, ArrayInit (#4172) | Alon Zakai | 2021-09-23 | 1 | -0/+150
  See #4149.
  This modifies the test added in #4163 which used static casts on dynamically-created structs and arrays. That was technically not valid (as we won't want users to "mix" the two forms). This makes that test 100% static, which both fixes the test and gives test coverage to the new instructions added here.
* Support new dylink.0 custom section format (#4141) | Sam Clegg | 2021-09-11 | 2 | -3/+3
  See also:
    spec change: https://github.com/WebAssembly/tool-conventions/pull/170
    llvm change: https://reviews.llvm.org/D109595
    wabt change: https://github.com/WebAssembly/wabt/pull/1707
    emscripten change: https://github.com/emscripten-core/emscripten/pull/15019
* Handle extra info in dylink section (#4112) | Sam Clegg | 2021-08-31 | 2 | -0/+15
  If extra data is found in this section, simply propagate it.
  Also, remove some dead code from wasm-binary.cpp.