| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
The binary reader has special handling for blocks immediately nested
inside other blocks to eliminate recursion while parsing very deep
stacks of blocks. This special handling did not record binary locations
for the nested blocks, though.
Add logic to record binary locations for nested blocks. This binary
reading code is about to be replaced with completely different code that
uses IRBuilder instead, but this change will eliminate some test
differences that we would otherwise see when we make that change.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of setting the local names at the end of binary reading, eagerly
set them before parsing function bodies. This is NFC now, but will fix a
future bug once the binary reader uses IRBuilder. IRBuilder can
introduce new scratch locals, and it gives them the names `$scratch`,
`$scratch_1`, etc. If the name section includes locals with the same
names and we set those local names after parsing function bodies, then
we can end up with multiple locals with the same names. Setting the
names before parsing the function bodies ensures that IRBuilder will
generate different names for the scratch locals.
The alternative fix would be to generate fresh names when setting names
from the name section, but it is better to respect the names in the name
section and use fresh names for the newly introduced scratch locals
instead.
|
|
|
|
|
|
|
|
|
| |
Rather than back-patching names when we get to the names section in the
binary reader, skip ahead to read the names section before anything else
so we can use the final names right away. This is a prerequisite for
using IRBuilder in the binary reader.
The only functional change is that we now allow empty local names. Empty
names are perfectly valid.
|
|
|
| |
See https://github.com/WebAssembly/memory64/pull/92
|
|
|
|
|
|
|
|
| |
This has not been emitted in LLVM since
https://github.com/llvm/llvm-project/commit/3f34e1b883351c7d98426b084386a7aa762aa366.
The corresponding proposed tool-conventions change:
https://github.com/WebAssembly/tool-conventions/pull/236
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When EH+GC are enabled then wasm has non-nullable types, and the
sent exnref should be non-nullable. In BinaryenIR we use the non-
nullable type all the time, which we also do for function references
and other things; we lower it if GC is not enabled to a nullable type
for the binary format (see `WasmBinaryWriter::writeType`, to which
comments were added in this PR). That is, this PR makes us handle
exnref the same as those other types.
A new test verifies that behavior. Various existing tests are updated
because ReFinalize will now use the more refined type, so this is
an optimization. It is also a bugfix as in #6987 we started to emit
the refined form in the fuzzer, and this PR makes us handle it
properly in validation and ReFinalization.
|
|
|
|
|
|
|
| |
Support 5-segment source mappings, which add a name.
Reference:
https://github.com/tc39/source-map/blob/main/source-map-rev3.md#proposed-format
|
|
|
|
|
|
|
| |
This raises the number of locals accepted by the binary parser to the
absolute limit in the spec. A warning is now printed when writing a
binary file if the Web limit of 50,000 locals is exceeded.
Fixes #6968.
|
|
|
|
|
|
|
|
|
|
| |
Note: FP16 is a little different from F32/F64 since it can't represent
the full 2^16 integer range. 65504 is the max whole integer. This leads
to some slightly strange behavior when converting integers greater than
65504 since they become infinity.
Specified at
https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
|
|
|
|
|
|
|
|
|
| |
The purpose of the datacount section is to pre-declare how many data
segments there will be so that engines can allocate space for them
and not have to back patch subsequent instructions in the code section
that refer to them. Once we use IRBuilder in the binary parser, we will
have to have the data segments available by the time we parse
instructions that use them, so eagerly construct the data segments when
parsing the datacount section.
|
|
|
|
|
|
|
|
| |
In preparation for using IRBuilder in the binary parser, eagerly create
Functions when parsing the function section so that they are already
created once we parse the code section. IRBuilder will require the
functions to exist when parsing calls so it can figure out what type
each call should have, even when there is a call to a function whose
body has not been parsed yet.
|
|
|
|
|
|
|
|
| |
We were doing a debug logging for every LEB byte. It turns out that the
isDebugEnabled() calls are expensive when called so frequently: in a
release+assertion build, even with debug disabled, these checks are the
highest thing in the profile. This PR removes the checks, which makes
binary reading 12% faster.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unlike other module elements, types are not stored on the `Module`.
Instead, they are collected by traversing the IR before printing and
binary writing. The code that collects the types tries to optimize the
order of rec groups based on the number of times each type is used. As a
result, the output order of types generally has no relation to the input
order of types. In addition, most type optimizations rewrite the types
into a single large rec group, and the order of types in that group is
essentially arbitrary. Changes to the code for counting type uses,
sorting types, or sorting rec groups can yield very large changes in the
output order of types, producing test diffs that are hard to review and
potentially harming the readability of tests by moving output types away
from the corresponding input types.
To help make test output more stable and readable, introduce a tool
option that causes the order of output types to match the order of input
types as closely as possible. It is implemented by having the parsers
record the indices of the input types on the `Module` just like they
already record the type names. The `GlobalTypeRewriter` infrastructure
used by type optimizations associates the new types with the old indices
just like it already does for names and also respects the input order
when rewriting types into a large recursion group.
By default, wasm-opt and other tools clear the recorded type indices
after parsing the module, so their default behavior is not modified by
this change.
Follow-on PRs will use the new flag in more tests, which will generate
large diffs but leave the tests in stable, more readable states that
will no longer change due to other changes to the optimizing type
sorting logic.
|
|
|
|
|
|
|
| |
This renames `Catch(All)_P3` enum to denote the old Phase 3
`catch(_all)` instructions to `Catch(All)_Legacy`, which sounds clearer.
This is also to be consistent with
https://github.com/llvm/llvm-project/pull/107187.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Specified at
https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
A few notes:
- The F32x4 and F64x2 versions of madd and nmadd are missing spect
tests.
- For madd, the implementation was incorrectly doing `(b*c)+a` where it
should be `(a*b)+c`.
- For nmadd, the implementation was incorrectly doing `(-b*c)+a` where
it should be `-(a*b)+c`.
- There doesn't appear to be a great way to actually implement a fused
nmadd, but the spec allows the double rounded version I added.
|
|
|
|
|
|
|
| |
The instructions relaxed_fma and relaxed_fnma have been renamed to
relaxed_madd and relaxed_nmadd.
https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md#binary-format
|
|
|
|
| |
Specified at
https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
|
|
|
| |
Ensure the "fp16" feature is enabled for FP16 instructions.
|
|
|
|
| |
Specified at
https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
|
|
|
|
|
|
|
|
| |
The leading bytes that indicate what kind of heap type is being defined
are bytes, but we were previously treating them as SLEB128-encoded
values. Since we emit the smallest LEB encodings possible, we were
writing the correct bytes in output files, but we were also improperly
accepting binaries that used more than one byte to encode these values.
This was caught by an upstream spec test.
|
|
|
|
|
|
|
|
| |
Replace code that checked `isStruct()`, `isArray()`, etc. in sequence
with uses of `HeapType::getKind()` and switch statements. This will make
it easier to find the code that needs updating if/when we add new heap
type kinds in the future. It also makes it much easier to find code that
already needs updating to handle continuation types by grepping for
"TODO: cont".
|
|
|
|
|
| |
Also use TableInit in the interpreter to initialize module's table
state, which will now handle traps properly, fixing #6431
|
|
|
|
|
|
|
| |
This is based on these two proposals:
* https://github.com/WebAssembly/tool-conventions/blob/main/BuildId.md
* https://github.com/tc39/source-map/blob/main/proposals/debug-id.md
|
|
|
|
| |
Specified at
https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
|
|
|
|
| |
Specified at
https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
|
|
|
|
| |
Specified at
https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
|
|
|
|
|
| |
Single-segment mappings were already handled in readNextDebugLocation,
but not in readSourceMapHeader.
|
|
|
|
|
| |
As a followup we could probably make these more consistent. For example,
we could use a single char prefix for defined functions/tables/globals
(e.g. f0/t0/g0)
|
|
|
|
|
|
| |
We had a TODO to use it once Names was optimized, which it has been.
The Names version is also far faster. When building
https://github.com/JetBrains/kotlinconf-app it saves 70 seconds(!).
|
|
|
|
|
|
|
|
|
| |
This abbreviates a common pattern where we first had to check whether a
heap type was basic, then if it was, get its unshared version and
compare it to some expected BasicHeapType.
Suggested in
https://github.com/WebAssembly/binaryen/pull/6771#discussion_r1683005495.
|
|
|
|
|
|
| |
Similar to #6765, but for types instead of heap types. Generalize the
logic for transforming written reference types to types that are
supported without GC so that it will automatically handle shared types
and other new types correctly.
|
|
|
|
|
|
|
|
|
|
| |
We represent `ref.null`s as having bottom heap types, even when GC is
not enabled. Bottom heap types are a feature of the GC proposal, so in
that case the binary writer needs to write the corresponding top type
instead. We previously had separate logic for this for each type
hierarchy in the binary writer, but that did not handle shared types and
would not have automatically handled other new types, either. Simplify
and generalize the implementation and test that we can write `ref.null`s
of shared types without GC enabled.
|
|
|
|
|
|
| |
Component binary format: https://github.com/WebAssembly/component-model/blob/main/design/mvp/Binary.md#component-definitions
Context:
https://github.com/WebAssembly/binaryen/issues/6728#issuecomment-2231288924
|
|
|
|
|
|
|
| |
Implement `ref.i31_shared` the new instruction for creating references
to shared i31s. Implement binary and text parsing and emitting as well
as interpretation. Copy the upstream spec test for i31 and modify it so
that all the heap types are shared. Comment out some parts that we do
not yet support.
|
|
|
|
|
|
|
|
|
| |
Rename instructions `extern.internalize` into `any.convert_extern` and
`extern.externalize` into `extern.convert_any` to follow more closely
the spec. This was changed in
https://github.com/WebAssembly/gc/issues/432.
The legacy name is still accepted in text inputs and in the C and JS
APIs.
|
|
|
|
| |
Such as `ref.eq`, `i31.get_{s,u}`, and `array.len`. Also validate that
struct and array operations work on shared structs and arrays.
|
|
|
| |
Fixes #6695
|
|
|
|
|
|
| |
That child must be a reference, as `finalize()` assumes so. To avoid an
assertion, error early.
Fixes #6696
|
|
|
|
|
| |
The result cannot be `none` or `unreachable` etc.
Fixes #6694
|
|
|
|
|
|
| |
Add an `isUTF8` utility and use it in both the text and binary parsers.
Add missing checks for overlong encodings and overlarge code points in
our WTF8 reader, which the new utility uses. Re-enable the spec tests
that test UTF-8 validation.
|
|
|
| |
And re-enable the globals.wast spec test, which checks this.
|
|
|
|
|
|
| |
Fix the wast parser to accept IDs on quoted modules, remove tests that are
invalidated by the multimemory proposal, and add validation that the total
number of variables in a function is less than 2^32 and that the code section is
present if there is a non-empty function section.
|
|
|
|
|
|
|
|
|
|
|
| |
Implement binary and text parsing and printing of shared basic heap types and
incorporate them into the type hierarchy.
To avoid the massive amount of code duplication that would be necessary if we
were to add separate enum variants for each of the shared basic heap types, use
bit 0 to indicate whether the type is shared and replace `getBasic()` with
`getBasic(Unshared)`, which clears that bit. Update all the use sites to record
whether the original type was shared and produce shared or unshared output
without code duplication.
|
|
|
|
|
|
| |
Rather than treating them as custom sections. Also fix UB where invalid
`Section` enum values could be used as keys in a map. Use the raw `uint8_t`
section IDs as keys instead. Re-enable a disabled spec test that was failing
because of this bug and UB.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code used i instead of index, as in this pseudocode:
for i in range(num_names):
index = readU32LEB() # index of the data segment to name
name = readName() # name to give that segment
data[i] = name # XXX 'i' should be 'index'
That (funnily enough) happened to always work before since we write names in
order. That is, normally given segments A,B,C we'd write then in the names section
as A,B,C. Then the reader, which had the bug, would always have i and index
identical in value anyhow. But if a wasm producer used different indexes, a
problem could happen.
To test this, add a binary file that has a reversed name section.
Fixes #6672
|
|
|
|
| |
Also update the parser so that implicit type uses are not matched with shared
function types.
|
|
|
|
|
| |
Add the feature and flags to enable and disable it. Require the new feature to
be enabled for shared heap types to validate. To make the test work, update the
validator to actually check features for global types.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The parser was incorrectly handling the parsing of declarative element segments whose `init` is a `vec(expr)`.
https://webassembly.github.io/spec/core/binary/modules.html#element-section
Binry parser was simply reading a single `u32LEB` value for `init`
instead of parsing a expression regardless `usesExpressions = true`.
This commit updates the `WasmBinaryReader::readElementSegments` function
to correctly parse the expressions for declarative element segments by
calling `readExpression` instead of `getU32LEB` when `usesExpressions = true`.
Resolves the parsing exception:
"[parse exception: bad section size, started at ... not being equal to new position ...]"
Related discussion: https://github.com/tanishiking/scala-wasm/issues/136
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this PR we generate global.gets in globals, which we did not do before.
We do that by replacing makeConst (the only thing we did before, for the
contents of globals) with makeTrivial, and add code to makeTrivial to sometimes
make a global.get. When no suitable global exists, makeGlobalGet will emit a
constant, so there is no danger in trying.
Also raise the number of globals a little.
Also explicitly note the current limitation of requiring all tuple globals to contain
tuple.make and nothing else, including not global.get, and avoid adding such
invalid global.gets in tuple globals in the fuzzer.
|
| |
|