| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Some places assumed a 32-bit index.
|
|
|
|
|
|
|
|
|
| |
To avoid having two separate topological sort utilities in the code
base, replace remaining uses of the old DFS-based, CRTP topological sort
with the newer Kahn's algorithm implementation.
This would be NFC, except that the new topological sort produces a
different order than the old topological sort, so the output of some
passes is reordered.
|
|
|
|
|
| |
Audit the remaining ocurrences of `== HeapType::` and fix those that did
not handle shared types correctly. Add tests for some of the fixes;
others are NFC but clarify the code.
|
|
|
|
|
| |
Also use TableInit in the interpreter to initialize module's table
state, which will now handle traps properly, fixing #6431
|
| |
|
|
|
|
|
|
|
| |
`ref.null` of shared types should only be allowed when shared-everything
is enabled, but we were previously checking only that reference types
were enabled when validating `ref.null`. Update the code to check all
features required by the null type and factor out shared logic for
printing lists of missing feature options in error messages.
|
|
|
|
|
|
| |
When we switched to the new type printing machinery, we inserted this
extra space to minimize the diff in the test output compared with the
previous type printer. Improve the quality of the printed output by
removing it.
|
|
|
|
|
|
|
|
|
| |
Rename instructions `extern.internalize` into `any.convert_extern` and
`extern.externalize` into `extern.convert_any` to follow more closely
the spec. This was changed in
https://github.com/WebAssembly/gc/issues/432.
The legacy name is still accepted in text inputs and in the C and JS
APIs.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The stringview types from the stringref proposal have three irregularities that
break common invariants and require pervasive special casing to handle properly:
they are supertypes of `none` but not subtypes of `any`, they cannot be the
targets of casts, and they cannot be used to construct nullable references. At
the same time, the stringref proposal has been superseded by the imported
strings proposal, which does not have these irregularities. The cost of
maintaing and improving our support for stringview types is no longer worth the
benefit of supporting them.
Simplify the code base by entirely removing the stringview types and related
instructions that do not have analogues in the imported strings proposal and do
not make sense in the absense of stringviews.
Three remaining instructions, `stringview_wtf16.get_codeunit`,
`stringview_wtf16.slice`, and `stringview_wtf16.length` take stringview operands
in the stringref proposal but cannot be removed because they lower to operations
from the imported strings proposal. These instructions are changed to take
stringref operands in Binaryen IR, and to allow a graceful upgrade path for
users of these instructions, the text and binary parsers still accept but ignore
`string.as_wtf16`, which is the instruction used to convert stringrefs to
stringviews. The binary writer emits code sequences that use scratch locals and `string.as_wtf16` to keep the output valid.
Future PRs will further align binaryen with the imported strings proposal
instead of the stringref proposal, for example by making `string` a subtype of
`extern` instead of a subtype of `any` and by removing additional instructions
that do not have analogues in the imported strings proposal.
|
|
|
|
| |
precompute (#6561)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new text parser is faster and more standards compliant than the old text
parser. Enable it by default in wasm-opt and update the tests to reflect the
slightly different results it produces. Besides following the spec, the new
parser differs from the old parser in that it:
- Does not synthesize `loop` and `try` labels unnecessarily
- Synthesizes different block names in some cases
- Parses exports in a different order
- Parses `nop`s instead of empty blocks for empty control flow arms
- Does not support parsing Poppy IR
- Produces different error messages
- Cannot parse `pop` except as the first instruction inside a `catch`
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a combined commit covering multiple PRs fixing the handling of return
calls in different areas. The PRs are all landed as a single commit to ensure
internal consistency and avoid problems with bisection.
Original PR descriptions follow:
* Fix inlining of `return_call*` (#6448)
Previously we transformed return calls in inlined function bodies into normal
calls followed by branches out to the caller code. Similarly, when inlining a
`return_call` callsite, we simply added a `return` after the body inlined at the
callsite. These transformations would have been correct if the semantics of
return calls were to call and then return, but they are not correct for the
actual semantics of returning and then calling.
The previous implementation is observably incorrect for return calls inside try
blocks, where the previous implementation would run the inlined body within the
try block, but the proper semantics would be to run the inlined body outside the
try block.
Fix the problem by transforming inlined return calls to branches followed by
calls rather than as calls followed by branches. For the case of inlined return
call callsites, insert branches out of the original body of the caller and
inline the body of the callee as a sibling of the original caller body. For the
other case of return calls appearing in inlined bodies, translate the return
calls to branches out to calls inserted as siblings of the original inlined
body.
In both cases, it would have been convenient to use multivalue block return to
send call parameters along the branches to the calls, but unfortunately in our
IR that would have required tuple-typed scratch locals to unpack the tuple of
operands at the call sites. It is simpler to just use locals to propagate the
operands in the first place.
* Fix interpretation of `return_call*` (#6451)
We previously interpreted return calls as calls followed by returns, but that is
not correct both because it grows the size of the execution stack and because it
runs the called functions in the wrong context, which can be observable in the
case of exception handling.
Update the interpreter to handle return calls correctly by adding a new
`RETURN_CALL_FLOW` that behaves like a return, but carries the arguments and
reference to the return-callee rather than normal return values.
`callFunctionInternal` is updated to intercept this flow and call return-called
functions in a loop until a function returns with some other kind of flow.
Pull in the upstream spec tests return_call.wast, return_call_indirect.wast, and
return_call_ref.wast with light editing so that we parse and validate them
successfully.
* Handle return calls in wasm-ctor-eval (#6464)
When an evaluated export ends in a return call, continue evaluating the
return-called function. This requires propagating the parameters, handling the
case that the return-called function might be an import, and fixing up local
indices in case the final function has different parameters than the original
function.
* Update effects.h to handle return calls correctly (#6470)
As far as their surrounding code is concerned return calls are no different from
normal returns. It's only from a caller's perspective that a function containing
a return call also has the effects of the return-callee. To model this more
precisely in EffectAnalyzer, stash the throw effect of return-callees on the
side and only merge it in at the end when analyzing the effects of a full
function body.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update identifiers used in tests to use a format supported by the new text
parser, i.e. either the standard format with its limited set of allowed
characters or the non-standard `$"..."` format. Notably, any name containing
square or curly braces now uses the string format.
Input automatically updated with this script:
https://gist.github.com/tlively/4e22311736661849e641d02e521a0748
The printer is updated to properly escape names in more places as well. The
logic for escaping names is moved to a common location so that the type
printing logic in wasm-type.cpp can use it as well.
|
|
|
|
|
|
|
| |
#6244 tried to do this but was not quite right. It treated a string like an array
or a struct, which means create a global for it. But just creating a global isn't
enough, as it needs to also be sorted in the right place etc. which requires
changes in other places. But there is a much simpler solution here: string
constants are just constants, which we can emit in-line, so do that.
|
|
|
|
| |
Instead of e.g. `(i32 i32)`, use `(tuple i32 i32)`. Having a keyword to
introduce the s-expression is more consistent with the rest of the language.
|
| |
|
|
|
|
|
| |
Fixes a fuzz testcase for wasm-ctor-eval.
Add the beginnings of a polyfill for stdckdint.h to help that.
|
|
|
|
| |
This should be a runtime error, not a validator error. It caused a fuzzer failure on
wasm-ctor-eval.
|
|
|
|
|
| |
Update the legacy text parser and all tests to use the standard text format for shared memories, e.g. `(memory $m 1 1 shared)` rather than `(memory $m (shared 1 1))`. Also remove support for non-standard in-line "data" or "segment" declarations.
This change makes the tests more compatible with the new text parser, which only supports the standard format.
|
|
|
|
|
|
|
|
|
|
| |
Previously the lit test update script interpreted module names as the names of
import items and export names as the names of export items, but it is more
precise to use the actual identifiers of the imported or exported items as the
names instead.
Update update_lit_checks.py to use a more correct regex to match names and to
correctly use the identifiers of import and export items as their names. In some
cases this can improve the readability of test output.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We previously supported a non-standard `(func "name" ...` syntax for declaring
functions exported with the quoted name. Since that is not part of the standard
text format, drop support for it, replacing it with the standard `(func $name
(export "name") ...` syntax instead.
Also replace our other usage of the quoted form in our text output, which was
where we quoted names containing characters that are not allowed to appear in
standard names. To handle that case, adjust our output from `"$name"` to
`$"name"`, which is the standards-track way of supporting such names. Also fix
how we detect non-standard name characters to match the spec.
Update the lit test output generation script to account for these changes,
including by making the `$` prefix on names mandatory. This causes the script to
stop interpreting declarative element segments with the `(elem declare ...`
syntax as being named "declare", so prevent our generated output from regressing
by counting "declare" as a name in the script.
|
|
|
|
|
|
|
|
| |
Once support for tuple.extract lands in the new WAT parser, this arity immediate
will let the parser determine how many values it should pop off the stack to
serve as the tuple operand to `tuple.extract`. This will usually coincide with
the arity of a tuple-producing instruction on top of the stack, but in the
spirit of treating the input as a proper stack machine, it will not have to and
the parser will still work correctly.
|
|
|
|
|
|
|
|
|
|
| |
Previously, the number of tuple elements was inferred from the number of
s-expression children of the `tuple.make` expression, but that scheme would not
work in the new wat parser, where s-expressions are optional and cannot be
semantically meaningful.
Update the text format to take the number of tuple elements (i.e. the tuple
arity) as an immediate. This new format will be able to be implemented in the
new parser as follow-on work.
|
|
|
|
|
|
|
|
| |
Two trivial places did not handle that case, and assumed an exported function
was actually defined (and not imported).
Also add some const stuff to fix compilation after this change.
This was discovered by #6026
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases tuples are obviously not needed, such as when they are only used
in local operations and make/extract. Such tuples are not used as return values or
in control flow structures, so we might as well lower them to individual locals per
lane, which other passes can optimize a lot better.
I believe LLVM does the same with its own tuples: it lowers them as much as
possible, leaving only necessary ones.
Fixes #5923
|
|
|
|
|
| |
Replace i31.new with ref.i31 in the printer, tests, and source code. Continue
parsing i31.new for the time being to allow a graceful transition. Also update
the JS API to reflect the new instruction name.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When printing Binaryen IR, we previously generated names for unnamed heap types
based on their structure. This was useful for seeing the structure of simple
types at a glance without having to separately go look up their definitions, but
it also had two problems:
1. The same name could be generated for multiple types. The generated names did
not take into account rec group structure or finality, so types that differed
only in these properties would have the same name. Also, generated type names
were limited in length, so very large types that shared only some structure
could also end up with the same names. Using the same name for multiple types
produces incorrect and unparsable output.
2. The generated names were not useful beyond the most trivial examples. Even
with length limits, names for nontrivial types were extremely long and visually
noisy, which made reading disassembled real-world code more challenging.
Fix these problems by emitting simple indexed names for unnamed heap types
instead. This regresses readability for very simple examples, but the trade off
is worth it.
This change also reduces the number of type printing systems we have by one.
Previously we had the system in Print.cpp, but we had another, more general and
extensible system in wasm-type-printing.h and wasm-type.cpp as well. Remove the
old type printing system from Print.cpp and replace it with a much smaller use
of the new system. This requires significant refactoring of Print.cpp so that
PrintExpressionContents object now holds a reference to a parent
PrintSExpression object that holds the type name state.
This diff is very large because almost every test output changed slightly. To
minimize the diff and ease review, change the type printer in wasm-type.cpp to
behave the same as the old type printer in Print.cpp except for the differences
in name generation. These changes will be reverted in much smaller PRs in the
future to generally improve how types are printed.
|
|
|
|
|
|
|
|
|
| |
* Update text output for `ref.cast` and `ref.test`
* Update text output for `array.new_fixed`
* Update tests with new syntax for `ref.cast` and `ref.test`
* Update tests with new `array.new_fixed` syntax
|
|
|
|
|
|
| |
In practice we don't need high addresses, and when they happen the current
implementation can OOM, so exit early on them instead.
Fixes #5893
|
|
|
|
| |
The function type should be printed there just like for non-imported
functions.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A cycle of data is something we can't just naively emit as wasm globals. If
at runtime we end up, for example, with an object A that refers to itself,
then we can't just emit
(global $A
(struct.new $A
(global.get $A)))
The struct.get is of this very global, and such a self-reference is invalid. So
we need to break such cycles as we emit them. The simple idea used here
is to find paths in the cycle that are nullable and mutable, and replace the
initial value with a null that is fixed up later in the start function:
(global $A
(struct.new $A
(ref.null $A)))
(func $start
(struct.set
(global.get $A)
(global.get $A)))
)
This is not optimal in terms of breaking cycles, but it is fast (linear time)
and simple, and does well in practice on j2wasm (where cycles in fact
occur).
|
|
|
|
|
|
|
| |
Without the hint, we always look for a valid name using name$0, $1, $2, etc.,
starting from 0, and in some cases that can lead to quadratic behavior.
Noticed on a testcase in the fuzzer that runs for over 24 seconds (I gave up at
that point) but takes only 2 seconds with this.
|
|
|
| |
Like data.drop etc., it notices data segment identity.
|
|
|
|
|
|
|
|
|
|
| |
All top-level Module elements are identified and referred to by Name, but for
historical reasons element and data segments were referred to by index instead.
Fix this inconsistency by using Names to refer to segments from expressions that
use them. Also parse and print segment names like we do for other elements.
The C API is partially converted to use names instead of indices, but there are
still many functions that refer to data segments by index. Finishing the
conversion can be done in the future once it becomes necessary.
|
|
|
|
|
|
|
|
|
|
| |
This fixes wasm-ctor-eval on evalling a GC data structure that contains a
field initialized with an externalized value.
Per the spec this is a constant instruction and I verified that V8 allows this.
Also add missing validation in wasm-ctor-eval of the output (which makes
debugging this kind of thing a little easier).
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Before, a single ctor with GC worked, but any subsequent ones simply dropped
the globals from the previous ones, because we were missing an addGlobal in
an important place.
Also, we can get confused about which global names are in use in the module, so fix
that as well by storing them directly (we keep removing and re-adding globals, so
we can't use the normal module mechanism to find which names are in use).
|
|
|
|
| |
Like MemoryInit, this instruction cares about segment identity, so merging
segments into one big one for flattening is disallowed.
|
|
|
|
| |
Until we get full support for serializing table changes, stop evalling so we do
not break things.
|
| |
|
|
(#5510)
Simply loop over the values and use tuple.make.
This also adds a lit test for ctor-eval. I found that the problem blocking us
before was the logging, which confuses the update script. As this test at
least does not require that logging, this PR adds a --quiet flag that
disables the logging, and then a lit test just works.
|