| Commit message | Author | Age | Files | Lines |
... | |
|
|
| |
Parse loops in the new wat parser and add support for them to the IRBuilder.
|
|
|
|
|
|
| |
Somewhat counterintuitively, the text syntax for a folded `if` allows any number
of folded instructions in the condition position, not just one. Update the
corresponding `foldedinsts` parsing function to parse arbitrary sequences of
folded instructions and add a test.
|
|
|
|
|
|
|
|
|
|
|
| |
The new wat parser previously returned InstrT types when parsing individual
instructions and collected InstrsT types when parsing sequences of instructions.
However, instructions were always actually tracked in the internal state of the
parsing context, so these types never held any interesting or necessary data.
Simplify the parser by removing these types and leaning into the pattern that
the parser context will keep track of parsed instructions.
This allows for a much cleaner separation between the `instrs` and
`foldedinstrs` parser functions.
|
|
|
|
|
|
| |
Parse both the straight-line and folded versions of if, including the
abbreviations that allow omitting the else clause. In the IRBuilder, generalize
the scope stack to be able to track scopes other than blocks and add methods for
visiting the beginnings of ifs and elses.
|
|
|
|
|
| |
Probably any array of non-reference data can be allowed to be public and sent
out of the module, as it is just data. For now, however, just special case the i8
and i16 array types which are useful already for string interop.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
E.g.
(local $x (ref eq))
...
(local.set $x
(struct.new $float
...
)
)
(struct.get $float 0
(ref.cast (ref $float)
(local.get $x)
)
)
This PR allows us to use heap2local, ignoring the passing cast.
This is similar to existing handling of ref.as_non_null.
|
|
|
|
|
|
| |
Previously, getType() silently dropped the params of a signature type. Now we verify that
the params are none, and error otherwise.
Helps #5950
|
|
|
|
| |
NFC, but fixes a current fuzz bug on table.fill not having an entry in this file. After
this PR, there is no need for such entries.
|
|
|
|
|
|
| |
And put the new files in a new source directory, "parser". This is a rough split
and is not yet expected to dramatically improve compile times. The exact
organization of the new files is subject to change, but this splitting should be
enough to make further parser development more pleasant.
|
|
|
|
|
|
|
|
|
| |
In general, the binary lowering of tuple.extract expects that all the tuple
values are on top of the stack, so it inserts drops and possibly uses a scratch
local to ensure only the extracted value is left. However, when the extracted
tuple expression is a local.get, local.tee, or global.get, it's much more
efficient to change the lowering of the get or tee to ensure that only the
extracted value is on the stack to begin with. Implement that optimization in
the binary writer.
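
The difference between the two lowerings can be sketched with a toy model. This is only an illustration, not Binaryen's binary writer: it assumes a tuple local is stored as one wasm local per lane, and the function names and the `$scratch` local are hypothetical.

```python
def lower_extract_general(lane_locals, index):
    """Toy general lowering: push every lane, then discard all but lane `index`."""
    code = [f"local.get {name}" for name in lane_locals]
    # Lanes above the extracted one sit on top of the stack: drop them.
    code += ["drop"] * (len(lane_locals) - 1 - index)
    if index > 0:
        # Save the wanted value, drop the lanes below it, then restore it.
        code.append("local.set $scratch")
        code += ["drop"] * index
        code.append("local.get $scratch")
    return code


def lower_extract_optimized(lane_locals, index):
    """Toy optimized lowering: only read the lane that is actually extracted."""
    return [f"local.get {lane_locals[index]}"]
```

For a three-lane tuple and index 1, the general form emits seven instructions in this model, while the optimized form emits a single `local.get`.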
|
|
|
|
| |
This Stack IR optimization is not compatible with a much more powerful
optimization we plan to do for tuples in the binary writer.
|
|
|
|
|
|
|
|
|
|
| |
Visiting a block should push it onto the stack just like visiting any other
expression, but we previously had a `visitBlock` that introduced a new scope
instead. Fix `visitBlock` to behave as expected and introduce a new
`visitBlockStart` method to introduce a new scope.
Unfortunately this cannot be fully tested yet because the wat parser uses the
`makeXYZ` API instead of the `visit` API, but at least I updated `makeBlock` to
call `visitBlockStart`, so that is tested.
|
|
|
|
|
|
|
|
|
| |
TypeFinalization finalizes all types that we can, that is, all private types that have no
children. TypeUnFinalization unfinalizes (opens) all (private) types.
These could be used by first opening all types, optimizing, and then finalizing, as that
might find more opportunities.
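
As a rough illustration of the selection rule described above (private types with no children), here is a minimal sketch. The data layout (a dict from type name to declared supertype) and the function name are assumptions for the example, not Binaryen's API.

```python
def finalizable_types(supertype_of, public):
    """Return the private types that no other type declares as its supertype.

    supertype_of maps each type name to its declared supertype, or None.
    public is the set of type names visible outside the module.
    """
    has_children = {sup for sup in supertype_of.values() if sup is not None}
    return sorted(
        t for t in supertype_of if t not in public and t not in has_children
    )
```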
Fixes #5933
|
| |
|
|
|
| |
table.fill requires bulk memory to be enabled, not reference types.
|
|
|
|
|
|
|
|
| |
This instruction was standardized as part of the bulk memory proposal, but we
never implemented it until now. Leave similar instructions like table.copy as
future work.
Fixes #5939.
|
|
|
|
|
|
|
| |
Remove support for the "struct_subtype", "array_subtype", "func_subtype", and
"extends" notations we used at various times to declare WasmGC types, leaving
only support for the standard text format for declaring types. Update all the
tests using the old formats and delete tests that existed solely to test the old
formats.
|
|
|
|
|
| |
This reverts commit 56ce1eaba7f500b572bcfe06e3248372e9672322. The binary writer
optimization is not always correct when stack IR optimizations have run. Revert
the change until we can fix it.
|
|
|
|
|
|
|
|
|
| |
In general, the binary lowering of tuple.extract expects that all the tuple
values are on top of the stack, so it inserts drops and possibly uses a scratch
local to ensure only the extracted value is left. However, when the extracted
tuple expression is a local.get, local.tee, or global.get, it's much more
efficient to change the lowering of the get or tee to ensure that only the
extracted value is on the stack to begin with. Implement that optimization in
the binary writer.
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases tuples are obviously not needed, such as when they are only used
in local operations and make/extract. Such tuples are not used as return values or
in control flow structures, so we might as well lower them to individual locals per
lane, which other passes can optimize a lot better.
I believe LLVM does the same with its own tuples: it lowers them as much as
possible, leaving only necessary ones.
Fixes #5923
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This PR changes how file paths and the command line are handled. On startup on Windows,
we process the wstring version of the command line (including the file paths) and re-encode
it to UTF8 before handing it off to the rest of the command line handling logic. This means
that all paths are stored in UTF8-encoded std::strings as they go through the program, right
up until they are used to open files. At that time, they are converted to the appropriate native
format with the new to_path function before passing to the stdlib open functions.
This has the advantage that all of the non-file-opening code can use a single type to hold paths
(which is good since std::filesystem::path has proved problematic in some cases), but has the
disadvantage that someone could add new code that forgets to convert to_path before
opening. That's somewhat mitigated by the fact that most of the code uses the ModuleIOBase
classes for opening files.
Fixes #4995
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
E.g.
(tuple.extract 1
(tuple.make (A) (B) (C))
)
=>
(B)
Modify some existing tests to not be in this trivial form, so that they do not
stop testing what they should.
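
The rewrite above can be sketched on a toy expression tree. This is a simplified model, not the actual pass: expressions are nested Python tuples, the function name is hypothetical, and the sketch assumes the discarded lanes have no side effects (a real optimizer has to account for those).

```python
def simplify_extract(expr):
    """Rewrite ("tuple.extract", i, ("tuple.make", lane0, lane1, ...)) to lane i."""
    is_extract = (
        isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "tuple.extract"
    )
    if is_extract:
        _, index, operand = expr
        if isinstance(operand, tuple) and operand and operand[0] == "tuple.make":
            lanes = operand[1:]
            return lanes[index]
    # Anything else is left untouched.
    return expr
```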
|
|
|
|
|
| |
Replace i31.new with ref.i31 in the printer, tests, and source code. Continue
parsing i31.new for the time being to allow a graceful transition. Also update
the JS API to reflect the new instruction name.
|
|
|
|
| |
Fixes #5928. On FreeBSD, off_t is not defined in the headers we include.
|
|
|
|
|
|
|
|
| |
Globally replace the source string "I31New" with "RefI31" in preparation for
renaming the instruction from "i31.new" to "ref.i31", as implemented in the spec
in https://github.com/WebAssembly/gc/pull/422. This would be NFC, except that it
also changes the string in the external-facing C APIs.
A follow-up PR will make the corresponding behavioral change.
|
|
|
|
| |
The legacy encodings remain available for now by defining
USE_LEGACY_GC_ENCODINGS at build time.
|
|
|
|
| |
Remove the old forms of ref.test and ref.cast that took heap types instead of
ref types and remove the old array.init_static name for array.new_fixed.
|
|
|
|
|
|
| |
Previously, the printer incorrectly reconstructed imported functions' types from
their signatures instead of printing their types directly. This could cause the
printer to print uses of types that were never defined and did not exist in the
module. Fix the bug by printing imported functions' heap types directly.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Match the spec and parse the shorthand binary and text formats as final and emit
final types without supertypes using the shorthands as well. This is a
potentially-breaking change, since the text and binary shorthands can no longer
be used to define types that have subtypes.
Also make TypeBuilder entries final by default to better match the spec and
update the internal APIs to use the "open" terminology rather than "final"
terminology. Future changes will update the text format to use the standard "sub
open" rather than the current "sub final" keywords. The exception is the new wat
parser, which supports "sub open" as of this change, since it didn't support
final types at all previously.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Almost entirely trivial, except for this part:
- if (nextDebugLocation.availablePos == 0 &&
- nextDebugLocation.previousPos <= pos) {
+ if (nextDebugLocation.availablePos == 0) {
return;
I believe removing the extra check has no effect. Removing it does not change
anything in the test suite, and logically, if we set availablePos to 0 then we
definitely want to return here - we set it to 0 to indicate there is nothing left
to read, which is what the code after it does.
As a result, we can remove the previousPos field entirely.
|
|
|
|
|
|
| |
Copy the old expression's debug info if the new has none. But if the new has
its own, trust that.
Followup to #5914
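
The rule can be sketched as follows. This is a toy model only: debug locations live in a plain dict keyed by hypothetical node identifiers, and the function name is an assumption rather than the real helper.

```python
def replace_expression(debug_locations, old, new):
    """Record that `new` takes the place of `old` in the function body."""
    if new not in debug_locations and old in debug_locations:
        # The replacement has no location of its own: inherit the old one.
        debug_locations[new] = debug_locations[old]
    # `old` keeps its entry, since it may still be in the tree
    # (for example as a child of `new`).
    return new
```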
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The logic there assumed that we are removing the current node and replacing it
with the given one, so it copied debug info to the new one and deleted it for the
old. But the old one might now be a child of the new one, if we reordered, so we
were dropping debug info, in particular in MergeBlocks which reorders like
this:
(call
(block ..
=>
(block
(call
(it moves blocks outwards so it can merge them).
|
|
|
|
|
| |
Now that the WasmGC spec has settled on a way of validating non-nullable locals,
we no longer need this experimental feature that allowed nonstandard uses of
non-nullable locals.
|
|
|
|
|
|
|
| |
In the binary parser, when creating a scratch local to hold multivalue results
as tuples, we previously ensured that the scratch local did not contain any
non-nullable elements by modifying its type and inserting ref.as_non_null as necessary.
Now that we properly support non-nullable elements in tuple locals, however,
this parser behavior is no longer necessary. Remove it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we did nothing for instructions without debug info. So if we had one
that did, followed by one that didn't, the one that didn't could get "smeared" with
the debug info of the first. Source map locations are the start of segments,
apparently, and so if we say a location has info then all others after it will as well,
until the next segment.
To fix that, add support for source map entries with just one field, the binary
location. Such entries have no debug info (no file:line:col), and though the source
maps spec is not very clear on this, this seems like the right way to prevent this
problem: to stop a segment with debug info by starting a new one without, when
we know we don't want that info any more. That is, before this PR we could have
this:
;; file.cpp:10:1
(nop) ;; binary offset 5 in wasm
(nop) ;; binary offset 6 in wasm
and the second nop would get the same debug annotation, since we just have
one source map segment,
[5, file.cpp, 10, 1] // start at offset 5 in wasm
With this PR, we emit:
[5, file.cpp, 10, 1] // start at offset 5 in wasm; file.cpp:10:1
[6] // start at offset 6 in wasm; no debug info
This does add 7% to source map sizes, however, since we add those 1-length
segments now, but that seems unavoidable to fix this bug.
To implement this, add a new field that says if the next location in the source map
has debug info or not, and use that.
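
The effect of the one-field entries can be illustrated with a small lookup over already-decoded segments. This sketch assumes segments are sorted by binary offset and represented as lists like the ones shown above; it does not model the source map encoding itself, and the function name is hypothetical.

```python
import bisect


def debug_info_at(segments, offset):
    """Return (file, line, col) for a binary offset, or None if it has no info.

    Each segment starts at segments[i][0] and extends to the next segment.
    A one-field segment marks a range with no debug info.
    """
    starts = [seg[0] for seg in segments]
    i = bisect.bisect_right(starts, offset) - 1
    if i < 0:
        return None  # offset precedes the first segment
    seg = segments[i]
    return tuple(seg[1:]) if len(seg) > 1 else None
```

With only `[5, "file.cpp", 10, 1]`, offset 6 still resolves to `file.cpp:10:1`; adding `[6]` ends that segment, so offset 6 resolves to no debug info.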
|
|
|
|
|
|
| |
The code validating and fixing up non-nullable locals previously did not
correctly handle tuples that contained non-nullable elements, which could have
resulted in invalid modules going undetected. Update the code to handle tuples
and add tests.
|
|
|
|
|
|
|
|
| |
In general, full print mode should print out all the things to avoid confusion. It
already did so for blocks (that the text format sometimes elides), types, etc. Doing
it for debug info can avoid confusion when debugging (in fact, this was one of the
main reasons I've been confused about how source maps work in Binaryen...).
Also add a comment to the code just landed in #5903
|
|
|
|
| |
This bug was found by fuzzing Binaryen and V8 together with the standard GC
encodings enabled.
|
|
|
|
|
|
| |
Skip repeated identical debug info only of more-nested nodes. Before this PR we
skipped sibling nodes and even parent nodes, which could be confusing. After
this PR there is a clearer connection: child nodes have the same debug location
as the parent, by default, and so there is no need to print it again.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The initial PR introducing IRBuilder kept the interface the same as the previous
internal interface in the new wat parser. This PR updates that interface to
avoid exposing implementation details of the IRBuilder and to provide an API
that matches the binary format. For example, after calling `makeBlock` or
`visitBlock` at the beginning of a block, users now call `visitEnd()` at the end
of the block without having to manually install the block's contents.
Providing this improved interface requires refactoring some of the IRBuilder
internals. While we are refactoring things anyway, put in extra effort to avoid
unnecessarily splitting up and recombining tuples that could simply be returned
from a multivalue block.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When printing Binaryen IR, we previously generated names for unnamed heap types
based on their structure. This was useful for seeing the structure of simple
types at a glance without having to separately go look up their definitions, but
it also had two problems:
1. The same name could be generated for multiple types. The generated names did
not take into account rec group structure or finality, so types that differed
only in these properties would have the same name. Also, generated type names
were limited in length, so very large types that shared only some structure
could also end up with the same names. Using the same name for multiple types
produces incorrect and unparsable output.
2. The generated names were not useful beyond the most trivial examples. Even
with length limits, names for nontrivial types were extremely long and visually
noisy, which made reading disassembled real-world code more challenging.
Fix these problems by emitting simple indexed names for unnamed heap types
instead. This regresses readability for very simple examples, but the trade-off
is worth it.
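
A minimal sketch of index-based naming, to show why it cannot produce the same name twice. The function, the `$N` name format, and the collision-skipping against explicit names are assumptions made for illustration, not a description of the actual printer.

```python
def name_heap_types(types, explicit_names):
    """Give every type a unique name: explicit names win, the rest get indices.

    types is an ordered list of hypothetical type identifiers.
    explicit_names maps some of those identifiers to user-provided names.
    """
    names = {}
    used = set(explicit_names.values())
    next_index = 0
    for t in types:
        if t in explicit_names:
            names[t] = explicit_names[t]
            continue
        # Skip indices that would collide with an explicit name.
        while f"${next_index}" in used:
            next_index += 1
        names[t] = f"${next_index}"
        used.add(names[t])
        next_index += 1
    return names
```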
This change also reduces the number of type printing systems we have by one.
Previously we had the system in Print.cpp, but we had another, more general and
extensible system in wasm-type-printing.h and wasm-type.cpp as well. Remove the
old type printing system from Print.cpp and replace it with a much smaller use
of the new system. This requires significant refactoring of Print.cpp so that
the PrintExpressionContents object now holds a reference to a parent
PrintSExpression object that holds the type name state.
This diff is very large because almost every test output changed slightly. To
minimize the diff and ease review, change the type printer in wasm-type.cpp to
behave the same as the old type printer in Print.cpp except for the differences
in name generation. These changes will be reverted in much smaller PRs in the
future to generally improve how types are printed.
|
|
|
|
|
|
|
|
|
| |
Previously it was possible that the supertype merging phase would merge
unrelated types when DFA minimization would split a common supertype out of a
partition, leaving unrelated types behind in the same partition. Fix the problem
by post-processing the partitions in the supertype merging phase to split any
partitions that contain unrelated types.
Fixes #5877.
|
|
|
|
|
|
|
| |
If we refine a signature type that is used in a call.without.effects then that call's
results may need to be updated. In the IR it looks like a normal call that happens to
pass a function reference as the last param, but it actually means that we call that
function (without side effects), so we need to have the same results, and the validator
already verified that (so the new testcase here fails without this fix).
|
|
|
|
|
|
|
|
|
| |
* Update text output for `ref.cast` and `ref.test`
* Update text output for `array.new_fixed`
* Update tests with new syntax for `ref.cast` and `ref.test`
* Update tests with new `array.new_fixed` syntax
|
|
|
|
|
|
|
|
| |
The improvements to RemoveUnusedBrs in #5887 also introduced a regression where
the pass did not correctly handle unreachable fallthrough values and crashed
with an assertion failure. Fix the problem by returning early when a fallthrough
value is unreachable and add a regression test.
Fixes #5892.
|
|
|
|
|
|
| |
In practice we don't need high addresses, and when they happen the current
implementation can OOM, so exit early on them instead.
Fixes #5893
|
|
|
|
|
|
|
|
|
|
|
| |
* Allow new syntax for some stringref opcodes
Fixes #5607
* Update stringref text output
* Update tests with new syntax for stringref opcodes
Except in test/lit/strings.wat, to check that the legacy syntax still works.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add an IRBuilder utility in a new wasm-ir-builder.h header. IRBuilder is
extremely similar to Builder, except that it manages building full trees of
Binaryen IR from a linear sequence of instructions, whereas Builder only builds
a single IR node at a time. To build full IR trees, IRBuilder maintains an
internal stack of expressions, popping children off the stack and pushing the
new node onto the stack whenever it builds a new node.
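
The stack discipline described above can be sketched with a toy builder. This is not the IRBuilder interface: the class, the `push` method, and the arity table are hypothetical and cover only a handful of instructions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    op: str
    children: list = field(default_factory=list)


# Hypothetical arity table: how many children each instruction pops.
ARITY = {"i32.const": 0, "local.get": 0, "i32.add": 2, "drop": 1}


class ToyIRBuilder:
    """Builds expression trees from a linear instruction sequence."""

    def __init__(self):
        self.stack = []

    def push(self, op):
        n = ARITY[op]
        if len(self.stack) < n:
            raise ValueError(f"{op} needs {n} operands")
        # Pop the last n expressions (in order) to become the children.
        split = len(self.stack) - n
        children = self.stack[split:]
        del self.stack[split:]
        node = Node(op, children)
        self.stack.append(node)
        return node
```

Feeding `i32.const`, `local.get`, `i32.add` in order leaves a single `i32.add` node on the stack whose children are the two earlier nodes.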
In addition to providing makeXYZ functions to allocate, initialize, and finalize
new IR nodes, IRBuilder also provides a visit() method that can be used when the
user has already allocated the IR nodes and only needs to reconstruct the
connections between them. This will be useful in outlining both for constructing
outlined functions and for reconstructing functions around arbitrary outlined
holes.
Besides the new wat parser and outlining, this new utility can also eventually
be used in the binary parser and to convert from Poppy IR back to Binaryen IR if
that ever becomes necessary.
To simplify this initial change, IRBuilder exposes the same interface as the
code it replaces in the wat parser. A future change requiring more extensive
changes to the wat parser will simplify this interface. Also, since the new code
is tested only via the new wat parser, it only supports building instructions
that were already supported by the new wat parser to avoid trying to support any
instructions without corresponding testing. Implementing support for the
remaining instructions is left as future work.
|
|
|
|
|
|
|
|
|
|
|
| |
* Allow empty `then` and `else` clauses
* Allow standard syntax for `ref.test` and `ref.cast`
Fixes #5795
* Allow size immediate in `array.new_fixed`
Fixes #5769
|