| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
WTF-16, i.e. arbitrary sequences of 16-bit values, is the encoding of Java and
JavaScript strings, and using the same encoding makes the interpretation of
string operations trivial, even when accounting for non-ascii characters.
Specifically, use little-endian WTF-16.
Re-encode string constants from WTF-8 to WTF-16 in the parsers, then back to
WTF-8 in the writers. Update the constructor for string `Literal`s to interpret
the string as WTF-16 and store a sequence of WTF-16 code units, i.e. 16-bit
integers. Update `Builder::makeConstantExpression` accordingly to convert from
the new `Literal` string representation back to a WTF-16 string.
Update the interpreter to remove the logic for detecting non-ascii characters
and bailing out. The naive implementations of all the string operations are
correct now that our string encoding matches the JS string encoding.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The only StringEncode we support is the one that writes into an array, so it
has the same effects as ArrayCopy. Precompute needs to be made aware of
such side effects in a manual manner (as we already do for ArrayCopy etc.):
it simply tries to execute code in the interpreter, and if it succeeds it replaces;
it does not check for side effects (checking for side effects would prevent
optimizing cases where the side effects do not happen, as we check them
statically, e.g. dividing by a non-zero constant does not trap but a division
would be seen as having a potential trap effect).
I verified no other string operation is hit by this: all the others
emit or operate on immutable strings; it is just StringEncode that is basically
an Array operation that appears in the Strings proposal.)
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pass does (among other things) this:
(if
condition
X
X
)
=>
(block
(drop
condition
)
X ;; deduplicated
)
After that the condition is now nested in a block, so we may need EH fixups
if it contains a pop.
|
|
|
| |
This omission was able to cause a problem with text round-tripping.
|
| |
|
| |
|
|
|
|
|
| |
properly (#6415)
See WebAssembly/stringref#66
|
|
|
|
|
| |
Like JS string slicing, if the end index is out of bounds that is fine, we clamp to the end.
This also matches the behavior in V8 and the spec.
|
|
|
|
|
|
|
|
| |
This reverts commit 70ac213fce134840609190a5d3a18118a089ba8a.
Reverts #6412
On second thought we found a way to make fixing this less urgent, and the
code size downsides of this are worrying, so let's revert it.
|
|
|
|
| |
Our UTF implementation is still not fully stable it seems as we have reports of
issues. Disable it for now.
|
| |
|
|
|
|
|
|
|
|
|
| |
The interpreter does not run multiple threads, and it was returning 0 from
atomic.wait, which means it was woken up. But it is more correct for it to
return 2, which means it timed out - which is actually the case, as no other
thread exists that can wake it up. However, even that is not good for fuzzing
as the timeout may be infinite or large, so just emit a host limit error on any
timeout for now, until we actually implement threads.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This PR is part of a series that adds basic support for the [typed
continuations/wasmfx proposal](https://github.com/wasmfx/specfx).
This particular PR adds support for the `suspend` instruction for suspending
with a given tag, documented
[here](https://github.com/wasmfx/specfx/blob/main/proposals/continuations/Overview.md#instructions).
These instructions are of the form `(suspend $tag)`. Assuming that `$tag` is
defined with _n_ `param` types `t_1` to `t_n`, the instruction consumes _n_
arguments of types `t_1` to `t_n`. Its result type is the same as the `result`
type of the tag. Thus, the folded textual representation looks like
`(suspend $tag arg1 ... argn)`.
Support for the instruction is implemented in both the old and the new wat
parser.
Note that this PR does not implement validation of the new instruction.
This PR also fixes finalization of `cont.new`, `cont.bind` and `resume` nodes in
those cases where any of their children are unreachable.
|
|
|
|
|
|
|
| |
Our interpreter implementations of `stringview_wtf16.length`,
`stringview_wtf16.get_codeunit`, and `string.encode_wtf16_array` are not
unicode-aware, so they were previously incorrect in the face of multi-byte code
units. As a fix, bail out of the interpretation if there is a non-ascii code
point that would make our naive implementation incorrect.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Similar issue as: #6330
FAILED: src/passes/CMakeFiles/passes.dir/Precompute.cpp.o
/usr/bin/c++ -I/build/binaryen/src/binaryen-version_117/src -I/build/binaryen/src/binaryen-version_117/third_party/llvm-project/include -I/build/binaryen/src/binaryen-version_117/build -march=rv64gc -mabi=lp64d -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -Wp,-D_GLIBCXX_ASSERTIONS -g -ffile-prefix-map=/build/binaryen/src=/usr/src/debug/binaryen -DBUILD_LLVM_DWARF -Wall -Werror -Wextra -Wno-unused-parameter -Wno-dangling-pointer -fno-omit-frame-pointer -fno-rtti -Wno-implicit-int-float-conversion -Wno-unknown-warning-option -Wswitch -Wimplicit-fallthrough -Wnon-virtual-dtor -fPIC -fdiagnostics-color=always -O3 -DNDEBUG -UNDEBUG -std=c++17 -MD -MT src/passes/CMakeFiles/passes.dir/Precompute.cpp.o -MF src/passes/CMakeFiles/passes.dir/Precompute.cpp.o.d -o src/passes/CMakeFiles/passes.dir/Precompute.cpp.o -c /build/binaryen/src/binaryen-version_117/src/passes/Precompute.cpp
In file included from /build/binaryen/src/binaryen-version_117/src/wasm-traversal.h:30,
from /build/binaryen/src/binaryen-version_117/src/pass.h:24,
from /build/binaryen/src/binaryen-version_117/src/ir/intrinsics.h:20,
from /build/binaryen/src/binaryen-version_117/src/ir/effects.h:20,
from /build/binaryen/src/binaryen-version_117/src/passes/Precompute.cpp:30:
In copy constructor ‘wasm::SmallVector<wasm::Expression*, 10>::SmallVector(const wasm::SmallVector<wasm::Expression*, 10>&)’,
inlined from ‘constexpr std::pair<_T1, _T2>::pair(const _T1&, const _T2&) [with _U1 = wasm::Select* const; _U2 = wasm::SmallVector<wasm::Expression*, 10>; typename std::enable_if<(std::_PCC<true, _T1, _T2>::_ConstructiblePair<_U1, _U2>() && std::_PCC<true, _T1, _T2>::_ImplicitlyConvertiblePair<_U1, _U2>()), bool>::type <anonymous> = true; _T1 = wasm::Select* const; _T2 = wasm::SmallVector<wasm::Expression*, 10>]’ at /usr/include/c++/13.2.1/bits/stl_pair.h:559:21,
inlined from ‘T& wasm::InsertOrderedMap<Key, T>::operator[](const Key&) [with Key = wasm::Select*; T = wasm::SmallVector<wasm::Expression*, 10>]’ at /build/binaryen/src/binaryen-version_117/src/support/insert_ordered.h:112:29:
/build/binaryen/src/binaryen-version_117/src/support/small_vector.h:42:38: error: ‘<unnamed>.wasm::SmallVector<wasm::Expression*, 10>::fixed’ is used uninitialized [-Werror=uninitialized]
42 | template<typename T, size_t N> class SmallVector {
| ^~~~~~~~~~~
In file included from /build/binaryen/src/binaryen-version_117/src/passes/Precompute.cpp:38:
/build/binaryen/src/binaryen-version_117/src/support/insert_ordered.h: In function ‘T& wasm::InsertOrderedMap<Key, T>::operator[](const Key&) [with Key = wasm::Select*; T = wasm::SmallVector<wasm::Expression*, 10>]’:
/build/binaryen/src/binaryen-version_117/src/support/insert_ordered.h:112:29: note: ‘<anonymous>’ declared here
112 | std::pair<const Key, T> kv = {k, {}};
| ^~
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
effects (#6395)
Before this PR, when we saw a param was unused we sometimes could not remove it.
For example, if there was one call like this:
(call $target
(call $other)
)
That nested call has effects, so we can't just remove it from the outer call - we'd need to
move it first. That motion was hard to integrate which was why it was left out, but it
turns out that is sometimes very important. E.g. in Java it is common to have such calls
that send the this parameter as the result of another call; not being able to remove such
params meant we kept those nested calls alive, creating empty structs just to have
something to send there.
To fix this, this builds on top of #6394 which makes it easier to move all children out of
a parent, leaving only nested things that can be easily moved around and removed. In
more detail, DeadArgumentElimination/SignaturePruning track whether we run into effects that
prevent removing a field. If we do, then we queue an operation to move the children
out, which we do using a new utility ParamUtils::localizeCallsTo. The pass then does
another iteration after that operation.
Alternatively we could try to move things around immediately, but that is quite hard:
those passes already track a lot of state. It is simpler to do the fixup in an entirely
separate utility. That does come at the cost of the utility doing another pass on the
module and the pass itself running another iteration, but this situation is not the most
common.
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
We incorrectly overrode the string operations in the interpreter's subclasses. But
string operations can be implemented in the topmost class there (as they depend on
no module state), so just implement them there, once, in a proper way.
This fixes StringEq by removing its override, and moves the others to the right
place.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is NFC in the current users, but is necessary functionality for a later
PR.
ChildLocalizer moves children into locals as needed. It used to stop when it
saw the first unreachable. After this change we move such unreachable
children out of the parent as well, making this more uniform: all interacting
effects are moved out, and all that is left nested in the parent can be
moved around and removed as desired.
Also add a getReplacement helper that makes using this easier.
This cannot be tested comprehensively with the current user as that user
will not call this code path on an unreachable parent at all, so this just
adds what can be tested. The later PR will have tests for all corner cases.
|
| |
|
|
|
|
|
| |
Add `[[maybe_unused]]` to variables that are only used in assertions. In builds
without assertions enabled, these were causing compiler errors about unused
variables.
|
|
|
|
|
| |
It was previously possible to opt in to using the legacy GC opcodes with a build
time flag. Now that WasmGC has shipped and users have migrated to the standard
opcodes, remove the option to use the legacy encodings.
|
|
|
|
| |
Previously selects finalized with explicit types would never be marked
unreachable, even when they should have been.
|
|
|
|
|
|
|
| |
When instructions cannot be printed because the children from which they are
supposed to get their type immediates are unreachable or null, we print blocks
of their dropped children followed by unreachables. But the logic for making
this happen was more complicated than necessary and in fact included dead code.
Clean it up.
|
|
|
|
|
|
|
|
|
| |
When the bulk array ops had unreachable or null array types, they were replaced
with blocks, but not using the correct code that also prints all their children
as dropped followed by an unreachable. This meant that the text output in those
cases did not parse as a valid module.
Fix the bug. A follow-up PR will simplify the code to prevent similar bugs from
occurring in the future.
|
|
|
|
| |
Throw errors if tuple arity immediates are less than 2 or if tuple index
immediates are out of bounds.
|
|
|
|
| |
This allows reading a module that requires a particular
feature set. The old API assumed only MVP features.
|
|
|
|
|
|
| |
The fuzzer already had logic to remove all references to non-imported globals
from global initializers and data segment offsets, but it was missing for
element segment offsets. Add it, and also add a missing check line for the new
test that uncovered this bug as initial fuzzer input.
|
|
|
|
|
| |
Due to a typo, the fuzzer was making externrefs when it should have been making
exnrefs. Fix that and also let eh-utils.cpp know that TryTable exists to avoid
an assertion failure.
|
|
|
|
|
|
|
| |
Previously we just printed the offset instruction(s) directly, which is a valid
shorthand only when there is a single instruction. In the case of extended
constant instructions, there can potentially be multiple instructions, in which
case the explicit `offset` clause is required. Print the full clause when
necessary.
|
| |
|
|
|
|
|
| |
Rather than reassembling a tuple from multiple pops, let the pop implementation
assemble the tuple. This produces less code in cases where there is already a
tuple of the proper size on top of the stack. It also simplifies the code.
|
|
|
|
|
|
|
|
|
| |
Add a pass that propagates debug locations to unannotated child and sibling
expressions after parsing. The new parser on its own only attaches debug
locations to directly annotated instructions, but this pass, which we run
unconditionally, emulates the behavior of the previous parser for compatibility
with existing programs. It does unintuitive things to programs using the
non-nested format because it runs on nested Binaryen IR, so we may want to
rethink this at some point.
|
|
|
|
|
|
|
| |
and fix a bug with sourcemap annotations on folded `if` conditions. Update
IRBuilder to apply prologue and epilogue source locations when beginning and ending
a function scope. Add basic support in the parser for explicitly tracking
annotations on module fields, although only do anything with them in the case of
prologue source location annotations.
|
|
|
| |
See #6373
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This PR is part of a series that adds basic support for the [typed
continuations/wasmfx proposal](https://github.com/wasmfx/specfx).
This particular PR adds support for the `cont.bind` instruction for partially
applying continuations, documented
[here](https://github.com/wasmfx/specfx/blob/main/proposals/continuations/Overview.md#instructions).
In short, these instructions are of the form `(cont.bind $ct_before $ct_after)`
where `$ct_before` and `$ct_after` are related continuation types. They must
only differ in the number of arguments, where `$ct_before` has _n_ additional
parameters as compared to `$ct_after`, for some _n_ ≥ 0. The idea is that
`(cont.bind $ct_before $ct_after)` then takes a reference to a continuation of
type `$ct_before` as well as _n_ operands and returns a (reference to a)
continuation of type `$ct_after`. Thus, the folded textual representation looks
like `(cont.bind $ct_before $ct_after arg1 ... argn c)`.
Support for the instruction is implemented in both the old and the new wat
parser.
Note that this PR does not implement validation of the new instruction.
|
| |
|
|
|
|
| |
This new form of the abbreviated memory declaration with inline data is
introduced in the memory64 proposal.
|
|
|
|
|
| |
We previously required a memory to exist while parsing all `StringNew` and
`StringEncode` instructions, even though some variants of the instructions use
GC arrays instead. Require a memory only for those instructions that use one.
|
|
|
|
|
|
|
|
|
|
| |
In some cases we don't print an Expression in full if it is unreachable, so
we print something instead as a placeholder. This happens in unreachable
code when the children don't provide enough info to print the parent (e.g.
a StructGet with an unreachable reference doesn't know what struct type
to use).
This PR prints out the name of the Expression type of such things, which
can help debugging sometimes.
|
|
|
| |
Fixes #6314.
|
|
|
|
|
|
|
| |
(#6359)
I audited all of SubtypingDiscoverer for flow/non-flow constraints and added
some comments to clarify things for our future selves if we ever need to
generalize it.
|
|
|
| |
Adds new visitBreakWithType and visitSwitchWithType functions to the IRBuilder API. These functions work around an assumption in IRBuilder that the module is being traversed in the fully nested format, i.e., that the destination scope of a break or switch has been visited before visiting the break or switch. Instead, the type of the destination scope is passed to IRBuilder.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we do a local.set of a value into a local then we have both a subtyping constraint - for
the value to be valid to put in that local - and also a flow of a value, which can then reach
more places. Such flow then interacts with casts in Unsubtyping, since it needs to know
what can flow where in order to know how casts force us to keep subtyping relations.
That regressed in the not-actually-NFC #6323 in which I added the innocuous lines
to add subtyping constraints in ref.eq. It seems fine to require that the arms of a
RefEq must be of type eqref, but Unsubtyping then assuming those arms flowed into
a location of type eqref... which means casts might force us to not optimize some
things.
To fix this, differentiate the rare case of non-flowing subtyping constraints, which is
basically only RefEq. There are perhaps a few more cases (like i31 operations) but they
do not matter in practice for Unsubtyping anyhow; I suggest we land this first to undo
the regression and then at our leisure investigate the other instructions.
|
| |
|
|
|
|
| |
Previously we lowered this to `getCodePointAt`, which has different semantics
around surrogate pairs.
|
|
|
|
|
|
|
|
|
|
| |
Parse annotations using the standards-track `(@annotation ...)` format as well
as the `;;@ source-map:0:1` format. Have the lexer implicitly collect
annotations while it skips whitespace and add lexer APIs to access the
annotations since the last token was parsed. Collect annotations before parsing
each instruction and pass the annotations explicitly to the parser and parser
context functions for instructions. Add an API to `IRBuilder` to set a debug
location to be attached to the next visited or created instruction and use it
from the parser.
|