| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
| |
In asm2wasm we modelled debugInfo using special imports. And we tried
to not move them around much. Current debugInfo is tracked on
instructions and is not affected by removing this.
This may have some tiny effect beneficial effect on code size in debug
builds, perhaps.
|
|
|
|
| |
We don't use those effects now in any way, and if we need them some day
we can add them back. For now they just add overhead and complexity.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for try-delegate in `EffectAnalyzer`. Without this
support, the expresion below has been incorrectly classified as "cannot
throw", because the previous code considered everything inside
`try`-`catch_all` as "cannot throw". This is not the case when there is
a `delegate` that can bypass the `catch_all`.
```wasm
try $l0
try
try
throw $e
delegate $l0
catch_all
end
end
|
|
|
|
|
|
|
|
|
| |
We marked that as only trapping if the input as nullable. But ref.as_func
will trap if it isn't a func, for example.
We could in theory try to check if a trap is possible, like checking if the
input is already non-nullable or already a function, etc., but we have
optimization passes to get rid of RefAs when they are not needed
anyhow, so there is no point to duplicate that here.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Similar to what we do with structs, if a global is immutable then we know it
cannot interact with calls.
This changes the JS API for getSideEffects(). That was actually broken,
as passing in the optional module param would just pass it along to the
compiled C code, so it was coerced to 0 or 1, and not a pointer to a module.
To fix that, this now does module.ptr to actually get the pointer, and this is
now actually tested as without a module we cannot compute the effects of a
global. This PR also makes the module param mandatory in the JS API,
as again, without a module we can't compute global effects. (The module
param has already been mandatory in the C++ API for some time.)
|
| |
|
|
|
|
|
|
| |
This is the easy part of using immutability more: Just note immutable
fields as such when we read from them, and then a write to a struct
does not interfere with such reads. That is, only a read from a mutable
field can notice the effect of a write.
|
| |
|
| |
|
|
|
|
| |
Adds the part of the spec test suite that this passes (without table.set we
can't do it all).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An "intrinsic" is modeled as a call to an import. We could also add new
IR things for them, but that would take more work and lead to less
clear errors in other tools if they try to read a binary using such a
nonstandard extension.
A first intrinsic is added here, call.without.effects This is basically the same
as call_ref except that the optimizer is free to assume the call has no
side effects. Consequently, if the result is not used then it can be optimized
out (as even if it is not used then side effects could have kept it around).
Likewise, the lack of side effects allows more reordering and other
things.
A lowering pass for intrinsics is provided. Rather than automatically
lower them to normal wasm at the end of optimizations, the user must
call that pass explicitly. A typical workflow might be
-O --intrinsic-lowering -O
That optimizes with the intrinsic present - perhaps removing calls
thanks to it - then lowers it into normal wasm - it turns into a call_ref -
and then optimizes further, which would turns the call_ref into a
direct call, potentially inline, etc.
|
|
|
|
|
|
|
| |
array.init is like array.new_with_rtt except that it takes
as arguments the values to initialize the array with (as opposed to
a size and an optional initial value).
Spec: https://docs.google.com/document/d/1afthjsL_B9UaMqCA5ekgVmOm75BVFu6duHNsN9-gnXw/edit#
|
|
|
|
|
|
| |
This appeared to be a regression from #4117, however this
was always a bug, and that PR just exposed it. That is,
somehow we forgot to indicate the effects of ArrayCopy, and
after that PR we'd vacuum it out incorrectly.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We had already replaced the check on drop, but we can also
use that mode on all the other things there, as the pass never
does reorderings of things - it just removes them.
For example, the pass can now remove part of a dropped thing,
(drop (struct.get (foo)))
=>
(drop (foo))
In this example the struct.get can be removed, even if the foo
can't.
|
|
|
|
|
|
|
|
|
|
|
| |
This finishes the refactoring started in #4115 by doing the
same change to pass a Module into EffectAnalyzer instead of
features. To do so this refactors the fallthrough API and a few
other small things. After those changes, this PR removes the
old feature constructor of EffectAnalyzer entirely.
This requires a small breaking change in the C API, changing
BinaryenExpressionGetSideEffects's feature param to a
module. That makes this change not NFC, but otherwise it is.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Knowing the module will allow us to do more analysis in the
effect analyzer. For now, this just refactors the code to allow
providing a module instead of features, and to infer the
features from the module. This actually shortens the code in
most places which is nice (just pass module instead of
module->features).
This modifies basically all callers to use the new module
form, except for the fallthrough logic. That would require
some more refactoring, so to keep this PR reasonably small
that is not yet done.
|
|
|
|
|
|
|
|
|
|
|
| |
This allows common patterns in J2CL to be optimized, where we write
to various array indices and get the values or the reference from a
struct.
It would be nice to do even better here, and look at actually specific
types, but I think we should be careful to keep the runtime constant.
That seems hard to do if we accumulate a list of types and do
Type::isSubType on them etc. But maybe someone has a better
idea than this PR?
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The goal of this mode is to remove obviously-unneeded code like
(drop
(i32.load
(local.get $x)))
In general we can't remove it, as the load might trap - we'd be removing
a side effect. This is fairly rare in general, but actually becomes quite
annoying with wasm GC code where such patterns are more common,
and we really need to remove them.
Historically the IgnoreImplicitTraps option was meant to help here. However,
in practice it did not quite work well enough for most production code, as
mentioned e.g. in #3934 . TrapsNeverHappen mode is an attempt to fix that,
based on feedback from @askeksa in that issue, and also I believe this
implements an idea that @fitzgen mentioned a while ago (sorry, I can't
remember where exactly...). So I'm hopeful this will be generally useful
and not just for GC.
The idea in TrapsNeverHappen mode is that traps are assumed to not
actually happen at runtime. That is, if there is a trap in the code, it will
not be reached, or if it is reached then it will not trap. For example, an
(unreachable) would be assumed to never be reached, which means
that the optimizer can remove it and any code that executes right before
it:
(if
(..condition..)
(block
(..code that can be removed, if it does not branch out..)
(..code that can be removed, if it does not branch out..)
(..code that can be removed, if it does not branch out..)
(unreachable)))
And something like a load from memory is assumed to not trap, etc.,
which in particular would let us remove that dropped load from earlier.
This mode should be usable in production builds with assertions
disabled, if traps are seen as failing assertions. That might not be true
of all release builds (maybe some use traps for other purposes), but
hopefully in some. That is, if traps are like assertions, then enabling
this new mode would be like disabling assertions in release builds
and living with the fact that if an assertion would have been hit then
that is "undefined behavior" and the optimizer might have removed
the trap or done something weird.
TrapsNeverHappen (TNH) is different from IgnoreImplicitTraps (IIT).
The old IIT mode would just ignore traps when computing effects.
That is a simple model, but a problem happens with a trap behind
a condition, like this:
if (x != 0) foo(1 / x);
We won't trap on integer division by zero here only because of the
guarding if. In IIT, we'd compute no side effects on 1 / x, and then
we might end up moving it around, depending on other code in
the area, and potentially out of the if - which would make it happen
unconditionally, which would break.
TNH avoids that problem because it does not simply ignore traps.
Instead, there is a new hasUnremovableSideEffects() method
that must be opted-in by passes. That checks if there are no side
effects, or if there are, if we can remove them - and we know we can
remove a trap if we are running under TrapsNeverHappen mode,
as the trap won't happen by assumption. A pass must only use that
method where it is safe, that is, where it would either remove the
side effect (in which case, no problem), or if not, that it at least does
not move it around (avoiding the above problem with IIT).
This PR does not implement all optimizations possible with
TNH, just a small initial set of things to get started. It is already
useful on wasm GC code, including being as good as IIT on removing
unnecessary casts in some cases, see the test suite updates here.
Also, a significant part of the 18% speedup measured in
#4052 (comment)
is due to my testing with this enabled, as otherwise the devirtualization
there leaves a lot of unneeded code.
|
|
|
|
|
|
|
|
| |
Spec for it is here:
https://docs.google.com/document/d/1DklC3qVuOdLHSXB5UXghM_syCh-4cMinQ50ICiXnK3Q/edit#
Also reorder some things in wasm.h that were not in the canonical order (that has
no effect, but it is confusing to read).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Effects are fine in the moved code, if we are doing so on an if
(which runs just one arm anyhow).
Allow unreachable, which lets us hoist returns for example.
Allow none, which lets us hoist drop and call for example. For
this we also need to be careful with subtyping, as at least drop
is polymorphic, so the child types may not have an LUB (see
example in code).
Adds a small ShallowEffectAnalyzer child of EffectAnalyzer that
calls visit to just do a shallow analysis (instead of walk which
walks the children).
|
|
|
|
|
|
|
|
|
|
| |
(#3559)
The spec does not mention traps here, but this is like a JS VM trapping on
OOM - a runtime limitation is reached.
As these are not specced traps, I did not add them to effects.h. Note how
as a result the optimizer happily optimizes into a nop an unused allocation of an
array of size unsigned(-1), which is the behavior we want.
|
|
|
|
| |
Also removes experimental SIMD instructions that were not included in the final
spec proposal.
|
|
|
|
|
|
| |
We missed that in effects.h, with the result that sets could look like
they had no side effects.
Fixes #3754
|
|
|
|
|
|
|
|
|
|
|
| |
Just as reads and writes to memory can interfere with each other, reads and
writes of GC objects can interfere with each other. This PR adds new `readsHeap`
and `writesHeap` fields to EffectAnalyzer to account for this interference. Note
that memory accesses can never alias with GC heap accesses, so they are
considered separately. Similarly, it would be possible to prove that different
GC heap accesses never interfere with each other based on the accessed types,
but that's left to future work.
Fixes #3655.
|
|
|
|
|
|
|
|
| |
As proposed in https://github.com/WebAssembly/simd/pull/395. Note that the other
instructions in the proposal have not been implemented in LLVM or in V8, so
there is no need to implement them in Binaryen right now either. This PR
introduces a new expression class for the new instructions because they uniquely
take an immediate argument identifying which portion of the input vector to
widen.
|
|
|
|
|
|
|
|
| |
This expands the existing BrOnCast into BrOn that can also handle the
func/data/i31 variants. This is not as elegant as RefIs / RefAs in that BrOnCast
has an extra rtt field, but I think it is still the best option. We already have optional
fields on Break (the value and condition), so making rtt optional is not odd. And
it allows us to share all the behavior of br_on_* which aside from the cast or the
check itself, is identical - returning the value if the branch is not taken, etc.
|
|
|
|
|
|
|
|
| |
These are similar to is, but instead of returning an i32 answer, they trap on
an invalid value, and return it otherwise.
These could in theory be in a single RefDoThing, with opcodes for both As
and Is, but as the return values are different, that would be a little odd, and
the name would be less clear.
|
|
|
|
|
|
|
|
| |
This internal refactoring prepares us for ref.is_func/data/i31, by renaming
the node and adding an "op" field. For now that field must always be "Null"
which means it is a ref.is_null.
This adjusts the C API to match the new IR shape. The high-level JS API
is unchanged.
|
|
|
| |
This removes `exnref` type and `br_on_exn` instruction.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This updates `try`-`catch`-`catch_all` and `rethrow` instructions to
match the new spec. `delegate` is not included. Now `Try` contains not a
single `catchBody` expression but a vector of catch
bodies and events.
This updates most existing routines, optimizations, and tests modulo the
interpreter and the CFG traversal. Because the interpreter has not been
updated yet, the EH spec test is temporarily disabled in check.py. Also,
because the CFG traversal for EH is not yet updated, several EH tests in
`rse_all-features.wast`, which uses CFG traversal, are temporarily
commented out.
Also added a few more tests in existing EH test functions in
test/passes. In the previous spec, `catch` was catching all exceptions
so it was assumed that anything `try` body throws is caught by its
`catch`, but now we can assume the same only if there is a `catch_all`.
Newly added tests test cases when there is a `catch_all` and cases there
are only `catch`es separately.
|
|
|
|
| |
As proposed in https://github.com/WebAssembly/simd/pull/352, using the opcodes
used in the LLVM and V8 implementations.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The tricky part here, as pointed out by aheejin in my previous attempt, is that
we need to know the type of the value we send if the branch is taken. We can
normally calculate that from the rtt parameter's type - we are casting to that
RTT, so we know what type that is - but if the rtt is unreachable, that's a problem.
To fix that, store the cast type on BrOnCast instructions.
This includes a test with a br_on_cast that succeeds and sends the cast value,
one that fails and passes through the uncast value, and also of one with an
unreachable RTT.
This includes a fix for Precompute, as noticed by that new test. If a break is
taken, with a ref as a value, we can't precompute it - for the same reasons
we can't precompute a ref in general, that it is a pointer to possibly shared
data.
|
|
|
|
| |
This adds enough to read and write them and test that, but leaves
interpreter support for later.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
array.new/get/set/len - pretty straightforward after structs and all the
infrastructure for them.
Also fixes validation of the unnecessary heapType param in the
text and binary formats in structs as well as arrays.
Fixes printing of packed types in type names, which emitted i32
for them. That broke when we emitted the same name for an array
of i8 and i32 as in the new testing here.
Also fix a bug in Field::operator< which was wrong for packed
types; again, this was easy to notice with the new testing.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With struct.new read/write support, we can start to do interesting
things! This adds a test of creating a struct and seeing that references
behave like references, that is, if we write to the value X refers to, and
if Y refers to the same thing, when reading from Y's value we see the
change as well.
The test is run through all of -O1, which uncovered a minor issue in
Precompute: We can't try to precompute a reference type, as we can't
replace a reference with a value.
Note btw that the test shows the optimizer properly running
CoalesceLocals on reference types, merging two locals.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds rtt.canon and rtt.sub together with RTT type support
that is necessary for them. Together this lets us test roundtripping the
instructions and types.
Also fixes a missing traversal over globals in collectHeapTypes,
which the example from the GC docs requires, as the RTTs are in
globals there.
This does not yet add full interpreter support and other things. It
disables initial contents on GC in the fuzzer, to avoid the fuzzer
breaking.
Renames the binary ID for exnref, which is being removed from
the spec, and which overlaps with the binary ID for rtt.
|
|
|
|
|
|
|
|
|
|
| |
Mostly straightforward after struct.get.
This renames the value field in struct.get to ref. I think this makes
more sense because struct.set has both a reference to a thing, and a
value to set onto that thing. So calling the former ref seems more
consistent, giving us ref, value. This mirrors load/store for example
where we use ptr, value, and ref is playing the role of ptr here
basically.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the first instruction that uses a GC Struct or Array, so it's where
we start to actually need support in the interpreter for those values, which
is added here.
GC data is modeled as a gcData field on a Literal, which is just a
Literals. That is, both a struct and an array are represented as an
array of values. The type which is alongside would indicate if it's a
struct or an array. Note that the data is referred to using a shared_ptr
so it should "just work", but we'll only be able to really test that once we
add struct.new and so can verify that references are by reference and
not value, etc.
As the first instruction to care about i8/16 types (which are only possible
in a Struct or Array) this adds support for parsing and emitting them.
This PR includes fuzz fixes for some minor things the fuzzer found, including
some bad printing of not having ResultTypeName in necessary places
(found by the text format roundtripping fuzzer).
|
|
|
|
|
|
|
|
| |
Includes minimal support in various passes. Also includes actual optimization
work in Directize, which was easy to add.
Almost has fuzzer support, but the actual makeCallRef is just a stub so far.
Includes s-parser support for parsing typed function references types.
|
|
|
|
|
|
|
|
|
|
|
| |
We did not really model the effects of unreachable properly before. It
always traps, so it's not an implicit trap, but we didn't do anything but mark
it as "branches out", which is not really enough, as while yes it does
branch inside the current function, it also traps which is noticeable outside.
To fix that, add a trap effect to track this. implicitTrap will set trap as well,
automatically, if we do not ignore implicit traps, so it is enough to check just
that (unless one cares about the difference between implicit and explicit
ones).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#3385)
A user of EffectAnalyzer could call walk or visit, to walk the entire
input or just visit the node without chlidren. But this was unsafe - we just
exposed the Walker/Visitor API here, and did not ensure that users did
the stuff in analyze which does a little stuff before and after. In fact
Vacuum got this wrong.
To avoid that, move all the internals to an internal class. The external
API now only allows the caller to call walk or visit, and both are safe.
The change here is mostly whitespace + adding parent. prefixes.
This is NFC except for fixing possible Vacuum issues (which I am not sure
could happen in practice or not).
|
|
|
|
|
|
|
|
| |
The new writesGlobalState has a name that more clearly indicates what it
actually does: affect global state (that is, memory, globals, the table, etc.).
This removes throw from there, and handles it directly in the single caller
of the method, the licm pass. For simplicity, disallow exceptions in that
pass, leaving it for future work.
|
|
|
|
|
|
|
| |
Division and remainder do not have an implicit trap if the
right-hand side is a constant and not one of the dangerous
values there.
Also refactor ignoreImplicitTrap handling for clarity.
|
|
|
|
|
|
|
| |
These instructions are proposed in https://github.com/WebAssembly/simd/pull/350.
This PR implements them throughout Binaryen except in the C/JS APIs and in the
fuzzer, where it leaves TODOs instead. Right now these instructions are just
being implemented for prototyping so adding them to the APIs isn't critical and
they aren't generally available to be fuzzed in Wasm engines.
|
|
|
| |
NFC, except adding most of the boilerplate for the remaining GC instructions. Each implementation site is marked with a respective `TODO (gc): theInstruction` in between the typical boilerplate code.
|
|
|
| |
Adds the `i31.new` and `i31.get_s/u` instructions for creating and working with `i31ref` typed values. Does not include fuzzer integration just yet because the fuzzer expects that trivial values it creates are suitable in global initializers, which is not the case for trivial `i31ref` expressions.
|
|
|
| |
With `eqref` now integrated, the `ref.eq` instruction can be implemented. The only valid LHS and RHS value is `(ref.null eq)` for now, but implementation and fuzzer integration is otherwise complete.
|
|
|
| |
Aligns the internal representations of `memory.size` and `memory.grow` with other more recent memory instructions by removing the legacy `Host` expression class and adding separate expression classes for `MemorySize` and `MemoryGrow`. Simplifies related APIs, but is also a breaking API change.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We marked it as readsMemory so that it could be reordered with various
things, except for memory.init. However, the fuzzer found that's not quite
right, as it has a global side effect - memory.inits that run later can notice
that. So it can't be reordered with anything that might affect global side
effects from happening, as in the testcase added here (an instruction that
may trap cannot be reordered with a data.drop, as it may prevent the
data.drop from happening and changing global state).
There may be a way to optimize this more carefully that would allow more
optimizations, but as this is a rare instruction I'm not sure it's worth more
work.
|