| Commit message (Collapse) | Author | Age | Files | Lines |
| |
The code there looks for a "sign-extend": (x << a) >> b where the
right shift is signed. If a = b = 24 for example then that is a sign
extend of an 8-bit value (it works by shifting the 8-bit value's sign bit
to the position of the 32-bit value's sign bit, then shifting all the way
back, which fills everything above 8 bits with the sign bit). The tricky
thing is that in some cases we can handle a != b - but we forgot a
place to check that. Specifically, a repeated sign-extend is not
necessary, but if the outer one has extra shifts, we can't do it.
This is annoyingly complex code, but for purposes of reviewing this
PR, you can see (unless I messed up) that the only change is to
ensure that when we look for a repeated sign extend, then we
only optimize that case when there are no extra shifts. And a
repeated sign-extend is obviously ok to remove:
(((x << a) >> a) << a) >> a => (x << a) >> a
This is an ancient bug, showing how hard it can be to find certain
patterns either by fuzzing or in the real world...
Fixes #3362
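
As a rough illustration of the pattern described above (not the OptimizeInstructions code itself), the following C++ sketch models `(x << a) >> b` on 32-bit values. The helper names `shlShrS` and `signExtend8` are made up for this example. It assumes two's-complement conversion and an arithmetic right shift on signed values, which C++20 guarantees and mainstream compilers provide in practice.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes (x << a) >> b where the right shift is signed.
// The left shift is done on an unsigned value to avoid signed-overflow UB.
int32_t shlShrS(int32_t x, unsigned a, unsigned b) {
  uint32_t shifted = static_cast<uint32_t>(x) << a;
  return static_cast<int32_t>(shifted) >> b;
}

// Sign-extend the low 8 bits of x: the a = b = 24 case.
int32_t signExtend8(int32_t x) { return shlShrS(x, 24, 24); }
```

Here `signExtend8(0xFF)` is -1, and applying `signExtend8` twice gives the same value as applying it once. With an extra shift on the outer operation, for example `shlShrS(signExtend8(0x40), 25, 24)`, the result is -128 rather than 64, so the outer operation is no longer a plain repeat of the inner one.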
|
|
|
|
| |
unreachable (#3413)
| |
Calculate a checksum of the original uninstrumented module and emit it as part
of the profile data. When reading the profile, compare the checksum it contains
to the checksum of the module that is being split. Error out if the module being
split is not the same as the module that was originally instrumented.
Also fixes a bug in how the profile data was being read. When `char` is signed,
bytes read from the profile were being incorrectly sign extended. We had not
noticed this before because the profiles we have tested have contained only
small-valued counts.
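
The sign-extension bug can be sketched as follows. This is only an illustration: the function names are hypothetical, and the little-endian 32-bit layout is an assumption, since the actual profile format is not described here. `signed char` is used explicitly so the behavior does not depend on the platform's `char`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Buggy pattern: each byte is widened while still signed, so a byte such as
// 0x80 becomes 0xFFFFFF80 and corrupts the upper bits of the result.
uint32_t readU32Buggy(const signed char* bytes) {
  uint32_t value = 0;
  for (size_t i = 0; i < 4; ++i) {
    value |= static_cast<uint32_t>(bytes[i]) << (8 * i);
  }
  return value;
}

// Fixed pattern: convert to an unsigned 8-bit value before widening.
uint32_t readU32Fixed(const signed char* bytes) {
  uint32_t value = 0;
  for (size_t i = 0; i < 4; ++i) {
    value |= static_cast<uint32_t>(static_cast<uint8_t>(bytes[i])) << (8 * i);
  }
  return value;
}
```

Both versions agree as long as every byte is below 0x80, which is consistent with the bug going unnoticed on profiles with small counts.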
| |
Extend the splitting logic to handle splitting modules with a single table
segment with a non-const offset. In this situation the placeholder function
names are interpreted as offsets from the table base global rather than absolute
indices into the table. Since addition is not allowed in segment offset
expressions, the secondary module's segment must start at the same place as the
first table's segment. That means that some primary functions must be duplicated
in the secondary segment to fill any gaps. They are exported and imported as
necessary.
| |
For a nested type, we used to print e.g.
(param $x (ref (func (param i32))))
Instead of expanding the full type inline, which can get long for
a deeply nested type, print a name when running the Print pass.
In this example that would be something like
(param $x (ref $i32_=>_none))
|
|
|
|
| |
values (#3399)
| |
bugs (#3401)
* Count signatures in tuple locals.
* Count nested signature types (confirming @aheejin was right, that was missing).
* Inlining was using the wrong type.
* OptimizeInstructions should return -1 for unhandled types, not error.
* The fuzzer should check for ref types as well, not just typed function references,
similar to what GC does.
* The fuzzer now creates a function if it has no other option for creating a constant
expression of a function type, then does a ref.func of that.
* Handle unreachability in call_ref binary reading.
* S-expression parsing fixes in more places, and add a tiny fuzzer for it.
* Switch fuzzer test to just have the metrics, and not print all the fuzz output which
changes a lot. Also fix noprint handling which only worked on binaries before.
* Fix Properties::getLiteral() to use the specific function type properly, and make
Literal's function constructor require that, to prevent future bugs.
* Turn all input types into nullable types, for now.
|
|
|
|
|
|
| |
Read the profiles produced by wasm-split's instrumentation to guide splitting.
In this initial implementation, all functions that the profile shows to have
been called are kept in the initial module. In the future, users may be able to
tune this so that functions that are run later will still be split out.
|
|
|
|
|
|
|
|
| |
Includes minimal support in various passes. Also includes actual optimization
work in Directize, which was easy to add.
Almost has fuzzer support, but the actual makeCallRef is just a stub so far.
Includes s-parser support for parsing typed function references types.
|
| |
| |
types (#3388)
This adds the new feature and starts to use the new types where relevant. We
use them even without the feature being enabled, as we don't know the features
during wasm loading - but the hope is that given the type is a subtype, it should
all work out. In practice, if you print out the internal type you may see a typed
function reference-specific type for a ref.func for example, instead of a generic
funcref, but it should not affect anything else.
This PR does not support non-nullable types, that is, everything is nullable
for now. As suggested by @tlively this is simpler for now and leaves nullability
for later work (which will apparently require let or something else, and many
passes may need to be changed).
To allow this PR to work, we need to provide a type on creating a RefFunc. The
wasm-builder.h internal API is updated for this, as are the C and JS APIs,
which are breaking changes. cc @dcodeIO
We must also write and read function types properly. This PR improves
collectSignatures to find all the types, and also to sort them by the
dependencies between them (as we can't emit X in the binary if it depends
on Y, and Y has not been emitted - we need to give Y's index). This sorting
ends up changing a few test outputs.
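
The dependency ordering described above can be sketched as a depth-first topological sort. This is a generic sketch, not the collectSignatures implementation: `sortByDependencies` and `visit` are hypothetical names, types are represented as plain indices, and the dependency graph is assumed to be acyclic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// deps[i] lists the indices that item i depends on. Assumes no cycles.
static void visit(size_t i, const std::vector<std::vector<size_t>>& deps,
                  std::vector<bool>& done, std::vector<size_t>& order) {
  if (done[i]) return;
  done[i] = true;
  for (size_t d : deps[i]) visit(d, deps, done, order);
  order.push_back(i);  // all dependencies of i are already in `order`
}

// Returns an emission order in which every item appears after its dependencies.
std::vector<size_t> sortByDependencies(
    const std::vector<std::vector<size_t>>& deps) {
  std::vector<bool> done(deps.size(), false);
  std::vector<size_t> order;
  for (size_t i = 0; i < deps.size(); ++i) visit(i, deps, done, order);
  return order;
}
```

For `deps = {{1}, {2}, {}}` (item 0 depends on 1, which depends on 2), the returned order is 2, 1, 0, so each item's index is known before anything that refers to it is emitted.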
InstrumentLocals support for printing function types that are not funcref
is disabled for now, until we figure out how to make that work and/or
decide if it's important enough to work on.
The fuzzer has various fixes to emit valid types for things (mostly
whitespace there). Also two drive-by fixes to call makeTrivial where it
should be (when we fail to create a specific node, we can't just try to make
another node, in theory it could infinitely recurse).
The binary writing changes here replace calls to a standalone function that
writes out a type with calls to a method on the binary writer object itself,
which maintains a mapping of type indexes (getFunctionSignatureByIndex).
| |
Implement an instrumentation pass that records the timestamp at which each
defined function is first called. Timestamps are not actual time, but rather
snapshots of a monotonically increasing counter. The instrumentation exports a
function that the embedder can call to dump the profile data into a memory
buffer at a given offset and size. The function returns the total size of the
profile data so the embedder can know how much to read out of the buffer or how
much it needs to grow the buffer.
Parsing and using the profile is left as future work, as is recording a hash of
the input file that will be used to guard against accidentally instrumenting one
module and trying to use the resulting profile to split a different module.
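
A minimal model of the behavior described above, written as ordinary C++ rather than as injected wasm. The type `FirstCallProfile` and its members are hypothetical. The source does not say what happens when the buffer is too small, so copying nothing in that case is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct FirstCallProfile {
  uint32_t counter = 0;             // monotonically increasing, not wall-clock time
  std::vector<uint32_t> firstCall;  // 0 means "never called"

  explicit FirstCallProfile(size_t numFuncs) : firstCall(numFuncs, 0) {}

  // Record a timestamp only the first time a function is called.
  void onCall(size_t func) {
    if (firstCall[func] == 0) firstCall[func] = ++counter;
  }

  // Copy the profile into `buffer` if it fits, and always return the total
  // size so the caller can tell whether to retry with a larger buffer.
  size_t dump(uint8_t* buffer, size_t size) const {
    size_t total = firstCall.size() * sizeof(uint32_t);
    if (total > 0 && total <= size) std::memcpy(buffer, firstCall.data(), total);
    return total;
  }
};
```

An embedder in this model could call `dump(nullptr, 0)` first to learn the required size, then allocate a buffer and call `dump` again.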
| |
Implement an initial version of the wasm-split tool, which splits modules into a
primary module and a secondary module that can be instantiated after the primary
module. Eventually, this tool will be able to not only split modules, but also
instrument modules to collect profiles that will be able to guide later
splitting. In this initial version, however, wasm-split can neither perform
instrumentation nor consume any kind of profile data.
Despite those shortcomings, this initial version of the tool is already able to
perform module splitting according to function lists manually provided by the
user via the command line. Follow-up PRs will implement the stubbed out
instrumentation and profile consumption functionality.
| |
We did not really model the effects of unreachable properly before. It
always traps, so it's not an implicit trap, but we didn't do anything but mark
it as "branches out", which is not really enough, as while yes it does
branch inside the current function, it also traps which is noticeable outside.
To fix that, add a trap effect to track this. implicitTrap will set trap as well,
automatically, if we do not ignore implicit traps, so it is enough to check just
that (unless one cares about the difference between implicit and explicit
ones).
| |
If we take a reference of a function, it is dangerous to change the function's
type (which removing dead arguments does), as that would be an observable
difference from the outside - the type changes, and some params are now ignored,
and others are reordered.
In theory we could find out if the reference does not escape, but that's not
trivial.
Related to #3378 but not quite the same.
| |
lit and FileCheck are the tools used to run the majority of tests in LLVM. Each
lit test file contains the commands to be run for that test, so lit tests are
much more flexible and can be more precise than our current ad hoc testing
system. FileCheck reads expected test output from comments, so it allows test
output to be written alongside and interspersed with test input, making tests
more readable and precise than in our current system.
This PR adds a new suite to check.py that runs lit tests in the test/lit
directory. A few tests have been ported to demonstrate the features of the new
test runner.
This change is motivated by a need for greater flexibility in testing wasm-split.
See #3359.
| |
- atomic.notify -> memory.atomic.notify
- i32.atomic.wait -> memory.atomic.wait32
- i64.atomic.wait -> memory.atomic.wait64
See WebAssembly/threads#149.
This renames instruction name printing but not the internal data
structure names, such as `AtomicNotify`, which are not always the same
as printed instruction names anyway. This also does not modify C API.
But this fixes interface functions in binaryen.js because it seems
binaryen.js's interface functions all follow the corresponding
instruction names.
| |
We used to check if a load's sign matters before hashing it. If the load does
not extend, then the sign doesn't matter, and we ignored the value there. It
turns out that value could be garbage, as we didn't assign it in the binary
reader, if it wasn't relevant. In the rewrite this was missed, and actually it's
not really possible to do, since we have just a macro for the field, but not the
object it is on, of which there may be more than one.
To fix this, just always assign the field. This is simpler anyhow, and avoids
confusion not just here but probably when debugging.
The testcase here is reduced from the fuzzer, and is not a 100% guarantee
to catch a read of uninitialized memory, but it can't hurt, and with ASan it
may be pretty consistent.
|
| |
| |
Adds the capability to programmatically split a module into a primary and
secondary module such that the primary module can be compiled and run before the
secondary module has been instantiated. All calls to secondary functions (i.e.
functions that have been split out into the secondary module) in the primary
module are rewritten to be indirect calls through the table. Initially, the
table slots of all secondary functions contain references to imported
placeholder functions. When the secondary module is instantiated, it will
automatically patch the table to insert references to the original functions.
The process of module splitting involves these steps:
1. Create the new secondary module.
2. Export globals, events, tables, and memories from the primary module and
import them in the secondary module.
3. Move the deferred functions from the primary to the secondary module.
4. For any secondary function exported from the primary module, export in
its place a trampoline function that makes an indirect call to its
placeholder function (and eventually to the original secondary function),
allocating a new table slot for the placeholder if necessary.
5. Rewrite direct calls from primary functions to secondary functions to be
indirect calls to their placeholder functions (and eventually to their
original secondary functions), allocating new table slots for the
placeholders if necessary.
6. For each primary function directly called from a secondary function, export
the primary function if it is not already exported and import it into the
secondary module.
7. Replace all references to secondary functions in the primary module's table
segments with references to imported placeholder functions.
8. Create new active table segments in the secondary module that will replace
all the placeholder function references in the table with references to
their corresponding secondary functions upon instantiation.
Functions can be used or referenced three ways in a WebAssembly module: they can
be exported, called, or placed in a table. The above procedure introduces a
layer of indirection to each of those mechanisms that removes all references to
secondary functions from the primary module but restores the original program's
semantics once the secondary module is instantiated. As more mechanisms that
reference functions are added in the future, such as ref.func instructions, they
will have to be modified to use a similar layer of indirection.
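
The table indirection can be modeled in plain C++ as a rough analogy. Everything named here (`PrimaryModule`, `exportedEntry`, `instantiateSecondary`) is hypothetical, and a placeholder that throws is only one possible embedder behavior, not something the source specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

using Func = std::function<int(int)>;

struct PrimaryModule {
  std::vector<Func> table;     // stands in for the wasm function table
  size_t secondarySlot = 0;    // slot allocated for the split-out function

  PrimaryModule() {
    // Initially the slot holds an imported placeholder.
    table.push_back([](int) -> int {
      throw std::runtime_error("secondary module not instantiated");
    });
  }

  // A primary function whose direct call to the secondary function has been
  // rewritten into an indirect call through the table.
  int exportedEntry(int x) { return table[secondarySlot](x) + 1; }
};

// Stands in for the secondary module's active segment, which overwrites the
// placeholder with the real function on instantiation.
void instantiateSecondary(PrimaryModule& primary) {
  primary.table[primary.secondarySlot] = [](int x) { return x * 2; };
}
```

Before `instantiateSecondary` runs, `exportedEntry` reaches the placeholder. Afterwards the same call site reaches the real function without any change to the primary module's code.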
The code as currently written makes a few assumptions about the module that is
being split:
1. It assumes that mutable-globals is allowed. This could be worked around by
introducing wrapper functions for globals and rewriting secondary code that
accesses them, but now that mutable-globals is shipped on all browsers,
hopefully that extra complexity won't be necessary.
2. It assumes that all table segment offsets are constants. This simplifies the
generation of segments to actively patch in the secondary functions without
overwriting any other table slots. This assumption could be relaxed by 1)
having secondary segments re-write primary function slots as well, 2)
allowing addition in segment offsets, or 3) synthesizing a start function to
modify the table instead of using segments.
3. It assumes that each function appears in the table at most once. This isn't
necessarily true in general or even for LLVM output after function
deduplication. Relaxing this assumption would just require slightly more
complex code, so it is a good candidate for a follow-up PR.
Future Binaryen work for this feature includes providing a command line tool
exposing this functionality as well as C API, JS API, and fuzzer support. We
will also want to provide a simple instrumentation pass for finding dead or
late-executing functions that would be good candidates for splitting out. It
would also be good to integrate that instrumentation with future function
outlining work so that dead or exceptional basic blocks could be split out into
a separate module.
| |
OptimizeInstructions is seeing the most work these days, so it's good for
the fuzzer to focus on that some more.
Also move some code around in the main test wast: it's useful to put each
feature in its own module to maximize the chance of getting them to be used.
That is, if a module has a single use of atomics, then if atomics are disabled
in the current run, we can't use any of the module and we skip initial contents
entirely. Moving each feature to its own module reduces that risk. (We do
pick randomly between the modules, and atm a small module has the same
chance as a big one, but this still seems worth it.)
|
|
|
|
| |
See discussion in #3303
| |
X - Y <= 0
=>
X <= Y
That is true mathematically, but not in the case of an overflow, e.g.
X=10, Y=0x8000000000000000. X - Y is a negative number, so
X - Y <= 0 is true. But it is not true that X <= Y (as Y is negative, but
X is not).
See discussion in #3303 (comment)
The actual regression was in #3275, but the fuzzer had an easier time
finding it due to #3303
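
The counterexample above can be checked directly. This sketch uses hypothetical helper names and performs the subtraction on unsigned values so that it wraps in two's complement, as i64.sub does, without relying on signed overflow in C++.

```cpp
#include <cassert>
#include <cstdint>

// Two's-complement wrapping subtraction, as i64.sub behaves in WebAssembly.
int64_t wrappingSub(int64_t x, int64_t y) {
  return static_cast<int64_t>(static_cast<uint64_t>(x) -
                              static_cast<uint64_t>(y));
}

// The original expression: X - Y <= 0
bool beforeRewrite(int64_t x, int64_t y) { return wrappingSub(x, y) <= 0; }

// The incorrectly "simplified" expression: X <= Y
bool afterRewrite(int64_t x, int64_t y) { return x <= y; }
```

With X = 10 and Y = INT64_MIN (0x8000000000000000), `beforeRewrite` is true while `afterRewrite` is false. For inputs that do not overflow, such as X = 3 and Y = 5, the two agree.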
|
|
|
|
|
|
| |
We mistakenly tried to run all passes there, but should run only
the function ones.
Fixes #3333
|
|
|
|
|
|
|
|
|
| |
This is because we may need to reference the segments
during the start function. For example in the case of
pthreads we conditionally load passive segments during
start.
Tested in emscripten with: tests/runner.py wasm2js1
| |
Expands on #3294:
* Scope names must be distinguished as either defs or uses.
* Error when a core #define is missing, which is less error-prone, as
suggested by @tlively
* Add DELEGATE_GET_FIELD which lets one define "get the field"
once and then all the loops can use it. This helps avoid boilerplate for
loops at least in some cases (when there is a single object on which
to get the field).
With those, it is possible to replace boilerplate in comparisons and
hashing logic. This also fixes a bug where BrOnExn::sent was not
scanned there.
Add some unit tests for hashing. We didn't have any, and hashing can be
subtly wrong without observable external effects (just more collisions).
|
|
|
|
|
| |
The asmFunc now sets the outer scope's `bufferView` variable
as well as its own internal views.
|
|
|
|
|
|
| |
bool(i32(x) % C_pot) -> bool(i32(x) & (C_pot - 1))
bool(i32(x) % min_s) -> bool(i32(x) & max_s)
For all other situations we already do this for (i32|i64).rem_s
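
A small check of why the rewrite is valid only in a boolean context. The helper names are hypothetical. Note that for negative x the two expressions produce different values (for example, -3 % 8 is -3 while -3 & 7 is 5), but they are zero for exactly the same inputs.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Truthiness of a signed remainder, i.e. bool(i32(x) % c).
// c must not be 0 or -1 in this sketch.
bool remIsNonZero(int32_t x, int32_t c) { return (x % c) != 0; }

// Truthiness of the masked value, i.e. bool(i32(x) & mask).
bool maskIsNonZero(int32_t x, int32_t mask) {
  return (static_cast<uint32_t>(x) & static_cast<uint32_t>(mask)) != 0;
}

// Check that both forms agree for every x in [lo, hi].
bool agreeOnRange(int32_t lo, int32_t hi, int32_t c, int32_t mask) {
  for (int64_t x = lo; x <= hi; ++x) {
    int32_t v = static_cast<int32_t>(x);
    if (remIsNonZero(v, c) != maskIsNonZero(v, mask)) return false;
  }
  return true;
}
```

Calling `agreeOnRange(-1000, 1000, 8, 7)` exercises the power-of-two case, and passing the minimum and maximum signed 32-bit values as `c` and `mask` exercises the `min_s` case.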
|
| |
|
|
|
|
|
|
|
| |
Using addition in more places is better for gzip, and helps simplify the
optimizer as well.
Add a FinalOptimizer phase to do optimizations like our signed LEB tweaks, to
reduce binary size in the rare case when we do want a subtraction.
|
|
|
| |
Specifically try to clean up use of asm_v_wasm.h and asmjs constants.
|
|
|
| |
We no longer build modules that import `global.Math`.
|
| |
|
| |
|
|
|
|
|
|
|
| |
Division and remainder do not have an implicit trap if the
right-hand side is a constant and not one of the dangerous
values there.
Also refactor ignoreImplicitTrap handling for clarity.
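
One way to express the constant-divisor check, based on WebAssembly's integer semantics rather than on the actual Binaryen code. `DivOp` and `constantDivisorCanTrap` are hypothetical names, and the real check may be more conservative about which constants it treats as dangerous.

```cpp
#include <cassert>
#include <cstdint>

enum class DivOp { DivS, DivU, RemS, RemU };

// Returns true if `x op c` can trap for some x, given the constant divisor c.
bool constantDivisorCanTrap(DivOp op, int32_t c) {
  // Any division or remainder by zero traps.
  if (c == 0) return true;
  // Signed division overflows (and traps) for INT32_MIN / -1.
  if (op == DivOp::DivS && c == -1) return true;
  return false;
}
```

In this model, a division or remainder whose right-hand side is any other constant has no implicit trap and can be treated like ordinary arithmetic.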
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can still make x * -1.0 cheaper for non-fastMath mode as:
x * -1.0 -> -0.0 - x
Should at least help baseline compilers.
Also it could enable further optimizations, e.g.:
a + b * -1
a + (-0.0 - b)
(a - 0.0) - b
a - b
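
The first rewrite can be checked bit-for-bit on a few non-NaN values. The helper names are hypothetical, and the sketch assumes IEEE 754 doubles compiled without fast-math flags. It also shows why the constant has to be -0.0 and not 0.0.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Raw bit pattern of a double, so that +0.0 and -0.0 compare as different.
uint64_t bitsOf(double d) {
  uint64_t bits;
  std::memcpy(&bits, &d, sizeof bits);
  return bits;
}

double mulByNegOne(double x) { return x * -1.0; }
double subFromNegZero(double x) { return -0.0 - x; }
double subFromPosZero(double x) { return 0.0 - x; }  // differs for x = +0.0
```

For x = +0.0, `mulByNegOne` and `subFromNegZero` both give -0.0, while `subFromPosZero` gives +0.0.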
|
|
|
| |
We can only pack memory if we know it is zero-filled before us.
|
|
|
|
|
| |
Selectify turns an if-else into a select where possible. Previously we abandoned
hope if any part of the if had a side effect. But it's fine for the condition to have a
side effect, so long as moving it to the end doesn't invalidate the arms.
|
|
|
|
|
|
|
| |
Make select cost more realistic - it should be as good as a jmp, as in an if.
Add missing child visiting.
Shorten repetitive cases in switches.
|
| |
|
|
|
|
|
|
|
| |
Specifically, pick a simple positive canonical NaN as the NaN output, when the output
is a NaN. This is the same as what tools like wabt do.
This fixes a testcase found by the fuzzer on #3289 but it was not that PR's fault.
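
A sketch of what picking a positive canonical NaN can look like for f32. The helper names are hypothetical, and the source does not say which operation this applies to. The bit pattern 0x7fc00000 (sign clear, quiet bit set, remaining payload zero) is the positive canonical f32 NaN in WebAssembly's terms.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

uint32_t floatBits(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

float floatFromBits(uint32_t bits) {
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// If the value is a NaN, replace it with the positive canonical NaN;
// otherwise return it unchanged.
float canonicalizeNaN(float f) {
  return std::isnan(f) ? floatFromBits(0x7fc00000u) : f;
}
```

A negative NaN with a nonzero payload, such as the one with bits 0xffc12345, comes out as 0x7fc00000, and non-NaN values pass through untouched.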
|
|
|
| |
Fixed bug in memory64-lowering pass for memory.size/grow
|
|
|
|
|
| |
`C1 - (x + C2)` -> `(C1 - C2) - x`
`C1 - (x - C2)` -> `(C1 + C2) - x`
`C1 - (C2 - x)` -> `x + (C1 - C2)`
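
These identities hold under wrapping arithmetic, which can be checked with unsigned 32-bit values in C++. The function names are hypothetical. The point of each rewrite is that the parenthesized constant expression on the right can be folded at compile time.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned arithmetic wraps modulo 2^32, matching i32 add/sub in WebAssembly.

// C1 - (x + C2)  ->  (C1 - C2) - x
bool foldSubOfAdd(uint32_t c1, uint32_t x, uint32_t c2) {
  return c1 - (x + c2) == (c1 - c2) - x;
}

// C1 - (x - C2)  ->  (C1 + C2) - x
bool foldSubOfSub(uint32_t c1, uint32_t x, uint32_t c2) {
  return c1 - (x - c2) == (c1 + c2) - x;
}

// C1 - (C2 - x)  ->  x + (C1 - C2)
bool foldSubOfRevSub(uint32_t c1, uint32_t x, uint32_t c2) {
  return c1 - (c2 - x) == x + (c1 - c2);
}

bool allFoldsAgree(uint32_t c1, uint32_t x, uint32_t c2) {
  return foldSubOfAdd(c1, x, c2) && foldSubOfSub(c1, x, c2) &&
         foldSubOfRevSub(c1, x, c2);
}
```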
|
|
|
|
| |
Without this, we might think a function has no global uses if the only
global use of it is the start.
|
|
|
|
|
|
| |
Emscripten no longer needs this information as of
https://github.com/emscripten-core/emscripten/pull/12643.
This also removes the need to export __data_end.
|
| |
|
|
|
| |
Followup to #3276
|
| |
|
|
|
|
|
|
|
|
| |
Fixes a fuzz bug that was triggered by
https://github.com/WebAssembly/binaryen/pull/3015#issuecomment-718001620
but was actually a pre-existing bug in pow2 that the PR just happened
to uncover.
|
|
|
|
|
|
|
| |
Including saturating, rounding Q15 multiplication as proposed in
https://github.com/WebAssembly/simd/pull/365 and extending multiplications as
proposed in https://github.com/WebAssembly/simd/pull/376. Since these are just
prototypes, skips adding them to the C or JS APIs and the fuzzer, as well as
implementing them in the interpreter.
|