| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
| |
See https://github.com/WebAssembly/memory64/pull/92
|
|
|
|
| |
As the name of a class, uppercase seems better here.
|
|
|
|
|
| |
Without this the fuzzer can error on differences in behavior between V8 and us.
Also move the limitations constants to their own header.
|
|
|
| |
Followup to #6243 which handled empty ones.
|
|
|
|
|
|
| |
They might trap. Leave that for RemoveUnusedModuleElements.
Fixes #6230
|
|
|
|
|
|
|
| |
`_VECTOR` or `_ARRAY` defines in `wasm-delegations-fields.def` are
supposed to be defined in terms of their non-vector/array counterparts
when undefined. This removes empty `_VECTOR`/`_ARRAY` defines when
including `wasm-delegations-fields.def`, while adding definitions for
`DELEGATE_GET_FIELD` in case it is missing.
|
|
|
| |
Renaming the multimemory flag in Binaryen to match its naming in LLVM.
|
|
|
|
|
|
| |
Fixes emscripten-core/emscripten#17485
This allows emscripten to complie code with MEMORY64 + PTHREADS by
fixing using the proper pointer type in the MemoryPacking pass.
|
|
|
|
|
|
|
|
|
|
|
|
| |
wasm-delegations-fields (#5690)
This makes delegations-fields track Kinds. That is, rather than say a field is
just a Name, we can say it is a name of kind Function. This allows users to
track references to functions, tables, memories, etc., in a simple and generic
way, avoiding duplicated code which we have atm. (In particular this will help
wasm-merge in the future.)
This also uses that functionality in two small places to show the benefits
(see memory-utils.cpp and MemoryPacking.cpp).
|
|
|
|
|
|
|
|
|
|
|
| |
Data/Elem (#5692)
ArrayNewSeg => ArrayNewSegData, ArrayNewSegElem
ArrayInit => ArrayInitData, ArrayInitElem
Basically we remove the opcode and use the class type to differentiate them.
This adds some code but it makes the representation simpler and more compact in
memory, and it will help with #5690
|
|
|
|
|
|
|
|
| |
Do not optimize out or split segments that are referred to array.init_data
instructions. Fixes a bug where segments could get optimized out, producing
invalid modules. Doing the work to actually split segments used by
array.init_data is left for the future.
Also fix a latent UBSan failure revealed by the new test case.
|
|
|
|
|
|
|
|
| |
Previously, the pointer type for newly emitted instructions was determined by
the type of the destination pointer on a memory.init instruction, but that did
not take into account that the destination pointer may be unreachable. Properly
look up the pointer type on the memory instead to fix the problem.
Fixes #5620.
|
|
|
|
|
|
|
|
|
|
| |
All top-level Module elements are identified and referred to by Name, but for
historical reasons element and data segments were referred to by index instead.
Fix this inconsistency by using Names to refer to segments from expressions that
use them. Also parse and print segment names like we do for other elements.
The C API is partially converted to use names instead of indices, but there are
still many functions that refer to data segments by index. Finishing the
conversion can be done in the future once it becomes necessary.
|
|
|
|
|
|
| |
Fix the relevant pointer and size expressions produced by MemoryPacking to be
i64s when working with 64-bit memories.
Fixes #5578.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Update MemoryPacking for array.new_data
The MemoryPacking pass looks at all instructions that reference memory segments
to determine how they can be optimized. #5214 introduced a new instruction that
references memory segments, array.new_data, but did not update MemoryPacking
accordingly. This omission meant that MemoryPacking could produce invalid or
misoptimized modules in the presence of array.new_data.
Fix the problem by making MemoryPacking aware of array.new_data. Consider
array.new_data when determining whether a segment is used and update
array.new_data to reflect the new, optimized segment numberings afterward. To
keep things simple, do not try to split any segment that is referred to by
a array.new_data instruction.
* fix
* Add test explanations
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the goal of supporting null characters (i.e. zero bytes) in strings.
Rewrite the underlying interned `IString` to store a `std::string_view` rather
than a `const char*`, reduce the number of map lookups necessary to intern a
string, and present a more immutable interface.
Most importantly, replace the `c_str()` method that returned a `const char*`
with a `toString()` method that returns a `std::string`. This new method can
correctly handle strings containing null characters. A `const char*` can still
be had by calling `data()` on the `std::string_view`, although this usage should
be discouraged.
This change is NFC in spirit, although not in practice. It does not intend to
support any particular new functionality, but it is probably now possible to use
strings containing null characters in at least some cases. At least one parser
bug is also incidentally fixed. Follow-on PRs will explicitly support and test
strings containing nulls for particular use cases.
The C API still uses `const char*` to represent strings. As strings containing
nulls become better supported by the rest of Binaryen, this will no longer be
sufficient. Updating the C and JS APIs to use pointer, length pairs is left as
future work.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously only WalkerPasses had access to the `getPassRunner` and
`getPassOptions` methods. Move those methods to `Pass` so all passes can use
them. As a result, the `PassRunner` passed to `Pass::run` and
`Pass::runOnFunction` is no longer necessary, so remove it.
Also update `Pass::create` to return a unique_ptr, which is more efficient than
having it return a raw pointer only to have the `PassRunner` wrap that raw
pointer in a `unique_ptr`.
Delete the unused template `PassRunner::getLast()`, which looks like it was
intended to enable retrieving previous analyses and has been in the code base
since 2015 but is not implemented anywhere.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An overview of this is in the README in the diff here (conveniently, it is near the
top of the diff). Basically, we fix up nn locals after each pass, by default. This keeps
things easy to reason about - what validates is what is valid wasm - but there are
some minor nuances as mentioned there, in particular, we ignore nameless blocks
(which are commonly added by various passes; ignoring them means we can keep
more locals non-nullable).
The key addition here is LocalStructuralDominance which checks which local
indexes have the "structural dominance" property of 1a, that is, that each get has
a set in its block or an outer block that precedes it. I optimized that function quite
a lot to reduce the overhead of running that logic after each pass. The overhead
is something like 2% on J2Wasm and 0% on Dart (0%, because in this mode we
shrink code size, so there is less work actually, and it balances out).
Since we run fixups after each pass, this PR removes logic to manually call the
fixup code from various places we used to call it (like eh-utils and various passes).
Various passes are now marked as requiresNonNullableLocalFixups => false.
That lets us skip running the fixups after them, which we normally do automatically.
This helps avoid overhead. Most passes still need the fixups, though - any pass
that adds a local, or a named block, or moves code around, likely does.
This removes a hack in SimplifyLocals that is no longer needed. Before we
worked to avoid moving a set into a try, as it might not validate. Now, we just do it
and let fixups happen automatically if they need to: in the common code they
probably don't, so the extra complexity seems not worth it.
Also removes a hack from StackIR. That hack tried to avoid roundtrip adding a
nondefaultable local. But we have the logic to fix that up now, and opts will
likely keep it non-nullable as well.
Various tests end up updated here because now a local can be non-nullable -
previous fixups are no longer needed.
Note that this doesn't remove the gc-nn-locals feature. That has been useful for
testing, and may still be useful in the future - it basically just allows nn locals in
all positions (that can't read the null default value at the entry). We can consider
removing it separately.
Fixes #4824
|
|
|
|
|
|
|
| |
This PR removes the single memory restriction in IR, adding support for a single module to reference multiple memories. To support this change, a new memory name field was added to 13 memory instructions in order to identify the memory for the instruction.
It is a goal of this PR to maintain backwards compatibility with existing text and binary wasm modules, so memory indexes remain optional for memory instructions. Similarly, the JS API makes assumptions about which memory is intended when only one memory is present in the module. Another goal of this PR is that existing tests behavior be unaffected. That said, tests must now explicitly define a memory before invoking memory instructions or exporting a memory, and memory names are now printed for each memory instruction in the text format.
There remain quite a few places where a hardcoded reference to the first memory persist (memory flattening, for example, will return early if more than one memory is present in the module). Many of these call-sites, particularly within passes, will require us to rethink how the optimization works in a multi-memories world. Other call-sites may necessitate more invasive code restructuring to fully convert away from relying on a globally available, single memory pointer.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Updating wasm.h/cpp for DataSegments
* Updating wasm-binary.h/cpp for DataSegments
* Removed link from Memory to DataSegments and updated module-utils, Metrics and wasm-traversal
* checking isPassive when copying data segments to know whether to construct the data segment with an offset or not
* Removing memory member var from DataSegment class as there is only one memory rn. Updated wasm-validator.cpp
* Updated wasm-interpreter
* First look at updating Passes
* Updated wasm-s-parser
* Updated files in src/ir
* Updating tools files
* Last pass on src files before building
* added visitDataSegment
* Fixing build errors
* Data segments need a name
* fixing var name
* ran clang-format
* Ensuring a name on DataSegment
* Ensuring more datasegments have names
* Adding explicit name support
* Fix fuzzing name
* Outputting data name in wasm binary only if explicit
* Checking temp dataSegments vector to validateBinary because it's the one with the segments before we processNames
* Pass on when data segment names are explicitly set
* Ran auto_update_tests.py and check.py, success all around
* Removed an errant semi-colon and corrected a counter. Everything still passes
* Linting
* Fixing processing memory names after parsed from binary
* Updating the test from the last fix
* Correcting error comment
* Impl kripken@ comments
* Impl tlively@ comments
* Updated tests that remove data print when == 0
* Ran clang format
* Impl tlively@ comments
* Ran clang-format
|
|
|
|
|
|
|
|
| |
247f4c20a1 introduced a bug that caused expressions that refer to data segments
to be associated with the wrong segments in the presence of other segments that
have no referring expressions at all.
Fixes #4569.
Fixes #4571.
|
|
|
|
|
|
|
| |
V8 used to incorrectly parse the segment index on bulk memory instructions as a
signed LEB rather than an unsigned LEB, which meant it only worked correctly
with up to 63 data segments. That bug has been fixed for over two years now, so
lift the maximum number of segments generated by memory packing from 63 to the
specified max, 100k.
|
|
|
|
|
|
|
|
|
|
|
| |
The memory packing pass previously used vectors with a slot for each data
segment to easily map segment indices to lists of "referrers," i.e. bulk memory
expressions that referred to the segments. The parallel analysis pass to collect
these referrers allocated one of these vectors for each function in a module.
Unfortunately, for a particularly large module with over 200k functions and over
75k data segments, this resulted in hundreds of gigabytes of memory allocations.
The vast majority of functions contain no referrers, so using a
std::unordered_map makes much more sense than using a vector to perform this
mapping.
|
|
|
| |
This as a consequence of https://reviews.llvm.org/D95651
|
|
|
|
|
| |
Also, avoid packing builtin llvm segments names so that
segments such as `__llvm_covfun` (use by llvm-cov) are
preserved in the final output.
|
|
|
| |
We can only pack memory if we know it is zero-filled before us.
|
|
|
|
|
|
|
|
|
| |
Such a collision can happen if we run the pass twice, and somehow it finds
more to optimize.
To make this easy, add a general utility for getting a unique name based on
a root + a numeric suffix to avoid collisions.
Fixes the second testcase in #3225
|
|
|
|
|
| |
We may fix this eventually, but it appears to not be urgent. For now at least
show a warning so toolchains have a chance to see there is something
they should fix.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I believe originally wasm did not allow overlapping segments, that is, where
one memory segment tramples the data from a previous one. But then the
spec changed its mind and we allowed it. Binaryen seems to have assumed
the original case, and not checked for trampling.
If there is a chance of trampling, we cannot optimize out zeros - the zero
may have an effect if it tramples data from a previous segment. This does
not occur in practice in LLVM output, which is why this wasn't a problem
so far, I think.
An existing testcase hit this issue, so I split it up.
|
|
|
|
|
|
|
|
|
|
|
| |
This PR fixes a bug in which the segment index of a memory.init instruction was
incorrect in some circumstances. Specifically, the first segment index used in
output memory.init instructions was always the index of the first segment
created from splitting up the corresponding input segment. This was incorrect
when the input memory.init had an offset that caused it to skip over that first
emitted segment so that the first output memory.init should have referred to a
subsequent output segment.
Fixes #3225.
|
|
|
| |
When there are two versions of a function, one handling tuples and the other handling non-tuple values, the previous naming convention was to have "Single" in the name of the non-tuple handling function. This PR simplifies the convention and shortens function names by making the names plural for the tuple-handling version and singular for the non-tuple-handling version.
|
|
|
| |
Also includes a lot of new spec tests that eventually need to go into the spec repo
|
|
|
| |
Aligns the internal representations of `memory.size` and `memory.grow` with other more recent memory instructions by removing the legacy `Host` expression class and adding separate expression classes for `MemorySize` and `MemoryGrow`. Simplifies related APIs, but is also a breaking API change.
|
| |
|
|
|
|
|
|
| |
This involves replacing `Literal::makeZero` with `Literal::makeZeroes`
and `Literal::makeSingleZero` and updating `isConstantExpression` to
handle constant tuples as well. Also makes `Literals` its own struct
and adds convenience methods on it.
|
|
|
|
|
|
| |
Chrome is currently decoding the segment indices as signed numbers, so
some ranges of indices greater than 63 do not work. As a temporary
workaround, limit the number of segments produced by MemoryPacking to
63 when bulk-memory is enabled.
|
|
|
|
|
|
|
|
|
| |
When memory is packed and there are passive segments, bulk memory
operations that reference those segments by index need to be updated to
reflect the new indices and possibly split into multiple instructions
that reference multiple split segments. For some bulk-memory operations,
it is necessary to introduce new globals to explicitly track the drop
state of the original segments, but this PR is careful to only add
globals where necessary.
|
|
|
|
| |
Because `memory.size` returns the size in number of pages, we have to
multiply the size with the page size when converting `memory.init`.
|
|
|
|
|
|
|
|
|
|
|
| |
This does two things:
- Restore `visitDataDrop` handler deleted in #2529, but now we convert
invalid `data.drop`s to not `unreachable` but `nop`. This conforms to
the revised spec that `data.drop` on the active segment can be treated
as a nop.
- Make `visitMemoryInit` trap if offset or size are not equal to 0 or if
the dest address is out of bounds. Otherwise drop all its argument.
Fixes #2535.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implements recent bulk memory spec changes
(WebAssembly/bulk-memory-operations#126) in Binaryen. Now `data.drop` is
equivalent to shrinking a segment size to 0, and dropping already
dropped segments or active segments (which are thought to be dropped in
the beginning) is treated as a no-op. And all bounds checking is
performed in advance, so partial copying/filling/initializing does not
occur.
I tried to implement `visitDataDrop` in the interpreter as
`segment.data.clear();`, which is exactly what the revised spec says. I
didn't end up doing that because this also deletes all contents from
active segments, and there are cases we shouldn't do that:
- `wasm-ctor-eval` shouldn't delete active segments, because it will
store the changed contents back into segments
- When `--fuzz-exec` is given to `wasm-opt`, it runs the module and
compare the execution call results before and after transformations.
But if running a module will nullify all active segments, applying
any transformation to the module or re-running it does not make any
sense.
|
|
|
|
| |
#2242 had exposed the bug that the `Trapper` pass was defining `walkFunction` when it should have been defining `doWalkFunction`.
|
|
|
|
|
| |
(#2244)
This reverts commit 72c52ea7d4eb61b95cf8a5164947cb760fe42e9c, which was causing test failures after it merged.
|
|
|
|
| |
This prevents those instructions from becoming invalid due to memory
packing optimizations and is also a code size win. Fixes #2227.
|
|
|
|
| |
Allow MemoryPacking to run when there are no passive segments, even if
bulk memory is enabled.
|
|
|
| |
Applies the changes in #2065, and temprarily disables the hook since it's too slow to run on a change this large. We should re-enable it in a later commit.
|
|
|
| |
Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
|
|
|
|
|
| |
This allows us to emit a (potentially modified) target features
section and conditionally emit other sections such as the DataCount
section based on the presence of features.
|
|
|
|
|
|
| |
It was previously part of writing a binary, but changing the number of
segments at such a late stage would not work in the presence of bulk
memory's datacount section. Also updates the memory packing pass
to respect the web's limits on the number of data segments.
|
|
|
|
|
| |
Adds support for the bulk memory proposal's passive segments. Uses a
new (data passive ...) s-expression syntax to mark sections as
passive.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a new IR, "Stack IR". This represents wasm at a very low level, as a simple stream of instructions, basically the same as wasm's binary format. This is unlike Binaryen IR which is structured and in a tree format.
This gives some small wins on binary sizes, less than 1% in most cases, usually 0.25-0.50% or so. That's not much by itself, but looking forward this prepares us for multi-value, which we really need an IR like this to be able to optimize well. Also, it's possible there is more we can do already - currently there are just a few stack IR optimizations implemented,
DCE
local2stack - check if a set_local/get_local pair can be removed, which keeps the set's value on the stack, which if the stars align it can be popped instead of the get.
Block removal - remove any blocks with no branches, as they are valid in wasm binary format.
Implementation-wise, the IR is defined in wasm-stack.h. A new StackInst is defined, representing a single instruction. Most are simple reflections of Binaryen IR (an add, a load, etc.), and just pointers to them. Control flow constructs are expanded into multiple instructions, like a block turns into a block begin and end, and we may also emit extra unreachables to handle the fact Binaryen IR has unreachable blocks/ifs/loops but wasm does not. Overall, all the Binaryen IR differences with wasm vanish on the way to stack IR.
Where this IR lives: Each Function now has a unique_ptr to stack IR, that is, a function may have stack IR alongside the main IR. If the stack IR is present, we write it out during binary writing; if not, we do the same binaryen IR => wasm binary process as before (this PR should not affect speed there). This design lets us use normal Passes on stack IR, in particular this PR defines 3 passes:
Generate stack IR
Optimize stack IR (might be worth splitting out into separate passes eventually)
Print stack IR for debugging purposes
Having these as normal passes is convenient as then they can run in parallel across functions and all the other conveniences of our current Pass system. However, a downside of keeping the second IR as an option on Functions, and using normal Passes to operate on it, means that we may get out of sync: if you generate stack IR, then modify binaryen IR, then the stack IR may no longer be valid (for example, maybe you removed locals or modified instructions in place etc.). To avoid that, Passes now define if they modify Binaryen IR or not; if they do, we throw away the stack IR.
Miscellaneous notes:
Just writing Stack IR, then writing to binary - no optimizations - is 20% slower than going directly to binary, which is one reason why we still support direct writing. This does lead to some "fun" C++ template code to make that convenient: there is a single StackWriter class, templated over the "mode", which is either Binaryen2Binary (direct writing), Binaryen2Stack, or Stack2Binary. This avoids a lot of boilerplate as the 3 modes share a lot of code in overlapping ways.
Stack IR does not support source maps / debug info. We just don't use that IR if debug info is present.
A tiny text format comment (if emitting non-minified text) indicates stack IR is present, if it is ((; has Stack IR ;)). This may help with debugging, just in case people forget. There is also a pass to print out the stack IR for debug purposes, as mentioned above.
The sieve binaryen.js test was actually not validating all along - these new opts broke it in a more noticeable manner. Fixed.
Added extra checks in pass-debug mode, to verify that if stack IR should have been thrown out, it was. This should help avoid any confusion with the IR being invalid.
Added a comment about the possible future of stack IR as the main IR, depending on optimization results, following some discussion earlier today.
|