| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
All logging/instrumentation passes need to do this, to avoid us using stale
global effects that are too low (too high is not optimal either, but at least it
cannot cause bugs).
|
|
|
|
| |
This code predates our adoption of C++14 and can now be removed in favor of
`std::make_unique`, which should be more efficient.
|
|
|
|
|
|
|
|
|
|
| |
Previously we treated global.get as a constant expression and only
additionally verified that the target globals were immutable in some cases. But
global.get of a mutable global is never a constant expression, and further,
only imported globals are available in constant expressions unless GC is
enabled.
Fix constant expression validation to only allow global.get of immutable,
imported globals, and fix all the invalid tests.
|
|
|
|
|
| |
The assertion that the offset is zero does not necessarily hold for code that
uses this instruction via the clang builtin. Add support so that Emscripten
wasm2js tests pass in the presence of such code.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the goal of supporting null characters (i.e. zero bytes) in strings.
Rewrite the underlying interned `IString` to store a `std::string_view` rather
than a `const char*`, reduce the number of map lookups necessary to intern a
string, and present a more immutable interface.
Most importantly, replace the `c_str()` method that returned a `const char*`
with a `toString()` method that returns a `std::string`. This new method can
correctly handle strings containing null characters. A `const char*` can still
be had by calling `data()` on the `std::string_view`, although this usage should
be discouraged.
This change is NFC in spirit, although not in practice. It does not intend to
support any particular new functionality, but it is probably now possible to use
strings containing null characters in at least some cases. At least one parser
bug is also incidentally fixed. Follow-on PRs will explicitly support and test
strings containing nulls for particular use cases.
The C API still uses `const char*` to represent strings. As strings containing
nulls become better supported by the rest of Binaryen, this will no longer be
sufficient. Updating the C and JS APIs to use pointer, length pairs is left as
future work.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously only WalkerPasses had access to the `getPassRunner` and
`getPassOptions` methods. Move those methods to `Pass` so all passes can use
them. As a result, the `PassRunner` passed to `Pass::run` and
`Pass::runOnFunction` is no longer necessary, so remove it.
Also update `Pass::create` to return a unique_ptr, which is more efficient than
having it return a raw pointer only to have the `PassRunner` wrap that raw
pointer in a `unique_ptr`.
Delete the unused template `PassRunner::getLast()`, which looks like it was
intended to enable retrieving previous analyses and has been in the code base
since 2015 but is not implemented anywhere.
|
|
|
|
|
|
|
| |
This PR removes the single memory restriction in IR, adding support for a single module to reference multiple memories. To support this change, a new memory name field was added to 13 memory instructions in order to identify the memory for the instruction.
It is a goal of this PR to maintain backwards compatibility with existing text and binary wasm modules, so memory indexes remain optional for memory instructions. Similarly, the JS API makes assumptions about which memory is intended when only one memory is present in the module. Another goal of this PR is that existing tests behavior be unaffected. That said, tests must now explicitly define a memory before invoking memory instructions or exporting a memory, and memory names are now printed for each memory instruction in the text format.
There remain quite a few places where a hardcoded reference to the first memory persist (memory flattening, for example, will return early if more than one memory is present in the module). Many of these call-sites, particularly within passes, will require us to rethink how the optimization works in a multi-memories world. Other call-sites may necessitate more invasive code restructuring to fully convert away from relying on a globally available, single memory pointer.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
With nominal function types, this change makes it so that we preserve the
identity of the function type used with call_indirect instructions rather than
recreating a function heap type, which may or may not be the same as the
originally parsed heap type, from the function signature during module writing.
This will simplify the type system implementation by removing the need to store
a "canonical" nominal heap type for each unique signature. We previously
depended on those canonical types to avoid creating multiple duplicate function
types during module writing, but now we aren't creating any new function types
at all.
|
|
|
|
|
|
|
|
|
| |
When using nominal types, func.ref of two functions with identical signatures
but different HeapTypes will yield different types. To preserve these semantics,
Functions need to track their HeapTypes, not just their Signatures.
This PR replaces the Signature field in Function with a HeapType field and adds
new utility methods to make it almost as simple to update and query the function
HeapType as it was to update and query the Function Signature.
|
|
|
| |
Adds support for modules with multiple tables. Adds a field for the table name to `CallIndirect` and updates the C/JS APIs accordingly.
|
|
|
|
|
|
|
|
|
| |
When Functions, Globals, Events, and Exports are added to a module, if they are
not already in std::unique_ptrs, they are wrapped in a new std::unique_ptr owned
by the Module. This adds an extra layer of indirection when accessing those
elements that can be avoided by allocating those elements as std::unique_ptrs.
This PR updates wasm-builder to allocate module elements via std::make_unique
rather than `new`. In the future, we should remove the raw pointer versions of
Module::add* to encourage using std::unique_ptrs more broadly.
|
|
|
| |
Specifically try to cleanup use of asm_v_wasm.h and asmjs constants.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#3253)
validateGlobally means that we can't do lookups on the module. A few places
were missing that, or had it wrong. I think the reason for the wrong usages is
that we used to have types on the module, and then removed that, so more is
now validatable actually.
This uncovered a real bug, where i64-to-32 would ignore an unreachable
parameter of a call_indirect. That's bad, since if the type is i64, we need
to replace it with two parameters. To fix that, just handle unreachability
there, using the existing logic (which skips the call_indirect entirely in
this case).
|
|
|
| |
Since they make the code clearer and more self-documenting.
|
|
|
| |
This leads to simpler code and is a prerequisite for #3012, which makes it so that not all `Type`s are backed by vectors that `expand` could return.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As a follow-up to https://github.com/WebAssembly/binaryen/pull/3012#pullrequestreview-459686171 this PR prepares for the new compound Signature, Struct and Array types that are single but not basic.
This includes:
* Renames `Type::getSingle` to `Type::getBasic` (NFC). Previously, its name was not representing its implementation (`isSingle` excluded `none` and `unreachable` while `getSingle` didn't, i.e. `getSingle` really was `getBasic`). Note that a hypothetical `Type::getSingle` cannot return `ValueType` anyway (new compound types are single but don't map to `ValueType`), so I figured it's best to skip implementing it until we actually need it.
* Marks locations where we are (still) assuming that all single types are basic types, as suggested in https://github.com/WebAssembly/binaryen/pull/3012#discussion_r465356708, but using a macro, so we get useful errors once we start implementing the new types and can quickly traverse the affected locations.
The macro is added where
* there used to be a `switch (type.getSingle())` or similar that handled any basic type (NFC), but in the future will also have to handle single types that are not basic types.
* we are not dealing with `Unary`, `Binary`, `Load`, `Store` or `AtomicXY` instructions, since these don't deal with compound types anyway.
|
|
|
|
|
|
|
|
| |
This doesn't lower them - it just replaces the unsupported operation
with a drop. This will be useful for fuzzing, where to compare JS to the
correct semantics we must avoid operations where JS is not always
accurate.
Also fully document the i64 -> f32 conversion issue in JS.
|
| |
|
|
|
|
|
| |
Atomic loads, stores, RMW, cmpXchg, wait, and notify. This is enough
to get the asm.js atomics tests in the emscripten test suite to pass, at least
(but they are a subset of the entire pthreads suite).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds a special helper functions for data.drop etc., as unlike most
wasm instructions these are too big to emit inline.
Track passive segments at runtime in var memorySegments
whose indexes are the segment indexes.
Emit var bufferView even if the memory exists even without
memory segments, as we do still need the view in order to
operate on it.
Also adds a few constants for atomics that will be useful in future
PRs (as this PR updates the constant lists anyhow).
|
|
|
|
|
|
|
|
|
|
| |
* Remove implicit conversion operators from Type
Now types must be explicitly converted to uint32_t with Type::getID or
to ValueType with Type::getVT. This fixes #2572 for switches that use
Type::getVT.
* getVT => getSingle
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Function signatures were previously redundantly stored on Function
objects as well as on FunctionType objects. These two signature
representations had to always be kept in sync, which was error-prone
and needlessly complex. This PR takes advantage of the new ability of
Type to represent multiple value types by consolidating function
signatures as a pair of Types (params and results) stored on the
Function object.
Since there are no longer module-global named function types,
significant changes had to be made to the printing and emitting of
function types, as well as their parsing and manipulation in various
passes.
The C and JS APIs and their tests also had to be updated to remove
named function types.
|
|
|
|
|
| |
This works more like llvm's unreachable handler in that is preserves
information even in release builds.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds the ability to create multivalue types from vectors of concrete value
types. All types are transparently interned, so their representation is still a
single uint32_t. Types can be extracted into vectors of their component parts,
and all the single value types expand into vectors containing themselves.
Multivalue types are not yet used in the IR, but their creation and inspection
functionality is exposed and tested in the C and JS APIs.
Also makes common type predicates methods of Type and improves the ergonomics of
type printing.
|
|
|
|
| |
Adds tail call support to fuzzer and makes small changes to handle return calls in multiple utilities and passes. Makes larger changes to DAE and inlining passes to properly handle tail calls.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Reflected new renamed instruction names in code and tests:
- `get_local` -> `local.get`
- `set_local` -> `local.set`
- `tee_local` -> `local.tee`
- `get_global` -> `global.get`
- `set_global` -> `global.set`
- `current_memory` -> `memory.size`
- `grow_memory` -> `memory.grow`
- Removed APIs related to old instruction names in Binaryen.js and added
APIs with new names if they are missing.
- Renamed `typedef SortedVector LocalSet` to `SetsOfLocals` to prevent
name clashes.
- Resolved several TODO renaming items in wasm-binary.h:
- `TableSwitch` -> `BrTable`
- `I32ConvertI64` -> `I32WrapI64`
- `I64STruncI32` -> `I64SExtendI32`
- `I64UTruncI32` -> `I64UExtendI32`
- `F32ConvertF64` -> `F32DemoteI64`
- `F64ConvertF32` -> `F64PromoteF32`
- Renamed `BinaryenGetFeatures` and `BinaryenSetFeatures` to
`BinaryenModuleGetFeatures` and `BinaryenModuleSetFeatures` for
consistency.
|
|
|
| |
Applies the changes in #2065, and temprarily disables the hook since it's too slow to run on a change this large. We should re-enable it in a later commit.
|
|
|
| |
If an i64 load/store that is being broken up has higher alignment, use that.
|
|
|
| |
Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
|
|
|
| |
Also run remove-unused-names which became more noticeably necessary after this change.
|
|
|
| |
Also test in pass-debug mode, for better coverage.
|
|
|
|
|
|
|
| |
This replaces all uses of __tempMemory__, the old scratch space location, with calls to function imports for scratch memory access. This lets us then implement those in a way that does not use the same heap as main memory. This avoids possible bugs with scratch memory overwriting something, or just in general that it has observable side effects, which can confuse fuzzing etc.
The intrinsics are currently implemented in the glue. We could perhaps emit them inline instead (but that might limit asm.js optimizations, so I wanted to keep our options open for now - easy to change later).
Also fixes some places where we used 0 as the scratch space address.
|
| |
|
|
|
| |
Split them into two i32 globals.
|
|
|
| |
Fixes #2007 #2008
|
|
|
|
|
|
|
|
| |
* I64ToI32Lowering - don't assume address 0 is a hardcoded location for scratch memory. Import __tempMemory__ for that.
* RemoveNonJSOps - also use __tempMemory__. Oddly here the address was a hardcoded 1024 (perhaps where the rust program put a static global?).
* Support imported ints in wasm2js, coercing them as needed.
* Add "env" import support in the tests, since now we emit imports from there.
* Make wasm2js tests split out multi-module tests using split_wast which is more robust and avoids emitting multiple outputs in one file (which makes no sense for ES6 modules)
|
|
|
|
|
|
|
| |
Minus multi-memory which we don't support yet.
Improve validator.
Fix some minor validation issues in our tests.
|
| |
|
| |
|
|
|
| |
While debugging to fix the waterfall regressions I noticed that wasm-reduce regressed. We need to be more careful with visitFunction which now may visit an imported function - I found a few not-well-tested passes that also regressed that way.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes #1649
This moves us to a single object for functions, which can be imported or nor, and likewise for globals (as a result, GetGlobals do not need to check if the global is imported or not, etc.). All imported things now inherit from Importable, which has the module and base of the import, and if they are set then it is an import.
For convenient iteration, there are a few helpers like
ModuleUtils::iterDefinedGlobals(wasm, [&](Global* global) {
.. use global ..
});
as often iteration only cares about imported or defined (non-imported) things.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a new IR, "Stack IR". This represents wasm at a very low level, as a simple stream of instructions, basically the same as wasm's binary format. This is unlike Binaryen IR which is structured and in a tree format.
This gives some small wins on binary sizes, less than 1% in most cases, usually 0.25-0.50% or so. That's not much by itself, but looking forward this prepares us for multi-value, which we really need an IR like this to be able to optimize well. Also, it's possible there is more we can do already - currently there are just a few stack IR optimizations implemented,
DCE
local2stack - check if a set_local/get_local pair can be removed, which keeps the set's value on the stack, which if the stars align it can be popped instead of the get.
Block removal - remove any blocks with no branches, as they are valid in wasm binary format.
Implementation-wise, the IR is defined in wasm-stack.h. A new StackInst is defined, representing a single instruction. Most are simple reflections of Binaryen IR (an add, a load, etc.), and just pointers to them. Control flow constructs are expanded into multiple instructions, like a block turns into a block begin and end, and we may also emit extra unreachables to handle the fact Binaryen IR has unreachable blocks/ifs/loops but wasm does not. Overall, all the Binaryen IR differences with wasm vanish on the way to stack IR.
Where this IR lives: Each Function now has a unique_ptr to stack IR, that is, a function may have stack IR alongside the main IR. If the stack IR is present, we write it out during binary writing; if not, we do the same binaryen IR => wasm binary process as before (this PR should not affect speed there). This design lets us use normal Passes on stack IR, in particular this PR defines 3 passes:
Generate stack IR
Optimize stack IR (might be worth splitting out into separate passes eventually)
Print stack IR for debugging purposes
Having these as normal passes is convenient as then they can run in parallel across functions and all the other conveniences of our current Pass system. However, a downside of keeping the second IR as an option on Functions, and using normal Passes to operate on it, means that we may get out of sync: if you generate stack IR, then modify binaryen IR, then the stack IR may no longer be valid (for example, maybe you removed locals or modified instructions in place etc.). To avoid that, Passes now define if they modify Binaryen IR or not; if they do, we throw away the stack IR.
Miscellaneous notes:
Just writing Stack IR, then writing to binary - no optimizations - is 20% slower than going directly to binary, which is one reason why we still support direct writing. This does lead to some "fun" C++ template code to make that convenient: there is a single StackWriter class, templated over the "mode", which is either Binaryen2Binary (direct writing), Binaryen2Stack, or Stack2Binary. This avoids a lot of boilerplate as the 3 modes share a lot of code in overlapping ways.
Stack IR does not support source maps / debug info. We just don't use that IR if debug info is present.
A tiny text format comment (if emitting non-minified text) indicates stack IR is present, if it is ((; has Stack IR ;)). This may help with debugging, just in case people forget. There is also a pass to print out the stack IR for debug purposes, as mentioned above.
The sieve binaryen.js test was actually not validating all along - these new opts broke it in a more noticeable manner. Fixed.
Added extra checks in pass-debug mode, to verify that if stack IR should have been thrown out, it was. This should help avoid any confusion with the IR being invalid.
Added a comment about the possible future of stack IR as the main IR, depending on optimization results, following some discussion earlier today.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Import `abort` from the environment
* Add passing spec tests
* Bind the abort function
* wasm2asm: Fix name collisions
Currently function names and local names can collide in namespaces, causing
buggy results when a function intends to call another function but ends up using
a local value as the target!
This fix was required to enable the `fac` spec test
* wasm2asm: Get multiple modules in one file working
The spec tests seem to have multiple modules defined in some tests and the
invocations all use the most recently defined module. This commit updates the
`--allow-asserts` mode of wasm2asm to work with this mode of tests, enabling us
to enable more spec tests for wasm2asm.
* wasm2asm: Enable the float_literals spec test
This needed to be modified to account for how JS engines don't work with NaN
bits the same way, but it's otherwise largely the same test. Additionally it
turns out that asm.js doesn't accept either `Infinity` or `NaN` ambient globals
so they needed to get imported through the `global` variable rather than defined
as literals in code
* wasm2asm: Fix function pointer invocations
This commit fixes invocations of functions through function pointers as
previously the table names on lookup and definition were mismatched. Both tables
now go through signature-based namification rather than athe name of the type
itself.
Overall this enables a slew of spec tests
* wasm2asm: Enable the left-to-right spec test
There were two small bugs in the order of evaluation of operators with
wasm2asm. The `select` instruction would sometimes evaluate the condition first
when it was supposed to be last. Similarly a `call_indirect` instruction would
evaluate the function pointer first when it was supposed to be evaluated last.
The `select` instruction case was a relatively small fix but the one for
`call_indirect` was a bit more pessimized to generate some temporaries.
Hopefully if this becomes up a problem it can be tightened up.
* wasm2asm: Fix signed load promotions of 64-bit ints
This commit enables the `endianness` spec test which revealed a bug in 64-bit
loads from smaller sizes which were signed. Previously the upper bits of the
64-bit number were all set to zero but the fix was for signed loads to have all
the upper bits match the highest bit of the low 32 bits that we load.
* wasm2asm: Enable the `stack` spec test
Internally the spec test uses a mixture of the s-expression syntax and the wat
syntax, so this is copied over into the `wasm2asm` folder after going through
`wat2wasm` to ensure it's consistent for binaryen.
* wasm2asm: Fix unaligned loads/stores of floats
Replace these operations in `RemoveNonJSOps` by using reinterpretation to
translate floats to integers and then use the existing code for unaligned
loads/stores of integers.
* wasm2asm: Fix a tricky grow_memory codegen bug
This commit fixes a tricky codegen bug found in the `grow_memory` instruction.
Specifically if you stored the result of `grow_memory` immediately into memory
it would look like:
HEAP32[..] = __wasm_grow_memory(..);
Here though it looks like JS evaluates the destination *before* the grow
function is called, but the grow function will invalidate the destination!
Furthermore this is actually generalizable to all function calls:
HEAP32[..] = foo(..);
Because any function could transitively call `grow_memory`. This commit fixes
the issue by ensuring that store instructions are always considered statements,
unconditionally evaluating the value into a temporary and then storing that into
the destination. While a bit of a pessmimization for now it should hopefully fix
the bug here.
* wasm2asm: Handle offsets in tables
This commit fixes initializing tables whose elements have an initial offset.
This should hopefully help fix some more Rust code which has all function
pointers offset by default!
* Update tests
* Tweak * location on types
* Rename entries of NameScope and document fromName
* Comment on lowercase names
* Update compiled JS
* Update js test output expectation
* Rename NameScope::Global to NameScope::Top
* Switch to `enum class`
* Switch to `Fatal()`
* Add TODO for when asm.js is no longer generated
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* wasm2asm: Finish i64 lowering operations
This commit finishes out lowering i64 operations to JS with implementations of
division and remainder for JS. The primary change here is to have these compiled
from Rust to wasm and then have them "linked in" via intrinsics. The
`RemoveNonJSOps` pass has been updated to include some of what
`I64ToI32Lowering` was previously doing, basically replacing some instructions
with calls to intrinsics. The intrinsics are now all tracked in one location.
Hopefully the intrinsics don't need to be regenerated too much, but for
posterity the source currently [lives in a gist][gist], although I suspect that
gist won't continue to compile and work as-is for all of time.
[gist]: https://gist.github.com/alexcrichton/e7ea67bcdd17ce4b6254e66f77165690
|
|
|
|
|
|
|
|
|
| |
This commit lifts the same conversion strategy that `emcc` takes to convert
between floats point numbers and integers, and it should implement all the
various matrices of i32/u32/i64/u64 to f32/f64
Some refactoring was performed in the i64->i32 pass to allow for temporary
variables to get allocated which have types other than i32, but otherwise this
contains a pretty direct translation of `emcc`'s operations to `wasm2asm`.
|
|
|
|
| |
Not much fancy here, but rather each operation is naively lowered inline to the
if/else chain to execute it.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As mentioned in #1458 a naive implementation of these instructions is to round
trip the value through address 0 in linear memory. Also pointed out in #1458
this isn't necessarily valid for all languages. For now, though, languages like
Rust, C, and C++ would likely be horribly broken if valid data could be stored
at low addresses, so this commit goes ahead and adds an implementation of the
reinterpretation instructions by traveling data through address 0. This will
likely need an update if a language comes a long which can validly store data in
the first 8 bytes of linear memory, but it seems like that won't happen in the
near future.
Closes #1458
|
|
|
|
| |
Mostly piggy-back pon the previous 64-bit shift lowering code, just filling in a
few gaps.
|