| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
This was never right for over a decade, and just never used I suppose... it should
have been called "take" since it grabbed data from the other item and then set
that other item to empty. Fix it so it swaps properly.
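To make the distinction concrete, here is a minimal sketch of "take" versus "swap" semantics. `TinyVec` and its members are hypothetical stand-ins for illustration, not the actual class touched by this commit, and storage is deliberately not owned (as with arena allocation, nothing is freed individually).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for a vector-like container; not Binaryen's class.
// The storage is not owned, so dropping a pointer does not leak in this sketch.
struct TinyVec {
  int* data = nullptr;
  std::size_t usedElements = 0;
  std::size_t allocatedElements = 0;

  // "take": grab the other item's storage and leave the other item empty.
  // Our own previous contents are simply dropped.
  void take(TinyVec& other) {
    assert(this != &other && "taking from ourselves would empty us");
    data = other.data;
    usedElements = other.usedElements;
    allocatedElements = other.allocatedElements;
    other.data = nullptr;
    other.usedElements = other.allocatedElements = 0;
  }

  // A proper swap: each side ends up with the other's contents.
  void swap(TinyVec& other) {
    std::swap(data, other.data);
    std::swap(usedElements, other.usedElements);
    std::swap(allocatedElements, other.allocatedElements);
  }
};
```

After `x.swap(y)` both items are still populated, whereas after `x.take(y)` the second item is empty.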
|
| |
Add an IRBuilder utility in a new wasm-ir-builder.h header. IRBuilder is
extremely similar to Builder, except that it manages building full trees of
Binaryen IR from a linear sequence of instructions, whereas Builder only builds
a single IR node at a time. To build full IR trees, IRBuilder maintains an
internal stack of expressions, popping children off the stack and pushing the
new node onto the stack whenever it builds a new node.
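The stack discipline described above can be sketched roughly as follows. This is a toy model only: `Node`, `TinyIRBuilder`, `make`, `build`, and `print` are hypothetical names, and the real IRBuilder operates on Binaryen expressions with a different interface.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature IR node; not a Binaryen Expression.
struct Node {
  std::string op;
  std::vector<std::unique_ptr<Node>> children;
};

class TinyIRBuilder {
  std::vector<std::unique_ptr<Node>> stack;

public:
  // Build a node that consumes `arity` operands: pop them off the stack
  // (restoring their original left-to-right order) and push the new node.
  void make(const std::string& op, std::size_t arity) {
    assert(stack.size() >= arity && "not enough operands on the stack");
    auto node = std::make_unique<Node>();
    node->op = op;
    node->children.resize(arity);
    for (std::size_t i = arity; i > 0; --i) {
      node->children[i - 1] = std::move(stack.back());
      stack.pop_back();
    }
    stack.push_back(std::move(node));
  }

  // Return the finished tree; exactly one expression should remain.
  std::unique_ptr<Node> build() {
    assert(stack.size() == 1 && "expected a single root expression");
    auto root = std::move(stack.back());
    stack.pop_back();
    return root;
  }
};

// Render a tree as an s-expression, for inspection.
std::string print(const Node& n) {
  if (n.children.empty()) {
    return n.op;
  }
  std::string out = "(" + n.op;
  for (const auto& child : n.children) {
    out += " " + print(*child);
  }
  return out + ")";
}
```

Feeding the linear sequence `a`, `b`, `add`, `c`, `mul` (with arities 0, 0, 2, 0, 2) yields the tree `(mul (add a b) c)`.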
In addition to providing makeXYZ functions to allocate, initialize, and finalize
new IR nodes, IRBuilder also provides a visit() method that can be used when the
user has already allocated the IR nodes and only needs to reconstruct the
connections between them. This will be useful in outlining both for constructing
outlined functions and for reconstructing functions around arbitrary outlined
holes.
Besides the new wat parser and outlining, this new utility can also eventually
be used in the binary parser and to convert from Poppy IR back to Binaryen IR if
that ever becomes necessary.
To simplify this initial change, IRBuilder exposes the same interface as the
code it replaces in the wat parser. A future change requiring more extensive
changes to the wat parser will simplify this interface. Also, since the new code
is tested only via the new wat parser, it only supports building instructions
that were already supported by the new wat parser to avoid trying to support any
instructions without corresponding testing. Implementing support for the
remaining instructions is left as future work.
|
|
|
|
|
|
|
|
| |
Replace the different overloads we previously had for different kinds of
containers with generic templates. We still need dedicated overloads for
`std::initializer_list` because it is never inferred in a template context,
though. Also, since `std::initializer_list` does not allow subscripting, update
the arena vector implementation to use iterators instead now that initializer
lists can be passed down to that layer without being reified as vectors.
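A small sketch of why the dedicated overload is still needed, assuming a hypothetical `TinyVec` with a `set` method (the names are illustrative, not the real arena vector API). A braced list has no type of its own, so a template parameter cannot be deduced from it.

```cpp
#include <initializer_list>
#include <list>
#include <vector>

// Hypothetical container; only the overload structure mirrors the commit.
struct TinyVec {
  std::vector<int> items;

  // Generic version: accepts any container exposing begin()/end(). Using
  // iterators rather than operator[] means std::initializer_list works too.
  template<typename Container> void set(const Container& c) {
    items.assign(c.begin(), c.end());
  }

  // A braced list such as {1, 2, 3} cannot be used to deduce `Container`,
  // so this overload is what makes `set({1, 2, 3})` compile. It passes the
  // list down as-is (explicit template argument), with no temporary vector.
  void set(std::initializer_list<int> c) {
    set<std::initializer_list<int>>(c);
  }
};
```

With both overloads, `v.set({1, 2, 3})`, `v.set(std::list<int>{4, 5})`, and `v.set(someVector)` all go through the same iterator-based code path.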
|
| |
To reduce allocator contention, MixedArena transparently forwards allocations to a
thread-local arena by traversing a linked list of arenas, looking for the one
corresponding to the current thread. If the search reaches the end of the linked
list, it allocates a new arena for the current thread and atomically appends it
to the end of the list using a compare-and-swap operation.
The problem was that the previous code used `compare_exchange_weak`, which is
allowed to spuriously fail, i.e. it is allowed to behave as though the original
value is not the same as the expected value, even when it is. In this case, that
means that it might fail to append the new allocation to the list even if the
`next` pointer is still null, which results in a subsequent null pointer
dereference. The fix is to use `compare_exchange_strong`, which is not allowed
to spuriously fail in this way.
Reported in #3806.
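The append logic can be sketched as below. This is a simplified illustration, not the actual MixedArena code: `Arena` and `arenaForThisThread` are hypothetical names, and the nodes are never freed here.

```cpp
#include <atomic>
#include <thread>

// Hypothetical per-thread arena node; the real MixedArena holds more state.
struct Arena {
  std::thread::id threadId = std::this_thread::get_id();
  std::atomic<Arena*> next{nullptr};
};

// Walk the list looking for the calling thread's arena, appending a new one
// at the end if none exists yet.
Arena* arenaForThisThread(Arena& head) {
  const auto myId = std::this_thread::get_id();
  Arena* allocated = nullptr;
  Arena* curr = &head;
  while (curr->threadId != myId) {
    Arena* next = curr->next.load();
    if (!next) {
      if (!allocated) {
        allocated = new Arena(); // tagged with this thread's id
      }
      // The strong variant fails only if `next` is genuinely non-null, in
      // which case `next` is updated to the node another thread appended.
      // With compare_exchange_weak, a spurious failure could leave `next`
      // null, and the `curr = next` below would then lead to a null
      // dereference on the following iteration.
      if (curr->next.compare_exchange_strong(next, allocated)) {
        return allocated;
      }
    }
    curr = next;
  }
  delete allocated; // no-op unless an unused node was allocated
  return curr;
}
```

An alternative would be to keep `compare_exchange_weak` and retry in a loop while the expected value is still null; the strong variant keeps the control flow simpler.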
|
|
|
|
|
|
|
|
| |
Followup to #3486, I wonder if it isn't a little more clear this way,
which avoids the confusion of usedElements being changed
while we are using it.
In general I think it's best to only use usedElements in the most
internal methods, and to call size() otherwise.
|
|
|
|
|
|
|
|
| |
Because `resize()` sets `usedElements` to its argument, we were
accessing `data[usedElements]`, which can be outside of allocated memory
depending on the internal state, i.e., `allocatedElements`'s value.
It is hard to come up with a test case for this because apparently the
failure condition depends on the vector's internal state.
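As an illustration of this class of bug, here is a sketch assuming a hypothetical insert-style method that grows the vector by one and then shifts elements. `TinyVec` and `insertAt` are illustrative and not necessarily the code that was fixed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical simplified vector; not the real arena vector.
struct TinyVec {
  std::unique_ptr<int[]> data;
  std::size_t usedElements = 0;
  std::size_t allocatedElements = 0;

  std::size_t size() const { return usedElements; }

  // Grow the allocation to exactly `n` elements if it is smaller.
  void reserve(std::size_t n) {
    if (n <= allocatedElements) {
      return;
    }
    auto bigger = std::make_unique<int[]>(n);
    for (std::size_t i = 0; i < usedElements; ++i) {
      bigger[i] = data[i];
    }
    data = std::move(bigger);
    allocatedElements = n;
  }

  // Note that resize() sets usedElements to its argument.
  void resize(std::size_t n) {
    reserve(n);
    for (std::size_t i = usedElements; i < n; ++i) {
      data[i] = 0;
    }
    usedElements = n;
  }

  void insertAt(std::size_t index, int item) {
    assert(index <= size());
    const std::size_t oldSize = size(); // capture before resize() changes it
    resize(oldSize + 1);
    // After resize(), data[usedElements] is one past the last valid slot and
    // may lie outside the allocation, so start shifting from oldSize instead.
    for (std::size_t i = oldSize; i > index; --i) {
      data[i] = data[i - 1];
    }
    data[index] = item;
  }
};
```

Whether an out-of-bounds access would actually fault depends on how `allocatedElements` compares to `usedElements` at that moment, which is why such a bug is hard to trigger deterministically in a test.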
|
| |
Renames the following C-API functions
BinaryenBlockGetChild to BinaryenBlockGetChildAt
BinaryenSwitchGetName to BinaryenSwitchGetNameAt
BinaryenCallGetOperand to BinaryenCallGetOperandAt
BinaryenCallIndirectGetOperand to BinaryenCallIndirectGetOperandAt
BinaryenHostGetOperand to BinaryenHostGetOperandAt
BinaryenThrowGetOperand to BinaryenThrowGetOperandAt
BinaryenTupleMakeGetOperand to BinaryenTupleMakeGetOperandAt
Adds the following C-API functions
BinaryenExpressionSetType
BinaryenExpressionFinalize
BinaryenBlockSetName
BinaryenBlockSetChildAt
BinaryenBlockAppendChild
BinaryenBlockInsertChildAt
BinaryenBlockRemoveChildAt
BinaryenIfSetCondition
BinaryenIfSetIfTrue
BinaryenIfSetIfFalse
BinaryenLoopSetName
BinaryenLoopSetBody
BinaryenBreakSetName
BinaryenBreakSetCondition
BinaryenBreakSetValue
BinaryenSwitchSetNameAt
BinaryenSwitchAppendName
BinaryenSwitchInsertNameAt
BinaryenSwitchRemoveNameAt
BinaryenSwitchSetDefaultName
BinaryenSwitchSetCondition
BinaryenSwitchSetValue
BinaryenCallSetTarget
BinaryenCallSetOperandAt
BinaryenCallAppendOperand
BinaryenCallInsertOperandAt
BinaryenCallRemoveOperandAt
BinaryenCallSetReturn
BinaryenCallIndirectSetTarget
BinaryenCallIndirectSetOperandAt
BinaryenCallIndirectAppendOperand
BinaryenCallIndirectInsertOperandAt
BinaryenCallIndirectRemoveOperandAt
BinaryenCallIndirectSetReturn
BinaryenCallIndirectGetParams
BinaryenCallIndirectSetParams
BinaryenCallIndirectGetResults
BinaryenCallIndirectSetResults
BinaryenLocalGetSetIndex
BinaryenLocalSetSetIndex
BinaryenLocalSetSetValue
BinaryenGlobalGetSetName
BinaryenGlobalSetSetName
BinaryenGlobalSetSetValue
BinaryenHostSetOp
BinaryenHostSetNameOperand
BinaryenHostSetOperandAt
BinaryenHostAppendOperand
BinaryenHostInsertOperandAt
BinaryenHostRemoveOperandAt
BinaryenLoadSetAtomic
BinaryenLoadSetSigned
BinaryenLoadSetOffset
BinaryenLoadSetBytes
BinaryenLoadSetAlign
BinaryenLoadSetPtr
BinaryenStoreSetAtomic
BinaryenStoreSetBytes
BinaryenStoreSetOffset
BinaryenStoreSetAlign
BinaryenStoreSetPtr
BinaryenStoreSetValue
BinaryenStoreGetValueType
BinaryenStoreSetValueType
BinaryenConstSetValueI32
BinaryenConstSetValueI64
BinaryenConstSetValueI64Low
BinaryenConstSetValueI64High
BinaryenConstSetValueF32
BinaryenConstSetValueF64
BinaryenConstSetValueV128
BinaryenUnarySetOp
BinaryenUnarySetValue
BinaryenBinarySetOp
BinaryenBinarySetLeft
BinaryenBinarySetRight
BinaryenSelectSetIfTrue
BinaryenSelectSetIfFalse
BinaryenSelectSetCondition
BinaryenDropSetValue
BinaryenReturnSetValue
BinaryenAtomicRMWSetOp
BinaryenAtomicRMWSetBytes
BinaryenAtomicRMWSetOffset
BinaryenAtomicRMWSetPtr
BinaryenAtomicRMWSetValue
BinaryenAtomicCmpxchgSetBytes
BinaryenAtomicCmpxchgSetOffset
BinaryenAtomicCmpxchgSetPtr
BinaryenAtomicCmpxchgSetExpected
BinaryenAtomicCmpxchgSetReplacement
BinaryenAtomicWaitSetPtr
BinaryenAtomicWaitSetExpected
BinaryenAtomicWaitSetTimeout
BinaryenAtomicWaitSetExpectedType
BinaryenAtomicNotifySetPtr
BinaryenAtomicNotifySetNotifyCount
BinaryenAtomicFenceSetOrder
BinaryenSIMDExtractSetOp
BinaryenSIMDExtractSetVec
BinaryenSIMDExtractSetIndex
BinaryenSIMDReplaceSetOp
BinaryenSIMDReplaceSetVec
BinaryenSIMDReplaceSetIndex
BinaryenSIMDReplaceSetValue
BinaryenSIMDShuffleSetLeft
BinaryenSIMDShuffleSetRight
BinaryenSIMDShuffleSetMask
BinaryenSIMDTernarySetOp
BinaryenSIMDTernarySetA
BinaryenSIMDTernarySetB
BinaryenSIMDTernarySetC
BinaryenSIMDShiftSetOp
BinaryenSIMDShiftSetVec
BinaryenSIMDShiftSetShift
BinaryenSIMDLoadSetOp
BinaryenSIMDLoadSetOffset
BinaryenSIMDLoadSetAlign
BinaryenSIMDLoadSetPtr
BinaryenMemoryInitSetSegment
BinaryenMemoryInitSetDest
BinaryenMemoryInitSetOffset
BinaryenMemoryInitSetSize
BinaryenDataDropSetSegment
BinaryenMemoryCopySetDest
BinaryenMemoryCopySetSource
BinaryenMemoryCopySetSize
BinaryenMemoryFillSetDest
BinaryenMemoryFillSetValue
BinaryenMemoryFillSetSize
BinaryenRefIsNullSetValue
BinaryenRefFuncSetFunc
BinaryenTrySetBody
BinaryenTrySetCatchBody
BinaryenThrowSetEvent
BinaryenThrowSetOperandAt
BinaryenThrowAppendOperand
BinaryenThrowInsertOperandAt
BinaryenThrowRemoveOperandAt
BinaryenRethrowSetExnref
BinaryenBrOnExnSetEvent
BinaryenBrOnExnSetName
BinaryenBrOnExnSetExnref
BinaryenTupleMakeSetOperandAt
BinaryenTupleMakeAppendOperand
BinaryenTupleMakeInsertOperandAt
BinaryenTupleMakeRemoveOperandAt
BinaryenTupleExtractSetTuple
BinaryenTupleExtractSetIndex
BinaryenFunctionSetBody
Also introduces wrappers to the JS-API resembling the classes in C++
to perform the above operations on an expression. For example:
var unary = binaryen.Unary(module.i32.eqz(1));
unary.getOp(...) / .op
unary.setOp(...) / .op = ...
unary.getValue(...) / .value
unary.setValue(...) / .value = ...
unary.getType(...) / .type
unary.finalize()
...
Usage of wrappers is optional, and one can also use plain functions:
var unary = module.i32.eqz(1);
binaryen.Unary.getOp(unary, ...)
...
Also adds comments to all affected functions in case we'd like to generate
API documentation at some point.
|
|
|
| |
Applies the changes in #2065, and temporarily disables the hook since it's too slow to run on a change this large. We should re-enable it in a later commit.
|
|
|
| |
Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
|
|
|
|
| |
(Legacy)RandomAccessIterator concept (#1962)
|
| |
The error in #1845 shows:
/<<PKGBUILDDIR>>/src/mixed_arena.h: In member function 'void* MixedArena::allocSpace(size_t, size_t)':
/<<PKGBUILDDIR>>/src/mixed_arena.h:125:43: error: 'new' of type 'MixedArena::Chunk' {aka 'std::aligned_storage<32768, 16>::type'} with extended alignment 16 [-Werror=aligned-new=]
chunks.push_back(new Chunk[numChunks]);
^
/<<PKGBUILDDIR>>/src/mixed_arena.h:125:43: note: uses 'void* operator new [](std::size_t)', which does not have an alignment parameter
/<<PKGBUILDDIR>>/src/mixed_arena.h:125:43: note: use '-faligned-new' to enable C++17 over-aligned new support
It turns out I had misread the aligned_storage docs, and they don't actually do what we need, which is a convenient cross-platform way to do aligned allocation, since new itself doesn't support that. Sadly it seems there is no cross-platform way to do it right now, so I added a header in support which abstracts over the Windows and everything-else ways.
Also add some ctest testing, which runs on Windows, so we get basic Windows coverage in our CI.
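A rough sketch of what such an abstraction can look like. The function names `alignedMalloc`, `alignedFree`, and `isAligned` are hypothetical and the actual header may differ; the sketch only assumes the platform calls `_aligned_malloc`/`_aligned_free` on Windows and `posix_memalign`/`free` elsewhere.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdlib.h>
#if defined(_WIN32)
#include <malloc.h>
#endif

// Allocate `size` bytes aligned to `align`, which must be a power of two
// (and, for posix_memalign, a multiple of sizeof(void*)).
// Returns nullptr on failure.
inline void* alignedMalloc(std::size_t align, std::size_t size) {
#if defined(_WIN32)
  return _aligned_malloc(size, align);
#else
  void* ptr = nullptr;
  if (posix_memalign(&ptr, align, size) != 0) {
    return nullptr;
  }
  return ptr;
#endif
}

// Free memory obtained from alignedMalloc. The two platforms need different
// calls, which is why allocation and release are wrapped together.
inline void alignedFree(void* ptr) {
#if defined(_WIN32)
  _aligned_free(ptr);
#else
  free(ptr);
#endif
}

// Helper for checking the alignment of a pointer.
inline bool isAligned(const void* ptr, std::size_t align) {
  return reinterpret_cast<std::uintptr_t>(ptr) % align == 0;
}
```

A chunk allocation in the style of the failing line would then go through `alignedMalloc(16, 32768)` rather than `new` on an over-aligned type.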
|
| |
|
|
|
| |
Necessary for simd, as we add a type with alignment >8. We were just broken on that before this PR.
|
| |
This adds a pass to remove unnecessary call arguments in an LTO-like manner, that is:
* If a parameter is not actually used in a function, we don't need to send anything, and can remove it from the function's declaration. Concretely,
(func $a (param $x i32)
..no uses of $x..
)
(func $b
(call $a (..))
)
=>
(func $a
..no uses of $x..
)
(func $b
(call $a)
)
And
* If a parameter is only ever sent the same constant value, we can just set that constant value in the function (which then means that the values sent from the outside are no longer used, as in the previous point). Concretely,
(func $a (param $x i32)
..may use $x..
)
(func $b
(call $a (i32.const 1))
(call $a (i32.const 1))
)
=>
(func $a
(local $x i32)
(set_local $x (i32.const 1))
..may use $x..
)
(func $b
(call $a)
(call $a)
)
How much this helps depends on the codebase obviously, but sometimes it is pretty useful. For example, it shrinks 0.72% on Unity and 0.37% on Mono. Note that those numbers include not just the optimization itself, but the other optimizations it then enables - in particular the second point from earlier leads to inlining a constant value, which often allows constant propagation, and also removing parameters may enable more duplicate function elimination, etc. - which explains how this can shrink Unity by almost 1%.
Implementation is pretty straightforward, but there is some work to make the heavy part of the pass parallel, and a bunch of corner cases to avoid (can't change a function that is exported or in the table, etc.). Like the Inlining pass, there is both a standard and an "optimizing" version of this pass - the latter also optimizes the functions it changes, as like Inlining, it's useful to not need to re-run all function optimizations on the whole module.
|
|
|
|
|
|
|
|
|
|
| |
Inspired by #1501
* remove unneeded appearances of the default switch target (at the front or back of the list of targets)
* optimize a switch with 0, 1 or 2 targets into an if or if-chain
* optimize a br_if br pair when they have the same target
Makes e.g. fastcomp libc++ 2% smaller. Noticeable improvements on other things like box2d etc.
|
| |
* Get wasm2asm building again
Updates CMakeLists.txt to have wasm2asm built by default, updates
wasm2asm.h to account for recent interface changes, and restores
JSPrinter functionality.
* Implement splice for array values
* Clean up wasm2asm testing
* Print semicolons after statements in blocks
* Cleanups and semicolons for condition arms
* Prettify semicolon emission
|
|
| |
Adds a pass that folds code, i.e. merges it when possible. See details in comment in the pass implementation cpp.
This is enabled by default in -Os and -Oz. Seems risky to enable anywhere else, as it does add branches - likely predictable ones so maybe no slowdown, but still some risk.
Code size numbers:
wasm-backend: 196331
+ binaryen -Os (before): 182598
+ binaryen -Os (with folding): 181943
asm2wasm -Os (before): 172463
asm2wasm -Os (with folding): 168774
So this reduces wasm-backend output by an additional 0.5% compared to before. The gain is modest mainly because the wasm backend already has code folding, whereas on asm2wasm output, where we didn't have folding before, this saves over 2%. The 0.5% improvement on the wasm backend's output might be because this can fold more types of code than LLVM can (it can fold nested control flow, in particular).
|
|
|
|
|
|
|
|
|
|
|
|
| |
* parse file/line comments in asm.js into debug intrinsics
* convert debug intrinsics into annotations, and print them
* ignore --debuginfo if not emitting text, as wasm binaries don't support that yet
* emit full debug info when -g and emitting text; when -g and emitting binary, all we can do is the Names section
* update wasm.js
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Under emscripten, C code can take the address of a function implemented
in JavaScript (which is exposed via an import in wasm). Because imports
do not have a linear memory address in wasm, we need to generate a thunk
to be the target of the indirect call; it calls the import directly.
This is facilitated by a new .s directive (.functype) which declares the
types of functions which are declared but not defined.
Fixes https://github.com/WebAssembly/binaryen/issues/392
|
| |
Still making things nicer for #370
Pulling wasm-linker into its own file also necessitated pulling asm_v_wasm.h into a cpp file. It goes into a new lib directory, src/asmjs.
No actual code changes in this PR.
|
|
|
|
| |
* allow traversals to mark themselves as function-parallel, in which case we run them using a thread pool. also mark some thread-safety risks (interned strings, arena allocators) with assertions that they are modified only on the main thread
|
| |
|
|
|
|
| |
This applies Apache 2.0 properly (as far as our lawyers have told me). We can do this early since all of the code was written by Alon Zakai.
|
| |
|
| |
|
| |
|
|
|