| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
(#1356)
|
|
|
| |
We can remove the memory/table (itself, or an import if imported) if they are not used. This is pretty minor on a large wasm file, but when reading small wasts it's very noticeable to have an unused memory and table all the time.
|
|
|
|
|
| |
Instead merge constant-offset segments if we must in order to stay under the limit.
If we can't - too many non-constant-offset segments - then issue a warning.
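A rough sketch of the merging idea, not the actual implementation: segments are modeled as hypothetical `(offset, data)` pairs, with `offset` set to `None` when it is not a constant, and `limit` is an assumed parameter. For simplicity it merges all constant-offset segments at once and zero-fills the gaps, which assumes they do not overlap the non-constant ones.

```python
import warnings
from typing import List, Optional, Tuple

# Hypothetical model: (offset, data); offset is None when it is not a constant.
Segment = Tuple[Optional[int], bytes]


def merge_segments(segments: List[Segment], limit: int) -> List[Segment]:
    """Merge the constant-offset segments into one if we are over the limit."""
    if len(segments) <= limit:
        return segments  # nothing to do, already under the limit

    constant = [(off, data) for off, data in segments if off is not None]
    dynamic = [seg for seg in segments if seg[0] is None]

    merged: List[Segment] = []
    if constant:
        start = min(off for off, _ in constant)
        end = max(off + len(data) for off, data in constant)
        buf = bytearray(end - start)  # gaps between segments are zero-filled
        for off, data in constant:  # later segments overwrite earlier ones
            buf[off - start:off - start + len(data)] = data
        merged = [(start, bytes(buf))]

    result = merged + dynamic
    if len(result) > limit:
        # Non-constant-offset segments cannot be merged, so we can only warn.
        warnings.warn("too many non-constant-offset segments")
    return result
```

For example, `merge_segments([(0, b"ab"), (4, b"cd"), (None, b"x")], 2)` combines the two constant-offset segments into a single six-byte segment at offset 0 and keeps the non-constant one.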
|
|
|
|
| |
This optimizes #1343. It looks for stores of a value that is already present in the local, which in particular can remove the initial set to 0 of loops starting at zero, since all locals are initialized to that already. This helps in real-world code, but is not super-common since coalescing means we tend to have assigned something else to it anyhow before we need it to be zero, so this mainly helps in small functions (and running this before coalescing would extend live ranges in potentially bad ways).
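The core check can be sketched on straight-line code as follows. This is an illustration only: the `("set", index, value)` tuples are a made-up representation, `value` is `None` when the stored value is not a known constant, and every local is assumed to be a zero-initialized var (parameters would need to start as unknown).

```python
def remove_redundant_sets(instrs, num_locals):
    """Drop sets that store a constant the local is already known to hold."""
    known = [0] * num_locals  # assumed: all locals start out as zero
    out = []
    for op, index, value in instrs:
        if op == "set":
            if value is not None and known[index] == value:
                continue  # the local already holds this value, skip the store
            known[index] = value  # None marks the contents as unknown
        out.append((op, index, value))
    return out


example = [
    ("set", 0, 0),     # redundant: local 0 is already zero
    ("set", 1, 5),
    ("set", 1, 5),     # redundant: local 1 already holds 5
    ("set", 0, None),  # unknown value, must be kept
    ("set", 0, 0),     # kept: local 0 is no longer known to be zero
]
optimized = remove_redundant_sets(example, num_locals=2)
```

A real pass would also have to handle control flow, which this sketch ignores.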
|
|
|
|
| |
It was returning the top of the allocated space rather than the bottom.
Fix taken from @tbfleming in kripken/emscripten#5974
|
|
|
|
|
|
|
| |
* add get_global/set_global validation
* validate get_local index
* update builds
* fix tests
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* binaryen.js and wasm.js don't need filesystem support
* newest emscripten no longer uses Runtime.*
* build fixes for binaryen.js and wasm.js; also move binaryen.js to use standard emscripten MODULARIZE
* run binaryen.js in all possible engines; update js builds
* don't emit debug build to a different name, just emit binaryen.js. makes testing easier and safer
* remove volatile things from binaryen.js info printing in tests
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is an experiment to help with Boehm-style GC. It will spill things that could be pointers to the C stack, so that they can be seen by conservative garbage collection.
The spills add code size and runtime overhead, but actually less than I thought: 10% slower (smaller than the difference between VMs), 15% gzip size larger. We can do even better with more optimizations for this, like a dead store elimination pass.
This PR does the following:
* Add the new pass.
* Create an abi/ dir, with info about the pointer size and stack manipulation utilities.
* Separate out the liveness analysis from CoalesceLocals, so that other passes can use it (like SpillPointers).
* Refactor out the SortedVector class from the liveness analysis to a separate file (just seems nicer that way).
|
| |
|
|
|
|
|
|
|
|
|
| |
This optimizes the situation described in #1331. Namely, when x is copied into y, then on subsequent gets of x we could use y instead, and vice versa, as their value is equal. Specifically, this seems to get rid of the definite overlap in the live ranges of x and y, as removing it allows coalesce-locals to merge them. The pass therefore does nothing if the live range of y ends there anyhow.
The danger here is that we may extend the live range so that it causes more conflicts with other things, so this is a heuristic, but I've tested it on every codebase I can find and it always produces a net win. On one I even saw a 0.4% reduction in code size, which surprised me.
This is a fairly slow pass, because it uses LocalGraph which isn't much optimized. This PR includes a minor optimization for it, but we should rewrite it. Meanwhile this is just enabled in -O3 and -Oz.
This PR also includes some fuzzing improvements, to better test stuff like this.
|
|
|
|
|
|
| |
* Check if there is a currFunction before using it (we need it for some stacky code; a valid wasm wouldn't need a function in that location anyhow, as what can be put in a memory/table offset is very limited).
* Huge alignment led us to do a power of 2 shift that is undefined behavior.
Also adds a test facility to check we don't crash on testcases.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
* ignore missing imports (the wasm may have already had them optimized out)
* handle segments that hold on to globals (root them, for now, as we can't remove segments)
* run reorder-functions, as the optimal order may have changed after we dce
* fix global, global init, and segment offset reachability
* fix import rooting and processing - imports may be imported more than once
|
|
|
|
| |
* binaryen.js improvements: block default value is none, not undefined, and add text-format style aliases for things like getLocal (so you can write get_local as in the text format)
|
|
|
|
|
|
|
| |
This adds a new tool for better dead code elimination. The problem this helps overcome is when the wasm module is part of something larger, like a wasm+JS combination, and therefore doing DCE in either one is not sufficient as it can't remove a cycle spanning the wasm and JS worlds. Concretely, when binaryen performs DCE by itself, it can never remove an export, because it considers those roots - but in the larger ("meta") space outside, they may actually be removable.
To solve that, this tool receives a description of the outside graph (in very abstract form), including which nodes are roots. It then adds to that graph nodes from the wasm, so that we have a single graph representing the entire space (the outside + wasm + connections between them). It then performs DCE, finding what is not reachable from the roots, and cleaning it up from the wasm. It of course can't clean up things from the outside, since all it has is the abstract representation of those things in the graph, but it prints out the ids of the removable nodes, which an outside tool can use.
This tool is written in as general a way as possible, so hopefully it can have multiple uses. The use I have in mind is to write something in emscripten that uses this to DCE the JS+wasm combination that we emit.
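The reachability step on the combined graph might look roughly like this. The node names and the dictionary format are invented for illustration and are not the tool's actual input format.

```python
def find_removable(graph, roots):
    """Return the ids of nodes that are not reachable from any root."""
    reachable = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node in reachable:
            continue
        reachable.add(node)
        stack.extend(graph.get(node, []))
    return sorted(set(graph) - reachable)


# Hypothetical combined graph: outside (JS) nodes plus nodes added from the wasm.
graph = {
    "js_main": ["export_foo"],    # the only root
    "export_foo": ["func_foo"],
    "func_foo": [],
    "js_unused": ["export_bar"],  # start of a cycle spanning JS and wasm
    "export_bar": ["func_bar"],
    "func_bar": ["js_unused"],    # wasm calls back into JS through an import
}
removable = find_removable(graph, roots=["js_main"])
```

Here the cycle through `js_unused`, `export_bar` and `func_bar` is reported as removable, even though DCE on the wasm alone would have to treat `export_bar` as a root.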
|
|
|
| |
* also fixes optimizing them in Precompute
|
|
|
|
| |
* support debug info without a filename in asm2wasm input (which can happen if llvm doesn't know the file, only the line)
|
|
|
|
| |
fixes for multiple segments, which we never really printed that prettily (#1316)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implements #1309: subsequent br_ifs that compare the same value to various constants are converted into a br_table in a block,
(br_if $x (i32.eq (get_local $a) (i32.const 0)))
(br_if $y (i32.eq (get_local $a) (i32.const 1)))
(br_if $z (i32.eq (get_local $a) (i32.const 2)))
==>
(block $tablify
(br_table $x $y $z $tablify
(get_local $a)
)
)
The constants for when to apply this (e.g., not if the range of values would make a huge jump table) are fairly conservative, I think, but hard to tell. Probably should be tweaked based on our experience with the pass in practice later on.
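As a sketch of how the jump table could be built from such a run of br_ifs: the `(label, constant)` pairs, the `max_range` threshold and its value are assumptions made for illustration, not the constants the pass uses.

```python
def tablify(branches, default="$tablify", max_range=64):
    """Build br_table targets from (label, constant) pairs.

    Returns (lowest constant, list of targets), or None if not worthwhile.
    """
    if len(branches) < 2:
        return None
    lo = min(const for _, const in branches)
    hi = max(const for _, const in branches)
    if hi - lo + 1 > max_range:
        return None  # the range of values would make a huge jump table
    # Values that no br_if tests for fall through to the enclosing block.
    table = [default] * (hi - lo + 1)
    # Walk backwards so the earliest br_if wins if a constant repeats.
    for label, const in reversed(branches):
        table[const - lo] = label
    return lo, table


result = tablify([("$x", 0), ("$y", 1), ("$z", 2)])
```

When the lowest constant is not zero, the index given to the br_table would need that value subtracted from it first.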
|
| |
|
|
|
|
|
| |
(eqz X) and (eqz Y) === eqz (X or Y)
Normally De Morgan's laws apply only to boolean vars, but for the and (but not or or xor) version, it works in all cases (both sides are true iff X and Y have all zero bits).
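A quick brute-force check of the identity over a handful of 32-bit values (the sample values are arbitrary), together with a search for a counterexample to the or version:

```python
def eqz(x):
    """Model of wasm eqz: 1 if the operand is zero, otherwise 0."""
    return 1 if x == 0 else 0


values = [0, 1, 2, 3, 0x80000000, 0xFFFFFFFF]

# (eqz X) and (eqz Y) === eqz (X or Y) holds for arbitrary integers.
and_holds = all(
    (eqz(x) & eqz(y)) == eqz(x | y) for x in values for y in values
)

# The or version fails for non-boolean inputs, e.g. when X and Y share no bits.
or_counterexample = next(
    (x, y)
    for x in values
    for y in values
    if (eqz(x) | eqz(y)) != eqz(x & y)
)
```

With X = 1 and Y = 2, neither value is zero, so `(eqz X) or (eqz Y)` is 0, while `X and Y` is 0 and its eqz is 1.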
|
|
|
|
| |
input values (#1303)
|
|
|
|
| |
then since the parent blocks do not have such values, we can finalize them with their type as a concrete type should not vanish (#1302)
|
|
|
| |
Currently throws if omitted, see AssemblyScript/binaryen.js#2
|
|
|
|
|
|
| |
* remove unneeded code to handle a br to the return from the function. Now that we use getBlockOrSingleton there, it does that for us anyhow
* fix a fuzz bug of popping from outside a block
|
|
|
|
| |
* Provide AddImport/AddExport for each element in the C-API
|
|
|
|
| |
* remove unneeded code to handle a br to the return from the function. Now that we use getBlockOrSingleton there, it does that for us anyhow
|
| |
|
|
|
|
| |
* fix wasm-reduce when out-of-tree: do not use a hardcoded bin/wasm-opt, instead add a Path namespace with utilities to get the proper path, and use BINARYEN_ROOT which our test setup code ensures
|
|
|
| |
* Also other function utilities in C and JS APIs
|
| |
|
|
|
|
| |
* fix a code-folding bug where when merging function-level tails, we moved code out of where it could reach a break target - we must not move code if it has a break target not enclosed in itself. the EffectAnalyzer already had the functionality for that, move the code around a little there to make that clearer too
|
|
|
|
|
| |
* flatten tee_local in flatten, as it leads to more-optimizable code (tee_local, when nested, can introduce side effects in bad places).
* also fix some test stuff from recent merges
|
|
|
|
| |
* fix if copying - we should preserve the forced explicit type if there is one, and not just infer it from the arms. this adds a builder method for makeIf that receives a type to apply to the if, and for blocks a method that makes a block from a list, also with a variant with a provided type
|
| |
|
| |
|
|
|
|
|
|
| |
* add i64_atomics_* support to asm2wasm
* OptimizeInstructions: atomic loads can't be signed
|
| |
|
| |
|
|
|
|
| |
Function type gets its own element rather than being a part of the call_indirect
(see WebAssembly/spec#599)
|
|
|
| |
Now also includes a test.
|
| |
|
|
|
|
| |
* add fuzz-pass option, which picks random passes to fuzz in each wasm-opt invocation
|
| |
|
|
|
|
|
|
| |
* Fixed use of undefined 'types' array in BinaryenAddGlobal tracing
* also fix use of 'expressions'
|
|
|
|
| |
* fix relooper bug, ensure function body has right type, as relooper output does not flow stuff out, but wasm functions with a result do expect a flow value, so none is not an option. in other words, as the docs say, a relooper block must end with a terminator (return, unreachable, break, etc.) and not flow out.
|
|
|
|
|
|
| |
* Added BinaryenAtomicRMW incl. ops to binaryen-c
* AtomicCmpxchg, AtomicWait, AtomicWake
|
|
|
|
| |
flags (#1277)
|
|
|
|
| |
binaryen-c (#1270)
|