path: root/test
Commit message Author Age Files Lines
...
* -O4: When -O3 isn't enough (#1596)Alon Zakai2018-06-082-0/+36
| | | | | | | | | This defines a new -O4 optimization mode, as flatten + flat-only opts (currently local-cse) + -O3. In practice, flattening is not needed for LLVM output, which is pretty flat already (no block or if values, etc., even if it does use tees and does nest expressions; and LLVM has already done gvn etc. anyhow). In general, though, wasm generated by a non-LLVM compiler may naturally be nested because wasm allows that. See for example #1593 where an AssemblyScript testcase requires flattening to be fully optimized. So -O4 can help there. -O4 takes 3x longer to run than -O3 in my testing, basically because flat IR is much bigger. But when it's useful it may be worth it. It does handle that AssemblyScript testcase and others like it. There's not much big real-world code that isn't LLVM yet, but running the fuzzer - which happily creates nested stuff all the time - I see -O4 consistently shrink the size by around 20% over -O3.
* Improve local-cse (#1594)Alon Zakai2018-06-0811-249/+1103
| | | | | This makes it much more effective, by rewriting it to depend on flatten. In flattened IR, it is very simple to check if an expression is equivalent to one already available for use in a local, and use that one instead; basically we just track values in locals. Helps with #1521
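
The idea of tracking values in locals over flat IR can be sketched as follows. This is a minimal illustration, not Binaryen's implementation: the tuple-based instruction format and the name `local_cse` are assumptions made up for the example.

```python
def local_cse(instrs):
    """Hypothetical flat IR: a list of (dest_local, expr) pairs, where expr is
    ('add', 'a', 'b'), ('const', 3) or ('get', 'x'); operands are local names."""
    available = {}  # expr -> a local that currently holds its value
    out = []
    for dest, expr in instrs:
        op, *operands = expr
        holder = available.get(expr)
        # Reuse the local that already holds this value, if there is one.
        out.append((dest, ("get", holder) if holder is not None else expr))
        # Writing `dest` invalidates values stored in it or computed from it.
        available = {e: l for e, l in available.items()
                     if l != dest and dest not in e[1:]}
        # Only record non-trivial expressions that do not read their own dest.
        if op not in ("get", "const") and dest not in operands:
            available[expr] = dest
    return out


flat = [
    ("x", ("add", "a", "b")),
    ("y", ("add", "a", "b")),   # same value as x: becomes a get of x
    ("a", ("const", 1)),        # overwrites an operand, so the cache is dropped
    ("z", ("add", "a", "b")),   # must be recomputed
]
optimized = local_cse(flat)
```

Because every operand is already a local in flat IR, equivalence reduces to a dictionary lookup on the expression tuple.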
* wasm-opt source map support (#1557)Alon Zakai2018-06-071-0/+192
| | | | | | | | | | * support source map input in wasm-opt, refactoring the loading code into wasm-io * use wasm-io in wasm-as * support output source maps in wasm-opt * add a test for wasm-opt and source maps
* duplicate-function-elimination improvements (#1590)Alon Zakai2018-06-075-6/+2295
| | | | | | | On a codebase with 370K functions, 160K were in fact duplicate (!)... and it took many many passes to figure that out, over 2 minutes in fact (!), as A and B may be identical only after we see that the functions C1, C2 that they call are identical (so there can be long "chains" here). To avoid this, limit how many passes we do. In -O1, just do one pass - that gets most duplicates. In -O2, do 10 passes - that gets almost all of it on this codebase. And in -O3 (or -Os/-Oz) do as many passes as necessary (i.e., the old behavior). This at least lets iteration builds (-O1) be nice and fast. This PR also refactors the hashing code used in that pass, moving it to nicer header files for clearer readability. Also some other minor cleanups in hashing code that helped debug this.
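
The "chain" effect and the pass limit can be sketched like this. It is a toy model under stated assumptions: function bodies are tuples, calls are `('call', name)` tokens, and `eliminate_duplicates` is a hypothetical name; the real pass hashes Binaryen IR.

```python
def eliminate_duplicates(funcs, max_passes=None):
    """funcs: dict of name -> body tuple. Returns (remaining funcs, passes run).
    max_passes=None means "as many passes as necessary"."""
    funcs = dict(funcs)
    passes = 0
    while max_passes is None or passes < max_passes:
        passes += 1
        seen, merged = {}, {}
        for name in sorted(funcs):
            body = funcs[name]
            if body in seen:
                merged[name] = seen[body]   # duplicate of an earlier function
            else:
                seen[body] = name
        if not merged:
            break                           # nothing left to merge
        for name in merged:
            del funcs[name]
        # Redirect calls to removed functions; this may expose new duplicates.
        funcs = {
            n: tuple(("call", merged.get(t[1], t[1]))
                     if isinstance(t, tuple) and t[0] == "call" else t
                     for t in b)
            for n, b in funcs.items()
        }
    return funcs, passes


module = {
    "A": (("call", "C1"),),
    "B": (("call", "C2"),),
    "C1": ("nop",),
    "C2": ("nop",),
}
one_pass, _ = eliminate_duplicates(module, max_passes=1)   # like -O1
full, passes_needed = eliminate_duplicates(module)         # like -O3
```

Here A and B only become identical after C1 and C2 are merged, so a single pass removes C2 but keeps both A and B.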
* Increase flake8 coverage (#1586)Sam Clegg2018-06-051-8/+8
|
* run precompute-propagate early, when we would run it also late, as it is ↵Alon Zakai2018-06-042-258/+4
| | | | helpful in both positions on general code (#1581)
* Always incorporate the table segment offset when calculating ↵Jacob Gravelle2018-06-011-1/+1
| | | | jsCallStartIndex (#1579)
* Optimize validation of many nested blocks (#1576)Alon Zakai2018-05-302-26/+0
| | | | | | | On the testcase from https://github.com/tweag/asterius/issues/19#issuecomment-393052653 this makes us almost 3x faster, and use 25% less memory. The main improvement here is to simplify and optimize the data structures the validator uses to validate br targets: use unordered maps, and use one less of them. Also some speedups from using that map more effectively (use of iterators to avoid multiple lookups). Also move the duplicate-node checks to the internal IR validation section, which makes more sense anyhow (it's not wasm validation, it's internal IR validation, which like the check for stale internal types, we do only if debugging).
* wasm2asm: Fix and enable a large number of spec tests (#1558)Alex Crichton2018-05-2960-683/+61411
| | | | * Import `abort` from the environment * Add passing spec tests * Bind the abort function * wasm2asm: Fix name collisions Currently function names and local names can collide in namespaces, causing buggy results when a function intends to call another function but ends up using a local value as the target! This fix was required to enable the `fac` spec test * wasm2asm: Get multiple modules in one file working The spec tests seem to have multiple modules defined in some tests and the invocations all use the most recently defined module. This commit updates the `--allow-asserts` mode of wasm2asm to work with this mode of tests, enabling us to enable more spec tests for wasm2asm. * wasm2asm: Enable the float_literals spec test This needed to be modified to account for how JS engines don't work with NaN bits the same way, but it's otherwise largely the same test. Additionally it turns out that asm.js doesn't accept either `Infinity` or `NaN` ambient globals so they needed to get imported through the `global` variable rather than defined as literals in code * wasm2asm: Fix function pointer invocations This commit fixes invocations of functions through function pointers as previously the table names on lookup and definition were mismatched. Both tables now go through signature-based namification rather than the name of the type itself. Overall this enables a slew of spec tests * wasm2asm: Enable the left-to-right spec test There were two small bugs in the order of evaluation of operators with wasm2asm. The `select` instruction would sometimes evaluate the condition first when it was supposed to be last. Similarly a `call_indirect` instruction would evaluate the function pointer first when it was supposed to be evaluated last. 
The `select` instruction case was a relatively small fix but the one for `call_indirect` was a bit more pessimized to generate some temporaries. Hopefully if this becomes a problem it can be tightened up. * wasm2asm: Fix signed load promotions of 64-bit ints This commit enables the `endianness` spec test which revealed a bug in 64-bit loads from smaller sizes which were signed. Previously the upper bits of the 64-bit number were all set to zero but the fix was for signed loads to have all the upper bits match the highest bit of the low 32 bits that we load. * wasm2asm: Enable the `stack` spec test Internally the spec test uses a mixture of the s-expression syntax and the wat syntax, so this is copied over into the `wasm2asm` folder after going through `wat2wasm` to ensure it's consistent for binaryen. * wasm2asm: Fix unaligned loads/stores of floats Replace these operations in `RemoveNonJSOps` by using reinterpretation to translate floats to integers and then use the existing code for unaligned loads/stores of integers. * wasm2asm: Fix a tricky grow_memory codegen bug This commit fixes a tricky codegen bug found in the `grow_memory` instruction. Specifically if you stored the result of `grow_memory` immediately into memory it would look like: HEAP32[..] = __wasm_grow_memory(..); Here though it looks like JS evaluates the destination *before* the grow function is called, but the grow function will invalidate the destination! Furthermore this is actually generalizable to all function calls: HEAP32[..] = foo(..); Because any function could transitively call `grow_memory`. This commit fixes the issue by ensuring that store instructions are always considered statements, unconditionally evaluating the value into a temporary and then storing that into the destination. While a bit of a pessimization for now it should hopefully fix the bug here. * wasm2asm: Handle offsets in tables This commit fixes initializing tables whose elements have an initial offset. 
This should hopefully help fix some more Rust code which has all function pointers offset by default! * Update tests * Tweak * location on types * Rename entries of NameScope and document fromName * Comment on lowercase names * Update compiled JS * Update js test output expectation * Rename NameScope::Global to NameScope::Top * Switch to `enum class` * Switch to `Fatal()` * Add TODO for when asm.js is no longer generated
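
For the "Fix signed load promotions of 64-bit ints" item in the commit above, the rule for the upper half can be sketched in a few lines. This assumes the i64 is represented as a (low, high) pair of 32-bit words, as in the i64 lowering; `extend_load_to_i64` is a hypothetical helper name, not a Binaryen function.

```python
MASK32 = 0xFFFFFFFF


def extend_load_to_i64(low32, signed):
    """Return (low, high) 32-bit words for a load that was widened to i64.
    For signed loads every upper bit copies the top bit of the low word;
    for unsigned loads the upper word is zero."""
    low32 &= MASK32
    high32 = MASK32 if signed and (low32 & 0x80000000) else 0
    return low32, high32


# A signed 8-bit load of 0x80 arrives in the low word as 0xFFFFFF80 (-128).
negative = extend_load_to_i64(0xFFFFFF80, signed=True)
positive = extend_load_to_i64(0x0000007F, signed=True)
unsigned = extend_load_to_i64(0xFFFFFF80, signed=False)
```

The bug described above corresponds to always returning 0 for the high word, which turns -128 into a large positive 64-bit number.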
* Cleanup scripts in scripts/test (#1566)Sam Clegg2018-05-252-12/+26
| | | | | | | | | | Remove executable bit and #! from scripts that don't have entry point. Add missing licence test. Move arg parsing into a function. Remove legacy --only_prepare (with underscore) argument.
* wasm2asm: Finish i64 lowering operations (#1563)Alex Crichton2018-05-2513-411/+3385
| | | | | | | | | | | | | | | | | * wasm2asm: Finish i64 lowering operations This commit finishes out lowering i64 operations to JS with implementations of division and remainder for JS. The primary change here is to have these compiled from Rust to wasm and then have them "linked in" via intrinsics. The `RemoveNonJSOps` pass has been updated to include some of what `I64ToI32Lowering` was previously doing, basically replacing some instructions with calls to intrinsics. The intrinsics are now all tracked in one location. Hopefully the intrinsics don't need to be regenerated too much, but for posterity the source currently [lives in a gist][gist], although I suspect that gist won't continue to compile and work as-is for all of time. [gist]: https://gist.github.com/alexcrichton/e7ea67bcdd17ce4b6254e66f77165690
* wasm2asm: Finish f32/f64 operations (#1554)Alex Crichton2018-05-1929-1109/+1668
|
* Fix optimizing equivalent locals bug introduced in #1540 (#1556)Alon Zakai2018-05-172-3/+38
| | | Don't skip through flowing tee values, just drop the current outermost which we find is redundant. The child tees may still be necessary.
* wasm2asm: Implement float<->int conversions (#1550)Alex Crichton2018-05-166-7/+929
| | | | | | | | | This commit lifts the same conversion strategy that `emcc` takes to convert between floating-point numbers and integers, and it should implement all the various matrices of i32/u32/i64/u64 to f32/f64. Some refactoring was performed in the i64->i32 pass to allow for temporary variables to get allocated which have types other than i32, but otherwise this contains a pretty direct translation of `emcc`'s operations to `wasm2asm`.
* wasm-emscripten: Don't use debug names in implementedFunctions (#1537)Sam Clegg2018-05-1528-28/+28
| | | | | | | | | | | | | | implementedFunctions should use the export names, not the internal/debug name for a function. This is especially important with lld where the debug names are demangled. implementedFunctions should only contain functions that are accessible from outside the module, i.e. those that have been exported. There is no point in adding internal-only functions to this list as they won't be accessible from outside anyway. Tested with emscripten using: ./tests/runner.py binaryen2.test_time
* wasm2asm: Implement f32/f64.copysign (#1551)Alex Crichton2018-05-154-0/+82
| | | | | | This commit implements the `copysign` instruction for the wasm2asm binary. The implementation here is a new pass which wholesale replaces `copysign` instructions with the equivalent bit ops and reinterpretation instructions. It's intended that this matches Emscripten's implementation of lowering here.
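
The "bit ops and reinterpretation" lowering of `copysign` can be illustrated as below. This is a sketch of the general technique for f64, not the pass's actual output; the helper names are made up for the example.

```python
import math
import struct

SIGN_BIT = 1 << 63
MASK64 = (1 << 64) - 1


def f64_to_bits(x):
    # Reinterpret the 8 bytes of a double as an unsigned 64-bit integer.
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_f64(bits):
    # Reinterpret an unsigned 64-bit integer as a double.
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def copysign_via_bits(x, y):
    """Magnitude bits of x combined with the sign bit of y."""
    magnitude = f64_to_bits(x) & ~SIGN_BIT & MASK64
    sign = f64_to_bits(y) & SIGN_BIT
    return bits_to_f64(magnitude | sign)
```

Working on the raw bits means negative zero is handled correctly: `copysign_via_bits(3.0, -0.0)` is negative, matching `math.copysign`.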
* In full-printing mode, print comments for control flow endings, to help ↵Alon Zakai2018-05-141-0/+3
| | | | | | | | | | | readability (#1552) Like this: (block $x .. ) ;; end block $x Also fix some current breakage on master.
* wasm2asm: Add math aliases for floor, ceil and sqrt (#1549)Daniel Wirtz2018-05-1419-1/+111
|
* Implement 64-bit rotation lowering for wasm2asm (#1545)Alex Crichton2018-05-142-0/+215
| | | | Not much fancy here, but rather each operation is naively lowered inline to the if/else chain to execute it.
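
A rotate on a 64-bit value held as two 32-bit halves naturally splits into a few cases, which is roughly what an inline if/else chain has to cover. The sketch below shows left rotation only and is an illustration under the assumption of a (low, high) word pair; it is not the code the commit generates.

```python
MASK32 = 0xFFFFFFFF


def rotl64(low, high, amount):
    """Rotate the 64-bit value high:low left by `amount`, using 32-bit words."""
    amount &= 63
    if amount == 0:
        return low, high
    if amount == 32:
        return high, low            # rotating by 32 just swaps the halves
    if amount > 32:
        low, high = high, low       # swap, then rotate by the remainder
        amount -= 32
    new_low = ((low << amount) | (high >> (32 - amount))) & MASK32
    new_high = ((high << amount) | (low >> (32 - amount))) & MASK32
    return new_low, new_high
```

Right rotation can be expressed as `rotl64(low, high, 64 - amount)`.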
* wasm2asm: Implement reinterpretation instructions (#1547)Alex Crichton2018-05-132-0/+101
| | | | | | | | | | | | | As mentioned in #1458 a naive implementation of these instructions is to round trip the value through address 0 in linear memory. Also pointed out in #1458 this isn't necessarily valid for all languages. For now, though, languages like Rust, C, and C++ would likely be horribly broken if valid data could be stored at low addresses, so this commit goes ahead and adds an implementation of the reinterpretation instructions by traveling data through address 0. This will likely need an update if a language comes along which can validly store data in the first 8 bytes of linear memory, but it seems like that won't happen in the near future. Closes #1458
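
The round trip through address 0 can be modelled with a byte buffer standing in for linear memory. This is a sketch of the technique only; the `memory` buffer and function names are assumptions for illustration.

```python
import struct

memory = bytearray(64)  # stand-in for wasm linear memory


def reinterpret_f32_as_i32(value):
    # Store the float at address 0, then load the same bytes as an integer.
    struct.pack_into("<f", memory, 0, value)
    return struct.unpack_from("<i", memory, 0)[0]


def reinterpret_i32_as_f32(bits):
    # Store the integer at address 0, then load the same bytes as a float.
    struct.pack_into("<i", memory, 0, bits)
    return struct.unpack_from("<f", memory, 0)[0]
```

The caveat from the commit message is visible here: whatever was stored in the first bytes of `memory` is overwritten on every call.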
* Clean up wasm2asm testing (#1546)Alon Zakai2018-05-1315-0/+0
| | | | | * Move wasm2asm test outputs into their natural location, test/wasm2asm/ * Let people create new tests in there that ./auto_update_tests.py will auto-generate outputs for, just like all the other tests.
* Implement signed 64-bit shift right for wasm2asm (#1544)Alex Crichton2018-05-122-0/+196
| | | | Mostly piggy-back on the previous 64-bit shift lowering code, just filling in a few gaps.
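
For reference, an arithmetic shift right on a (low, high) pair of 32-bit words can be sketched as follows. This is a hedged illustration of the arithmetic, not the lowering code itself.

```python
MASK32 = 0xFFFFFFFF


def shr_s64(low, high, amount):
    """Arithmetic (sign-preserving) right shift of high:low by `amount`."""
    amount &= 63
    if amount == 0:
        return low, high
    negative = bool(high & 0x80000000)
    # Interpret the high word as signed so Python's >> fills with the sign.
    signed_high = high - (1 << 32) if negative else high
    if amount >= 32:
        # The low word comes entirely from the high word; the new high word
        # is all sign bits.
        return (signed_high >> (amount - 32)) & MASK32, MASK32 if negative else 0
    new_low = ((low >> amount) | (high << (32 - amount))) & MASK32
    new_high = (signed_high >> amount) & MASK32
    return new_low, new_high
```

The only difference from the unsigned shift is the fill value: the sign of the high word instead of zero.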
* Merge loop tails up (#1543)Alon Zakai2018-05-1018-7487/+8064
| | | | | | | | | | | | | | | E.g. ``` (block .. (loop $l .. (br_if $l (..)) .. code that does not branch to the loop top ) .. that code could be moved here .. ) ``` Moving the code out of the loop may help the loop body become a singleton expression, and is more readable anyhow.
* Move the renaming of llvm-generated __invoke_XX functions from s2wasm into ↵Sam Clegg2018-05-101-3/+3
| | | | | | | | | wasm-emscripten (#1539) This allows the same functionality to be used also in wasm-emscripten-finalize (i.e. the lld path).
* Optimize equivalent locals (#1540)Alon Zakai2018-05-1021-7860/+7936
| | | | | | | | | If locals are known to contain the same value, we can * Pick which local to use for a get_local of any of them. Makes sense to prefer the most common, to increase the chance of one dropping to zero uses. * Remove copies between a local and one that we know contains the same value. This is a consistent win, small though, around 0.1-0.2%.
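
The "prefer the most common" heuristic can be sketched on a toy model. Note the simplifying assumption: one fixed set of equivalent locals for the whole sequence, whereas the real pass tracks equivalence per program point. `canonicalize_gets` is a hypothetical name.

```python
from collections import Counter


def canonicalize_gets(gets, equivalent):
    """gets: list of local names read by get_local, in order.
    equivalent: set of locals known to hold the same value.
    Rewrites reads of any equivalent local to the most frequently read one."""
    counts = Counter(gets)
    # sorted() makes tie-breaking deterministic.
    best = max(sorted(equivalent), key=lambda local: counts[local])
    return [best if g in equivalent else g for g in gets]


rewritten = canonicalize_gets(["a", "b", "b", "c"], {"a", "b"})
```

After the rewrite, local `a` has no remaining reads, so its set can later be removed as dead.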
* More reducer improvements (#1533)Alon Zakai2018-05-081-2/+2
| | | | | | * Add a helper class to iterate over all a node's children, and use that when attempting to replace a node with its children. * If a child has a different type than the parent, try to replace the parent with a conversion + the child (for example, a call may receive two f32 inputs and return an i32; we can try to replace the call with one of those f32s and a conversion to an i32). * When possible, try to replace the function body with a child even if the child has a different type, by changing the function return value.
* improve remove-unused-module-elements (#1532)Alon Zakai2018-05-044-0/+657
| | | | | Remove the entire memory/table when possible, in particular, when not imported, exported, or used. Previously we did not look at whether they were imported, so we assumed we could never remove them. Also add a variant that removes everything but functions, which can be useful when reducing a testcase that only cares about code in functions.
* misc minor cleanups in the codebase (#1531)Alon Zakai2018-05-041-1/+1
|
* Fix some fuzz bugs (#1528)Alon Zakai2018-05-014-0/+121
| | | | | * remove-unused-brs: handle an if declared as returning a value despite having an unreachable condition * simplify-locals: don't work on loops while the main pass is making changes, as set_locals are being tracked and modified.
* Generate loop return values in optimizer (#1527)Alon Zakai2018-05-0112-2026/+2035
|
* More simplify-locals opts (#1526)Alon Zakai2018-05-0120-31692/+31739
| | | | | | * Use an if return value when one side is unreachable. * Undo an if return value if we can use a br_if instead
* --simplify-locals-nonesting (#1525)Alon Zakai2018-04-303-29/+479
| | | | | Add a version of simplify-locals which does not create nesting. This keeps the IR flat (in the sense of --flatten). Also refactor simplify-locals to be a template, so the various modes are all template parameters.
* flatten improvement (#1522)Alon Zakai2018-04-302-107/+173
|
* do more optimizations after inlining: precompute-propagate plus all regular ↵Alon Zakai2018-04-3017-1971/+1528
| | | | opts (#1523)
* add --converge option to wasm-opt (#1524)Alon Zakai2018-04-302-0/+761
| | | | | The option keeps running the passes (that we were told to run) in cycles until we converge in terms of the binary size, that is, keep optimizing until we can't shrink any more. Also fix a --metrics bug this uncovered: we can't expect the Metrics object to still be around if running passes later in another PassRunner.
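
The convergence loop can be sketched as follows. The `module`, `passes`, and `size_of` parameters are placeholders invented for the example; the sketch shows only the stopping rule described above, not wasm-opt's code.

```python
def run_until_converged(module, passes, size_of):
    """Run all passes in cycles until a cycle no longer shrinks the output.
    Returns the final module and the number of cycles that were run."""
    size = size_of(module)
    cycles = 0
    while True:
        for run_pass in passes:
            module = run_pass(module)
        cycles += 1
        new_size = size_of(module)
        if new_size >= size:
            return module, cycles    # no further shrinking: converged
        size = new_size


# Toy "module": a string; toy "pass": collapse adjacent pairs of 'a'.
collapse_pairs = lambda s: s.replace("aa", "a")
result, cycles_run = run_until_converged("aaaaaaaa", [collapse_pairs], len)
```

The final cycle is the one that detects that nothing changed, so there is always one cycle that does no useful work.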
* optimize selects of constant conditions (#1516)Alon Zakai2018-04-272-14/+95
|
* precompute-propagate may benefit from multiple passes (#1518)Alon Zakai2018-04-272-0/+23
| | | One pass may remove code that includes a tee which then makes more optimization possible. Found by the Souper investigations.
* code-folding improvements (#1512)Alon Zakai2018-04-2611-99/+151
| | | | | | | | Noticed by Souper. * We only folded identical code in an if-else when both arms were blocks, so we were missing the case of one arm being just a singleton expression. This PR wraps that in a block so the rest of the optimization can work on it, if it sees it is going to be folded out. Turns out this is common for phis. * We only ran code-folding in -Os, because I assumed it was just good for code size, but as it may remove phis in the wasm VM later, seems like we should run it when not optimizing for size as well. Together, these two shrink lua -O3 by almost 1%.
* Improve precompute-propagate (#1514)Alon Zakai2018-04-264-36/+53
| | | Propagate constants through a tee_local. Found by Souper. Details in patch comments - basically we didn't differentiate precomputing a value and an expression.
* More math opts (#1507)Alon Zakai2018-04-112-0/+69
| | | `xor` of 0, `and` and `or` of -1
* More simple math opts (#1506)Alon Zakai2018-04-112-4/+359
| | | | | * Optimize shifts of 0. * Optimize f(x, x) for various f (e.g., x & x => x).
* Some simple integer math opts (#1504)Alon Zakai2018-04-115-6/+632
| | | | | | | | | Stuff like x + 5 != 2 => x != -3. Also some cleanups of utility functions I noticed while writing this, isTypeFloat => isFloatType. Inspired by https://github.com/golang/go/blob/master/src/cmd/compile/internal/ssa/gen/generic.rules
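
The rewrite `x + 5 != 2 => x != -3` is valid under wrapping integer arithmetic because adding a constant is a bijection. A small sketch, with a hypothetical helper name, that computes the folded constant and checks the claim exhaustively for 8-bit integers:

```python
def fold_add_into_compare(added, compared, bits=32):
    """For (x + added) != compared, return c such that the test is x != c,
    with all arithmetic wrapping modulo 2**bits."""
    return (compared - added) & ((1 << bits) - 1)


# x + 5 != 2  becomes  x != 0xFFFFFFFD, which is -3 as a signed i32.
folded = fold_add_into_compare(5, 2)

# Exhaustive check on 8-bit integers that both forms agree for every x.
c8 = fold_add_into_compare(5, 2, bits=8)
agrees = all((((x + 5) & 0xFF) != 2) == (x != c8) for x in range(256))
```

The same folding works for `==`; ordered comparisons such as `<` are not safe to fold this way because of overflow.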
* Fix bad param/var type error handling (#1499)Alon Zakai2018-04-102-0/+344
| | | Improve error handling, validation, and assertions for having a non-concrete type in an inappropriate place. Fixes a fuzz testcase.
* br_table optimizations (#1502)Alon Zakai2018-04-1017-111/+566
| | | | | | | | | | Inspired by #1501 * remove unneeded appearances of the default switch target (at the front or back of the list of targets) * optimize a switch with 0, 1 or 2 targets into an if or if-chain * optimize a br_if br pair when they have the same target Makes e.g. fastcomp libc++ 2% smaller. Noticeable improvements on other things like box2d etc.
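
The simplest of these, dropping default targets from the back of the list, can be sketched as below. It relies on the br_table rule that any index past the end of the list goes to the default. Removing defaults from the front also needs the index to be adjusted and is not shown. The function names are made up for the example.

```python
def trim_trailing_defaults(targets, default):
    """Drop entries at the end of the target list that equal the default;
    out-of-range indices branch to the default anyway."""
    trimmed = list(targets)
    while trimmed and trimmed[-1] == default:
        trimmed.pop()
    return trimmed


def branch_target(targets, default, index):
    # br_table semantics: in-range indices pick from the list, others default.
    return targets[index] if index < len(targets) else default


original = ["a", "b", "d", "d"]
smaller = trim_trailing_defaults(original, "d")
```

Every index selects the same label before and after trimming, so the switch behaves identically with a shorter table.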
* Handle literally unreachable brs (#1497)Alon Zakai2018-04-072-0/+9
| | | | | The optimization in #1495 had a bug which was found by the fuzzer: our binary format parsing will not emit unreachable code (it may be stacky, so we ignore it). However, while parsing it we note breaks that are taken there, and then we removed that code, leading to a state where a break was not taken in the code, but we thought it was. This PR clarifies the difference between unreachable code in the wasm sense (anything from the start of a block until an unreachable is "reachable") and the literal sense (even that code at the start may not be literally reachable if the block is not reachable), and then we use literal unreachability to know what code will be ignored and therefore we should ignore breaks in.
* Use .set instead of = for aliases (#1491)Heejin Ahn2018-03-301-4/+4
| | | | | | | | | | | | llvm-mirror/llvm@9273bb3([Phabricator](https://reviews.llvm.org/D44256)) changed alias assignment syntax from ``` x = y ``` to ``` .set x, y ``` This patch reflects the change.
* remap {get,set}_local indices (#1486)Nathan Froyd2018-03-232-0/+101
| | | | | | | | | | When lowering i64 values in a function, we create new local variables for all of the i64 local variables, one local for the low bits, and one for the high bits. We create a mapping between the old locals and the new as well. During translation, when we encountered a `get_local` that didn't have type `i64`, we skipped it, on the supposition that there was nothing to do. But that's not true; the local it was getting may have been remapped to a new index in the lowered function, and we need to account for that change. Similar logic holds for `set_local`.
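
Why non-i64 locals also need remapping can be seen from a small index-mapping sketch. The layout used here (each i64 local replaced in place by a low word followed by a high word) is an assumption chosen for illustration and may differ from what the pass actually does.

```python
def build_local_mapping(local_types):
    """Map each old local index to its new index after i64 locals are split
    into two i32 locals (assumed layout: low word first, high word next)."""
    mapping = {}
    next_index = 0
    for old_index, local_type in enumerate(local_types):
        mapping[old_index] = next_index
        next_index += 2 if local_type == "i64" else 1
    return mapping


mapping = build_local_mapping(["i32", "i64", "f32"])
# The f32 local at old index 2 now lives at index 3, so a get_local of it
# has to be rewritten even though it never had type i64.
```

Skipping `get_local`/`set_local` nodes of non-i64 type would leave them pointing at the old index, which is the bug this commit describes.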
* reorder locals in wasm2asm (#1482)Nathan Froyd2018-03-2215-443/+338
| | | | | | | The documentation for the simplify-locals pass suggests running reorder-locals after it to clean up unnecessary locals. wasm2asm wasn't doing this, which meant that generated code had a number of unused variables. A good minimizer will probably clean that up, but let's go ahead and clean it up in wasm2asm itself.
* add the highbits global to the IR (#1483)Nathan Froyd2018-03-2015-0/+15
| | | | | We were using the global to return 64-bit values from functions, but said global wasn't actually present in the IR. This omission caused the generated code to fail validation.
* fix a fuzz bug in fpcast-emu: if the call_indirect we are modifying is ↵Alon Zakai2018-03-192-0/+59
| | | | unreachable, the modified version is as well (#1481)