path: root/src
Commit message, Author, Age, Files, Lines
* Run legalize-js-interface during wasm-emscripten-finalize (#1653)Sam Clegg2018-08-293-4/+24
| | | | This ensures that 64-bit values are correctly handled on the JS boundary.
* wasm-emscripten-finalize: Don't allow duplicates in 'declares'/'invok… (#1655)Sam Clegg2018-08-291-2/+10
| | | | | Allowing duplicates here was causing emscripten to generate a JS object with duplicate keys.
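A minimal sketch of the idea behind this fix, assuming the lists are plain sequences of names: drop later repeats while keeping first-seen order, so a JS object built from the list cannot end up with duplicate keys. The function name `dedupe_preserving_order` and the sample names are hypothetical and are not taken from wasm-emscripten-finalize.

```python
def dedupe_preserving_order(names):
    """Return names with later duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for name in names:
        if name not in seen:
            seen.add(name)
            unique.append(name)
    return unique


# Hypothetical 'declares' list that contains a repeated entry.
declares = ["puts", "abort", "puts"]
print(dedupe_preserving_order(declares))
```

Keeping the original order (rather than sorting or using a bare set) keeps the generated metadata stable from run to run.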
* Greatly simplify iteration.h (#1654)Alon Zakai2018-08-291-139/+20
| | | This avoids us needing to update it for new Expression types (SIMD, GC, etc.) in the future.
* wasm-emscripten-finalize: make _wasm_call_ctors optional (#1647)Sam Clegg2018-08-281-1/+3
|
* Improve getFallthrough (#1643)Alon Zakai2018-08-272-18/+38
| | | | | That method looks through tee_locals and other operations that receive a value and let it flow through them, like a block's final value, etc. Previously it handled only a few such operations; with this PR all of them should be handled. Also refactor it out of the OptimizeInstructions pass, as I think it may be useful for propagating returned constants.
* Souper integration + DataFlow optimizations (#1638)Alon Zakai2018-08-2712-5/+2184
| Background: google/souper#323
|
| This adds a --souperify pass, which emits Souper IR in text format. That can then be read by Souper, which can emit superoptimization rules. We hope that eventually we can integrate those rules into Binaryen.
|
| How this works is we emit an internal "DataFlow IR", which is an SSA-based IR, and then write that out into Souper text.
|
| This also adds a --dfo pass, which stands for data-flow optimizations. A DataFlow IR is generated, like in souperify, and the pass then performs some trivial optimizations using it. There are very few things it can do that our other optimizations can't already, but this is also good testing for the DataFlow IR, plus it is good preparation for using Souper's superoptimization output (which would also construct DataFlow IR, like here, but then do some matching on the Souper rules).
* Fix value flowing in remove-unused-brs (#1639)Alon Zakai2018-08-201-24/+28
| | | | | The fuzzer found a bug with flowing of values in that pass: when one arm of an if is none-typed, we can't flow a value through the other. Odd the fuzzer didn't find this earlier, as it's been a bug since the pass was written years ago, but in practice it seems you need a specific set of circumstances on the outside for it to be hit. The fix is to stop flowing a value in that case. Also, I realized after fixing it that the valueCanFlow global state variable is entirely unneeded. Removing it makes the pass significantly simpler: at all times, flows contains branches and values that might be flowing, and if the flow stops we remove them, etc. - we don't need an extra state variable to say if flowing is possible. So when we want to use the flows, we just check what is there (and then for a flowing branch we can remove it, and for a flowing value we can replace the branch with the value, etc., as in both cases they flow to the right place anyhow).
* switch from CMAKE_SOURCE_DIR to PROJECT_SOURCE_DIR to support add_subdirectory(binaryen) (#1637)Jay Phelps2018-08-171-2/+2
|
* Print Stack IR in proper .wat format (#1630)Alon Zakai2018-08-147-420/+618
| | | This now makes --generate-stack-ir --print-stack-ir emit a fully valid .wat wasm file, in stacky format.
* wasm-ctor-eval improvements (#1631)Alon Zakai2018-08-071-0/+4
| * When we eval a ctor, don't just nop the function body that no longer needs to be executed, also remove the export (as we report the ctor being evalled, and the outside will no longer call it).
| * Run the pass to remove unused global things. This can usually remove evalled ctors (unless something else happens to call them, which can't happen normally as LLVM wouldn't use a ctor in another place, but e.g. duplicate function merging might merge a ctor with another function).
* Stack IR (#1623)Alon Zakai2018-07-3020-973/+1988
| This adds a new IR, "Stack IR". This represents wasm at a very low level, as a simple stream of instructions, basically the same as wasm's binary format. This is unlike Binaryen IR, which is structured and in a tree format.
|
| This gives some small wins on binary sizes, less than 1% in most cases, usually 0.25-0.50% or so. That's not much by itself, but looking forward this prepares us for multi-value, which we really need an IR like this to be able to optimize well. Also, it's possible there is more we can do already - currently there are just a few stack IR optimizations implemented:
|
|  * DCE
|  * local2stack - check if a set_local/get_local pair can be removed, which keeps the set's value on the stack so that, if the stars align, it can be popped in place of the get.
|  * Block removal - remove any blocks with no branches, as they are valid in wasm binary format.
|
| Implementation-wise, the IR is defined in wasm-stack.h. A new StackInst is defined, representing a single instruction. Most are simple reflections of Binaryen IR (an add, a load, etc.), and just pointers to them. Control flow constructs are expanded into multiple instructions, like a block turns into a block begin and end, and we may also emit extra unreachables to handle the fact Binaryen IR has unreachable blocks/ifs/loops but wasm does not. Overall, all the Binaryen IR differences with wasm vanish on the way to stack IR.
|
| Where this IR lives: Each Function now has a unique_ptr to stack IR, that is, a function may have stack IR alongside the main IR. If the stack IR is present, we write it out during binary writing; if not, we do the same binaryen IR => wasm binary process as before (this PR should not affect speed there).
|
| This design lets us use normal Passes on stack IR, in particular this PR defines 3 passes:
|
|  * Generate stack IR
|  * Optimize stack IR (might be worth splitting out into separate passes eventually)
|  * Print stack IR for debugging purposes
|
| Having these as normal passes is convenient as then they can run in parallel across functions and all the other conveniences of our current Pass system. However, a downside of keeping the second IR as an option on Functions, and using normal Passes to operate on it, is that we may get out of sync: if you generate stack IR, then modify binaryen IR, then the stack IR may no longer be valid (for example, maybe you removed locals or modified instructions in place etc.). To avoid that, Passes now define if they modify Binaryen IR or not; if they do, we throw away the stack IR.
|
| Miscellaneous notes:
|
|  * Just writing Stack IR, then writing to binary - no optimizations - is 20% slower than going directly to binary, which is one reason why we still support direct writing. This does lead to some "fun" C++ template code to make that convenient: there is a single StackWriter class, templated over the "mode", which is either Binaryen2Binary (direct writing), Binaryen2Stack, or Stack2Binary. This avoids a lot of boilerplate as the 3 modes share a lot of code in overlapping ways.
|  * Stack IR does not support source maps / debug info. We just don't use that IR if debug info is present.
|  * A tiny text format comment (if emitting non-minified text) indicates stack IR is present, if it is ((; has Stack IR ;)). This may help with debugging, just in case people forget. There is also a pass to print out the stack IR for debug purposes, as mentioned above.
|  * The sieve binaryen.js test was actually not validating all along - these new opts broke it in a more noticeable manner. Fixed.
|  * Added extra checks in pass-debug mode, to verify that if stack IR should have been thrown out, it was. This should help avoid any confusion with the IR being invalid.
|  * Added a comment about the possible future of stack IR as the main IR, depending on optimization results, following some discussion earlier today.
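The invalidation rule described in this commit (passes declare whether they modify Binaryen IR, and the stack IR is thrown away if they do) can be illustrated with a toy model. This is only a sketch under stated assumptions: the `Function`, `Pass`, and `run_pass` names below are hypothetical Python stand-ins, not Binaryen's C++ classes, and the "IR" is just a list of strings.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Function:
    name: str
    body: List[str]                        # stand-in for the structured Binaryen IR
    stack_ir: Optional[List[str]] = None   # optional second IR, may be absent


@dataclass
class Pass:
    name: str
    run: Callable[[Function], None]
    modifies_binaryen_ir: bool             # each pass declares this up front


def run_pass(p: Pass, func: Function) -> None:
    """Run a pass, then drop the stack IR if the main IR may have changed."""
    p.run(func)
    if p.modifies_binaryen_ir:
        # The stack IR was derived from the old body, so it may be stale.
        func.stack_ir = None


def generate_stack_ir(func: Function) -> None:
    func.stack_ir = list(func.body)        # toy "lowering": just copy the stream


def drop_nops(func: Function) -> None:
    func.body = [inst for inst in func.body if inst != "nop"]


f = Function("f", ["i32.const 1", "nop", "drop"])
run_pass(Pass("generate-stack-ir", generate_stack_ir, False), f)
print(f.stack_ir is not None)
run_pass(Pass("drop-nops", drop_nops, True), f)
print(f.stack_ir is None)
```

The design choice being illustrated is that staleness is prevented by a declaration on the pass rather than by diffing the two IRs, which keeps the check cheap.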
* Fix source map entries offset when LEB is compressed. (#1628)Yury Delendik2018-07-252-16/+42
|
* Notice parse errors on number parsing in the text format (#1608)Loppin Vincent2018-07-241-0/+6
| * - Throw ParseException when istringstream failed to read a number.
|   - Modify now invalid tests.
| * Add invalid_number.wast test
* Clarify what function-parallel passes can do, and fix an asm2wasm bug (#1627)Alon Zakai2018-07-233-13/+29
| | | | | The problem this fixes is that we made precompute look at globals in #1622, while asm2wasm was creating globals while adding functions and optimizing them - which could race. This was caught by ThreadSanitizer (with low frequency, so we missed it on the initial landing). The underlying issue is that function-parallel passes should be able to read global state, just not modify it, and not read other functions' contents (which is why the Call node has a name, not a pointer to a function). This PR clarifies that in the docs, and fixes asm2wasm by not handling function bodies in parallel to creating globals.
* Some minor LocalGraph improvements (#1625)Alon Zakai2018-07-211-80/+57
| * Remove the Action class - we just need a pointer to a get or set. This simplifies the code and saves a little memory, but doesn't seem to have any impact on speed.
| * Miscellaneous code style and comment changes.
* Mark arguments const in callExport (#1626)Alex Beregszaszi2018-07-211-6/+5
| | | The arguments parameter is read-only and therefore could be const. The immediate benefit is that callers do not need to define it as a local variable (see Literal callExport(Name name)).
* Speedup localgraph (#1610)Loppin Vincent2018-07-201-18/+64
| * LocalGraph : Replace seen unordered_set by boolean check.
| * LocalGraph : use unordered_map to store index -> last set_local instead of vector.
| * LocalGraph :
|   - Use internal counter to avoid invalidation at each cycle.
|   - Move all blocks structs into a contiguous vector of smaller ones.
* Support constant globals in precompute pass (#1622)Daniel Wirtz2018-07-182-24/+32
| This PR includes non-mutable globals in precompute, which will allow me to continue removing manual inlining of constants in AssemblyScript without breaking something.
|
| Related: #1621, i.e.
|
|   enum Animal {
|     CAT = 0,
|     DOG = CAT + 1 // requires that `Animal.CAT` is evaluated to
|                   // precompute the constant value for `Animal.DOG`
|   }
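A rough sketch of what "including non-mutable globals in precompute" means, using the `Animal` example above. The tuple-based expression format and the `precompute` function here are hypothetical illustrations, not Binaryen's Precompute pass; the only idea taken from the commit is that a read of an immutable global can be folded to its constant value, while a mutable one cannot.

```python
def precompute(expr, globals_):
    """Fold expr to an int if it only uses constants and immutable globals.

    Returns None when the value cannot be known at compile time.
    """
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "get_global":
        info = globals_[expr[1]]
        if info["mutable"]:
            return None                    # value may change at runtime
        return precompute(info["init"], globals_)
    if kind == "add":
        left = precompute(expr[1], globals_)
        right = precompute(expr[2], globals_)
        if left is None or right is None:
            return None
        return (left + right) & 0xFFFFFFFF  # wrap like an i32 add
    return None


# Hypothetical module state: CAT is immutable, counter is mutable.
module_globals = {
    "CAT": {"mutable": False, "init": ("const", 0)},
    "counter": {"mutable": True, "init": ("const", 0)},
}
dog = ("add", ("get_global", "CAT"), ("const", 1))
print(precompute(dog, module_globals))
```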
* Refactor stack writing code into a new StackWriter class (#1620)Alon Zakai2018-07-162-227/+247
| | | | | | | This separates out the WasmBinaryWriter parts that do stack writing into a separate class, StackWriter. Previously the WasmBinaryWriter did both the general writing and the stack stuff, and the stack stuff has global state, which it manually cleaned up etc. - seems nicer to have it as a separate class, a class focused on just that one thing. Should be no functional changes in this PR. Also add a timeout to the wasm-reduce test, which happened to fail on one of the commits here. It was running slower on that commit for some reason, could have been random - I verified that general wasm writing speed is unaffected by this PR. (But I added the timeout to prevent future random timeouts.)
* Minor code cleanups (#1617)Alon Zakai2018-07-104-125/+130
| * code cleanups in wasm-binary: remove an & param, and standardize whitespace
| * add some docs for how the relooper handles blocks with no outgoing branches [ci skip]
* Proper error handling in add* and get* methods (#1570)Alon Zakai2018-07-103-32/+97
| See #1479 (comment)
|
| Also a one-line readme update, remove an obsolete compiler (mir2wasm) and add a new one (asterius).
|
| Also improve warning and error reporting in binaryen.js - show a stack trace when relevant (instead of node.js process.exit), and avoid atexit warning spam in debug builds.
* emscripten no longer allows modifying Module['print'] at runtime. Modify the internal out() method instead. see kripken/emscripten#6756 (#1614)Alon Zakai2018-07-031-9/+9
|
* Remove s2wasm (#1607)Sam Clegg2018-06-287-2535/+12
| | | | s2wasm is no longer used by emscripten and, as far as I know, now has no other users.
* Improve source map parsing to handle whitespace (#1598)Sam Clegg2018-06-131-14/+34
|
* Add source map handling to wasm-emscripten-finalize (#1595)Sam Clegg2018-06-102-5/+25
|
* -O4: When -O3 isn't enough (#1596)Alon Zakai2018-06-083-3/+17
| | | | | | | | | This defines a new -O4 optimization mode, as flatten + flat-only opts (currently local-cse) + -O3. In practice, flattening is not needed for LLVM output, which is pretty flat already (no block or if values, etc., even if it does use tees and does nest expressions; and LLVM has already done gvn etc. anyhow). In general, though, wasm generated by a non-LLVM compiler may naturally be nested because wasm allows that. See for example #1593 where an AssemblyScript testcase requires flattening to be fully optimized. So -O4 can help there. -O4 takes 3x longer to run than -O3 in my testing, basically because flat IR is much bigger. But when it's useful it may be worth it. It does handle that AssemblyScript testcase and others like it. There's not much big real-world code that isn't LLVM yet, but running the fuzzer - which happily creates nested stuff all the time - I see -O4 consistently shrink the size by around 20% over -O3.
* Improve local-cse (#1594)Alon Zakai2018-06-084-58/+96
| | | | | This makes it much more effective, by rewriting it to depend on flatten. In flattened IR, it is very simple to check if an expression is equivalent to one already available for use in a local, and use that one instead, basically we just track values in locals. Helps with #1521
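To make "we just track values in locals" concrete, here is a minimal sketch on a toy flat IR, where every instruction is `dest_local = op(operands...)` and operands are only local names or integer constants. It assumes all ops are pure (no loads, calls, or other side effects). The `local_cse` function and the tuple format are hypothetical and are not Binaryen's implementation.

```python
def local_cse(instructions):
    """Replace a repeated pure expression with a copy of the local holding it.

    instructions: list of (dest_local, expr), where expr is a hashable tuple
    such as ("add", "a", "b") whose operands are local names or ints.
    """
    available = {}   # expr -> local that currently holds its value
    out = []
    for dest, expr in instructions:
        reused = available.get(expr)
        out.append((dest, ("copy", reused) if reused is not None else expr))
        # Writing dest invalidates anything held in dest or computed from dest.
        available = {
            e: holder
            for e, holder in available.items()
            if holder != dest and dest not in e[1:]
        }
        # Remember the new value, unless the expression read dest itself.
        if reused is None and dest not in expr[1:]:
            available[expr] = dest
    return out


program = [
    ("t0", ("add", "a", "b")),
    ("t1", ("add", "a", "b")),   # same value as t0
    ("a", ("const", 5)),         # overwrites an operand
    ("t2", ("add", "a", "b")),   # must be recomputed
]
for line in local_cse(program):
    print(line)
```

Flatness is what makes the lookup a plain dictionary hit: operands are never nested expressions, so two instructions compute the same value exactly when their tuples are equal and no operand was overwritten in between.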
* wasm-opt source map support (#1557)Alon Zakai2018-06-075-39/+51
| * support source map input in wasm-opt, refactoring the loading code into wasm-io
| * use wasm-io in wasm-as
| * support output source maps in wasm-opt
| * add a test for wasm-opt and source maps
* duplicate-function-elimination improvements (#1590)Alon Zakai2018-06-076-59/+128
| | | | | | | On a codebase with 370K functions, 160K were in fact duplicate (!)... and it took many many passes to figure that out, over 2 minutes in fact (!), as A and B may be identical only after we see that the functions C1, C2 that they call are identical (so there can be long "chains" here). To avoid this, limit how many passes we do. In -O1, just do one pass - that gets most duplicates. In -O2, do 10 passes - that gets almost all of it on this codebase. And in -O3 (or -Os/-Oz) do as many passes as necessary (i.e., the old behavior). This at least lets iteration builds (-O1) be nice and fast. This PR also refactors the hashing code used in that pass, moving it to nicer header files for clearer readability. Also some other minor cleanups in hashing code that helped debug this.
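The "chains" mentioned above can be shown with a small sketch: A and B only become identical after the functions they call, C1 and C2, have been merged, so one pass is not enough to find every duplicate. The representation (bodies as tuples, calls as `("call", name)`) and the `eliminate_duplicates` function are hypothetical; the real pass works on hashed Binaryen IR.

```python
def eliminate_duplicates(functions, max_passes=None):
    """Merge functions with identical bodies, for at most max_passes passes.

    functions: dict mapping name -> tuple body; ("call", name) items are calls.
    max_passes=None means iterate until nothing changes.
    Returns (remaining_functions, passes_run).
    """
    functions = dict(functions)
    passes = 0
    while max_passes is None or passes < max_passes:
        passes += 1
        canonical = {}   # body -> first function seen with that body
        replace = {}     # duplicate name -> name of the function kept
        for name in sorted(functions):
            body = functions[name]
            if body in canonical:
                replace[name] = canonical[body]
            else:
                canonical[body] = name
        if not replace:
            break
        # Drop the duplicates and retarget calls to the surviving copy.
        functions = {
            name: tuple(
                ("call", replace.get(item[1], item[1]))
                if isinstance(item, tuple) and item[0] == "call"
                else item
                for item in body
            )
            for name, body in functions.items()
            if name not in replace
        }
    return functions, passes


module = {
    "A": (("call", "C1"),),
    "B": (("call", "C2"),),
    "C1": ("i32.const 1",),
    "C2": ("i32.const 1",),
}
print(sorted(eliminate_duplicates(module, max_passes=1)[0]))
print(sorted(eliminate_duplicates(module)[0]))
```

With a limit of one pass only C2 is merged into C1; the unlimited run also merges B into A on the second pass, which mirrors the -O1 versus -O3 trade-off described in the commit.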
* Handle parse errors in wasm-emscripten-finalize (#1589)Sam Clegg2018-06-061-1/+7
|
* Fix check in fixInvokeFunctionNames (#1588)Sam Clegg2018-06-061-1/+1
| | | | This check is supposed to check if a rename is needed, so it needs to compare to the original.
* Ensure import and function names match during fixInvokeFunctionNames (#1587)Sam Clegg2018-06-051-1/+5
| | | | | | We ran into an issue recently where wasm-emscripten-finalize was being passed input without any debug names and this is not currently supported.
* run precompute-propagate early, when we would run it also late, as it is helpful in both positions on general code (#1581)Alon Zakai2018-06-041-2/+7
|
* Add -g/--debuginfo flag to wasm-emscripten-finalize (#1584)Sam Clegg2018-06-041-3/+9
| | | | | This brings this tool into parity with the existing s2wasm
* Always incorporate the table segment offset when calculating jsCallStartIndex (#1579)Jacob Gravelle2018-06-011-4/+4
|
* Optimize validation of many nested blocks (#1576)Alon Zakai2018-05-302-46/+50
| | | | | | | On the testcase from https://github.com/tweag/asterius/issues/19#issuecomment-393052653 this makes us almost 3x faster, and use 25% less memory. The main improvement here is to simplify and optimize the data structures the validator uses to validate br targets: use unordered maps, and use one less of them. Also some speedups from using that map more effectively (use of iterators to avoid multiple lookups). Also move the duplicate-node checks to the internal IR validation section, which makes more sense anyhow (it's not wasm validation, it's internal IR validation, which like the check for stale internal types, we do only if debugging).
* wasm2asm: Fix and enable a large number of spec tests (#1558)Alex Crichton2018-05-295-206/+439
| * Import `abort` from the environment
| * Add passing spec tests
| * Bind the abort function
| * wasm2asm: Fix name collisions
|
|   Currently function names and local names can collide in namespaces, causing buggy results when a function intends to call another function but ends up using a local value as the target! This fix was required to enable the `fac` spec test.
|
| * wasm2asm: Get multiple modules in one file working
|
|   The spec tests seem to have multiple modules defined in some tests and the invocations all use the most recently defined module. This commit updates the `--allow-asserts` mode of wasm2asm to work with this mode of tests, enabling us to enable more spec tests for wasm2asm.
|
| * wasm2asm: Enable the float_literals spec test
|
|   This needed to be modified to account for how JS engines don't work with NaN bits the same way, but it's otherwise largely the same test. Additionally it turns out that asm.js doesn't accept either `Infinity` or `NaN` ambient globals so they needed to get imported through the `global` variable rather than defined as literals in code.
|
| * wasm2asm: Fix function pointer invocations
|
|   This commit fixes invocations of functions through function pointers as previously the table names on lookup and definition were mismatched. Both tables now go through signature-based namification rather than the name of the type itself. Overall this enables a slew of spec tests.
|
| * wasm2asm: Enable the left-to-right spec test
|
|   There were two small bugs in the order of evaluation of operators with wasm2asm. The `select` instruction would sometimes evaluate the condition first when it was supposed to be last. Similarly a `call_indirect` instruction would evaluate the function pointer first when it was supposed to be evaluated last. The `select` instruction case was a relatively small fix but the one for `call_indirect` was a bit more pessimized to generate some temporaries. Hopefully if this becomes a problem it can be tightened up.
|
| * wasm2asm: Fix signed load promotions of 64-bit ints
|
|   This commit enables the `endianness` spec test which revealed a bug in 64-bit loads from smaller sizes which were signed. Previously the upper bits of the 64-bit number were all set to zero but the fix was for signed loads to have all the upper bits match the highest bit of the low 32 bits that we load.
|
| * wasm2asm: Enable the `stack` spec test
|
|   Internally the spec test uses a mixture of the s-expression syntax and the wat syntax, so this is copied over into the `wasm2asm` folder after going through `wat2wasm` to ensure it's consistent for binaryen.
|
| * wasm2asm: Fix unaligned loads/stores of floats
|
|   Replace these operations in `RemoveNonJSOps` by using reinterpretation to translate floats to integers and then use the existing code for unaligned loads/stores of integers.
|
| * wasm2asm: Fix a tricky grow_memory codegen bug
|
|   This commit fixes a tricky codegen bug found in the `grow_memory` instruction. Specifically if you stored the result of `grow_memory` immediately into memory it would look like:
|
|       HEAP32[..] = __wasm_grow_memory(..);
|
|   Here though it looks like JS evaluates the destination *before* the grow function is called, but the grow function will invalidate the destination! Furthermore this is actually generalizable to all function calls:
|
|       HEAP32[..] = foo(..);
|
|   Because any function could transitively call `grow_memory`. This commit fixes the issue by ensuring that store instructions are always considered statements, unconditionally evaluating the value into a temporary and then storing that into the destination. While a bit of a pessimization for now it should hopefully fix the bug here.
|
| * wasm2asm: Handle offsets in tables
|
|   This commit fixes initializing tables whose elements have an initial offset. This should hopefully help fix some more Rust code which has all function pointers offset by default!
|
| * Update tests
| * Tweak * location on types
| * Rename entries of NameScope and document fromName
| * Comment on lowercase names
| * Update compiled JS
| * Update js test output expectation
| * Rename NameScope::Global to NameScope::Top
| * Switch to `enum class`
| * Switch to `Fatal()`
| * Add TODO for when asm.js is no longer generated
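The signed load fix in this commit comes down to one rule: when a signed load narrower than 64 bits is lowered to a pair of 32-bit halves, the high half must replicate the sign bit of the low half instead of being zero. A minimal sketch of that rule follows; the helper names are hypothetical and the halves are modelled as unsigned Python integers rather than asm.js values.

```python
MASK32 = 0xFFFFFFFF


def sign_extend_low_to_i64(low):
    """Return the (low, high) 32-bit halves of an i64 for a signed load.

    low is the already sign-extended 32-bit result of the narrow load.
    """
    low &= MASK32
    # Every upper bit copies bit 31 of the low half.
    high = MASK32 if low & 0x80000000 else 0
    return low, high


def to_signed_i64(low, high):
    """Combine the two halves and interpret them as a signed 64-bit value."""
    value = (high << 32) | low
    return value - (1 << 64) if value & (1 << 63) else value


# An 8-bit load of 0x80 sign-extends to 0xFFFFFF80 in 32 bits.
low, high = sign_extend_low_to_i64(0xFFFFFF80)
print(hex(low), hex(high), to_signed_i64(low, high))
```

With the old behavior (high half always zero) the same load would have produced a large positive number rather than a small negative one.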
* allow --total-memory to be greater than a signed int32 (#1565)Alon Zakai2018-05-261-1/+1
|
* Fix embedwast.py for out-of-tree building (#1569)Sam Clegg2018-05-251-2/+2
|
* wasm2asm: Finish i64 lowering operations (#1563)Alex Crichton2018-05-259-685/+1307
| * wasm2asm: Finish i64 lowering operations
|
|   This commit finishes out lowering i64 operations to JS with implementations of division and remainder for JS. The primary change here is to have these compiled from Rust to wasm and then have them "linked in" via intrinsics. The `RemoveNonJSOps` pass has been updated to include some of what `I64ToI32Lowering` was previously doing, basically replacing some instructions with calls to intrinsics. The intrinsics are now all tracked in one location.
|
|   Hopefully the intrinsics don't need to be regenerated too much, but for posterity the source currently [lives in a gist][gist], although I suspect that gist won't continue to compile and work as-is for all of time.
|
|   [gist]: https://gist.github.com/alexcrichton/e7ea67bcdd17ce4b6254e66f77165690
* wasm2asm: Finish f32/f64 operations (#1554)Alex Crichton2018-05-198-304/+521
|
* Fix optimizing equivalent locals bug introduced in #1540 (#1556)Alon Zakai2018-05-171-4/+2
| | | Don't skip through flowing tee values, just drop the current outermost which we find is redundant. The child tees may still be necessary.
* wasm2asm: Implement float<->int conversions (#1550)Alex Crichton2018-05-166-58/+472
| This commit lifts the same conversion strategy that `emcc` takes to convert between floating-point numbers and integers, and it should implement all the various matrices of i32/u32/i64/u64 to f32/f64.
|
| Some refactoring was performed in the i64->i32 pass to allow for temporary variables to get allocated which have types other than i32, but otherwise this contains a pretty direct translation of `emcc`'s operations to `wasm2asm`.
* Clean up printing code (#1548)Alon Zakai2018-05-153-36/+34
| * make the iostream overrides receive a reference, not a pointer (i.e., like e.g. LLVM IR printing works, and avoiding overriding printing of pointer addresses which is sort of odd)
| * move more code out of headers, especially unrelated headers.
* wasm-emscripten: Don't use debug names in implementedFunctions (#1537)Sam Clegg2018-05-151-2/+4
| implementedFunctions should use the export names, not the internal/debug name for a function. This is especially important with lld, where the debug names are demangled.
|
| implementedFunctions should only contain functions that are accessible from outside the module, i.e. those that have been exported. There is no point in adding internal-only functions to this list as they won't be accessible from outside anyway.
|
| Tested with emscripten using: ./tests/runner.py binaryen2.test_time
* wasm2asm: Implement f32/f64.copysign (#1551)Alex Crichton2018-05-155-0/+110
| | | | | | This commit implements the `copysign` instruction for the wasm2asm binary. The implementation here is a new pass which wholesale replaces `copysign` instructions with the equivalent bit ops and reinterpretation instructions. It's intended that this matches Emscripten's implementation of lowering here.
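The "equivalent bit ops and reinterpretation instructions" for `copysign` can be sketched as follows for the f64 case: reinterpret both floats as 64-bit integers, keep every bit except the sign from the first operand, take only the sign bit from the second, and reinterpret back. This is a generic illustration using Python's `struct` module; the helper names are hypothetical and this is not the code the pass emits.

```python
import struct

SIGN_BIT = 1 << 63
MAGNITUDE_MASK = SIGN_BIT - 1   # all 63 bits below the sign bit


def f64_to_bits(x):
    """Reinterpret a float as its 64-bit IEEE 754 pattern."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_f64(bits):
    """Reinterpret a 64-bit pattern as a float."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def copysign_f64(magnitude, sign):
    """Return magnitude with its sign bit replaced by the sign bit of sign."""
    bits = (f64_to_bits(magnitude) & MAGNITUDE_MASK) | (f64_to_bits(sign) & SIGN_BIT)
    return bits_to_f64(bits)


print(copysign_f64(1.5, -2.0))
print(copysign_f64(-3.0, 0.0))
```

Working on the bit pattern, rather than comparing against zero, is what keeps negative zero and NaN signs correct.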
* In full-printing mode, print comments for control flow endings, to help readability (#1552)Alon Zakai2018-05-141-0/+22
| Like this:
|
|   (block $x
|     ..
|   ) ;; end block $x
|
| Also fix some current breakage on master.
* wasm2asm: Add math aliases for floor, ceil and sqrt (#1549)Daniel Wirtz2018-05-141-0/+3
|
* Implement 64-bit rotation lowering for wasm2asm (#1545)Alex Crichton2018-05-141-2/+193
| | | | Not much fancy here, but rather each operation is naively lowered inline to the if/else chain to execute it.
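For readers unfamiliar with what lowering a 64-bit rotation onto 32-bit halves involves, here is a small sketch of a left rotation expressed as a chain of cases on the rotation amount. It is a generic illustration, not the code this commit generates: `rotl64` and `rotl64_reference` are hypothetical names, and the reference version simply uses Python's arbitrary-precision integers for comparison.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def rotl64(low, high, amount):
    """Rotate the 64-bit value (high:low) left, using only 32-bit pieces."""
    amount &= 63
    if amount == 0:
        return low, high
    if amount == 32:
        return high, low                 # rotating by 32 swaps the halves
    if amount > 32:
        low, high = high, low            # rotate by 32 first ...
        amount -= 32                     # ... then by the remainder (1..31)
    new_low = ((low << amount) | (high >> (32 - amount))) & MASK32
    new_high = ((high << amount) | (low >> (32 - amount))) & MASK32
    return new_low, new_high


def rotl64_reference(value, amount):
    """Same rotation on a single 64-bit integer, for comparison."""
    amount &= 63
    return ((value << amount) | (value >> (64 - amount))) & MASK64


value = 0x0123456789ABCDEF
low, high = rotl64(value & MASK32, value >> 32, 8)
print(hex((high << 32) | low), hex(rotl64_reference(value, 8)))
```

The separate cases for 0 and 32 exist because a 32-bit shift by 32 is not usable in the general formula; that is the kind of if/else chain a naive inline lowering ends up with.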
* wasm2asm: Implement reinterpretation instructions (#1547)Alex Crichton2018-05-132-3/+58
| As mentioned in #1458, a naive implementation of these instructions is to round-trip the value through address 0 in linear memory. As also pointed out in #1458, this isn't necessarily valid for all languages. For now, though, languages like Rust, C, and C++ would likely be horribly broken if valid data could be stored at low addresses, so this commit goes ahead and adds an implementation of the reinterpretation instructions by traveling data through address 0.
|
| This will likely need an update if a language comes along which can validly store data in the first 8 bytes of linear memory, but it seems like that won't happen in the near future.
|
| Closes #1458
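The round trip through address 0 can be modelled with a `bytearray` standing in for linear memory: store the value with one type, then load the same bytes with the other type. This is only a sketch of the technique the commit describes; the function names are hypothetical, and, as in the commit, whatever was stored in the first bytes of memory is overwritten.

```python
import struct


def reinterpret_f32_to_i32(memory, value):
    """Store an f32 at address 0, then load the same 4 bytes as a signed i32."""
    struct.pack_into("<f", memory, 0, value)       # like f32.store at address 0
    return struct.unpack_from("<i", memory, 0)[0]  # like i32.load at address 0


def reinterpret_i32_to_f32(memory, bits):
    """Store a signed i32 at address 0, then load the same 4 bytes as an f32."""
    struct.pack_into("<i", memory, 0, bits)
    return struct.unpack_from("<f", memory, 0)[0]


memory = bytearray(16)   # toy linear memory; bytes 0..7 act as scratch space
print(hex(reinterpret_f32_to_i32(memory, 1.0)))
print(reinterpret_i32_to_f32(memory, 0x3F800000))
```

The 64-bit variants work the same way with 8 bytes, which is why the commit talks about the first 8 bytes of linear memory being reserved in practice.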