summaryrefslogtreecommitdiff
path: root/src/passes/pass.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* Remove support for emscripten legacy PIC ABI (#3299)Sam Clegg2020-10-291-6/+0
|
* Rewrite DCE pass (#3274)Alon Zakai2020-10-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | The DCE pass is one of the oldest in binaryen, and had quite a lot of cruft from the changes in unreachability and other stuff in wasm and binaryen's history. This PR rewrites it from scratch, making it about 1/3 the size. I noticed this when looking for places to use code autogeneration. The old version had annoying boilerplate, while the new one avoids any need for it. There may be noticeable differences, as the old pass did more than it needed to. It overlapped with remove-unused-names for some reason I don't remember. The new pass leaves that to the other pass to do. I added another run of remove-unused-names to avoid noticeable differences in optimized builds, but you can see differences in the testcases that only run DCE by itself. (The test differences in this PR are mostly whitespace.) (The overlap is that if a block ended up not needed, that is, all branches to it were removed, the old DCE would remove the block.) This pass is about 15% faster than the old version. However, when adding another run of remove-unused-names the difference basically vanishes, so this isn't a speedup.
* Remove old/non-working SpillPointers pass (#3261)Sam Clegg2020-10-201-3/+0
| | | | | | | | | And associated stack.h. The current stack.h clearly doesn't work with the llvm back as it assumes the stack grows up, which means non of these has been working or used in a long time. Rather than trying to fix this unused features its probably cleaner to just remove it for now and restore it rom git history if its someone that anyone actually wants to use in the future.
* Remove now-redundant stack pointer manipulation passes (#3251)Sam Clegg2020-10-181-4/+0
| | | | The use of these passes was removed on the emscripten side in https://github.com/emscripten-core/emscripten/pull/12536.
* Log nested pass names in BINARYEN_PASS_DEBUG=2 (#3214)Alon Zakai2020-10-151-9/+15
| | | | We can't validate or print out the wasm in that case, but at least logging the names as they run can help debug some situations.
* Added Initial Memory64Lowering pass (#3230)Wouter van Oortmerssen2020-10-131-0/+4
| | | | This pass will convert a module with 64-bit loads and stores accessing a 64-bit memory to a regular 32-bit one. Pointers remain 64-bit but are truncated just before use.
* Remove RelooperJumpThreading pass, which was just for fastcomp (#3199)Alon Zakai2020-10-081-3/+0
| | | See emscripten-core/emscripten#11860
* Improve testing on Windows (#3142)Wouter van Oortmerssen2020-09-171-10/+8
| | | | | | This PR contains: - Changes that enable/disable tests on Windows to allow for better local testing. - Also changes many abort() into Fatal() when it is really just exiting on error. This is because abort() generates a dialog window on Windows which is not great in automated scripts. - Improvements to CMake to better work with the project in IDEs (VS).
* wasm-emscripten-finalize: Add flags to limit dynCall creation (#3070)Sam Clegg2020-08-261-0/+5
| | | | | | Two new flags here, one to completely removes dynCalls, and another to limit them to only signatures that contains i64. See #3043
* Refactor hashing (#3023)Daniel Wirtz2020-08-121-1/+1
| | | | | | | | | * Unifies internal hashing helpers to naturally integrate with std::hash * Removes the previous custom implementation * Computed hashes are now always size_t * Introduces a hash_combine helper * Fixes an overwritten partial hash in Relooper.cpp
* Add StubUnsupportedJSOps to remove operations that JS does not support (#3024)Alon Zakai2020-08-051-0/+3
| | | | | | | | This doesn't lower them - it just replaces the unsupported operation with a drop. This will be useful for fuzzing, where to compare JS to the correct semantics we must avoid operations where JS is not always accurate. Also fully document the i64 -> f32 conversion issue in JS.
* Move generateDynCallThunks into its own pass. NFC. (#3000)Sam Clegg2020-08-041-0/+3
| | | | | | The core logic is still living in EmscriptenGlueGenerator because its used also by fixInvokeFunctionNames. As a followup we can figure out how to make these more independent.
* New Dealign pass: reduce load/store alignment to 1 (#3010)Alon Zakai2020-07-311-0/+3
| | | | | Pretty trivial, but will be useful in wasm2js testing, where we can't assume an incorrectly-aligned load/store will still work, so we'll need to be pessimistic about alignment there.
* Move stack-check into its own pass (#2994)Sam Clegg2020-07-271-0/+3
| | | | | This new pass takes an optional stack-check-handler argument which is the name of the function to call on stack overflow. If no argument is passed then it just traps.
* Move emscripten PIC ABI conversion to a pass. NFC. (#2985)Sam Clegg2020-07-241-0/+6
| | | | Doing it this way happens to re-order the __assign_got_entries function in the module, but its otherwise NFC.
* Move ReplaceStackPoint into a pass (#2984)Sam Clegg2020-07-241-0/+4
| | | First step in making wasm-emscripten-finalize use more passes.
* DeNaN pass (#2877)Alon Zakai2020-05-271-0/+3
| | | | | | This moves the fuzzer de-NaN logic out into a separate pass. This is cleaner and also better since the old way would de-NaN once, but then the reducer could generate code with nans. The new way lets us de-NaN while reducing.
* Remove redundant vacume pass. Followup on #2741 (#2747)Sam Clegg2020-04-101-1/+0
| | | | Based on freedback in #2741 it looks like we can use the existing `simplify-globals-optimizing` pass to trigger this cleanups we need.
* Remove writes to globals that are never written to (#2741)Sam Clegg2020-04-091-0/+1
| | | | | Since the global is never read, we know that any write operation will be unobservable.
* DWARF: Disable optimization passes not fully compatible with DWARF yet (#2640)Alon Zakai2020-02-061-11/+42
| | | | | | | | | | | Anything that merges/swaps/etc. locals, or inlines, or merges functions, must be disabled for now. However, that does still leave almost all passes, so this should not affect output sizes much (and the full LLVM optimizer can be run before too). Over time we can resolve each of those FIXMEs. The test output here shows how disabling those allows over twice as much debug_line info to be preserved.
* Optionally minify imported module names (#2620)Alon Zakai2020-01-271-0/+4
| | | | | | | | | | | | | | | | This replaces imports like env.foo with a.foo, which can save a bunch of bytes when there are many imported functions. Note that by changing all the import names to a it ends up requiring a single merged import module. Note also that when doing this we modify all the imports, minifying their modules and names (since it makes no sense to be careful about minifying only modules known to us - env/wasi - if we are minifyin the names of all modules). This will require an emscripten PR to benefit from it.
* Optimize passive segments in memory-packing (#2426)Thomas Lively2020-01-151-1/+1
| | | | | | | | | When memory is packed and there are passive segments, bulk memory operations that reference those segments by index need to be updated to reflect the new indices and possibly split into multiple instructions that reference multiple split segments. For some bulk-memory operations, it is necessary to introduce new globals to explicitly track the drop state of the original segments, but this PR is careful to only add globals where necessary.
* DWARF debug line updating (#2545)Alon Zakai2019-12-201-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With this, we can update DWARF debug line info properly as we write a new binary. To do that we track binary locations as we write. Each instruction is mapped to the location it is written to. We must also adjust them as we move code around because of LEB optimization (we emit a function or a section with a 5-byte LEB placeholder, the maximal size; later we shrink it which is almost always possible). writeDWARFSections() now takes a second param, the new locations of instructions. It then maps debug line info from the original offsets in the binary to the new offsets in the binary being written. The core logic for updating the debug line section is in wasm-debug.cpp. It basically tracks state machine logic both to read the existing debug lines and to emit the new ones. I couldn't find a way to reuse LLVM code for this, but reading LLVM's code was very useful here. A final tricky thing we need to do is to update the DWARF section's internal size annotation. The LLVM YAML writing code doesn't do that for us. Luckily it's pretty easy, in fixEmittedSection we just update the first 4 bytes in place to have the section size, after we've emitted it and know the size. This ignores debug lines with a 0 in the line, col, or addr, see WebAssembly/debugging#9 (comment) This ignores debug line offsets into the middle of instructions, which LLVM sometimes emits for some reason, see WebAssembly/debugging#9 (comment) Handling that would likely at least double our memory usage, which is unfortunate - we are run in an LTO manner, where the entire app's DWARF is present, and it may be massive. I think we should see if such odd offsets are a bug in LLVM, and if we can fix or prevent that. This does not emit "special" opcodes for debug lines. Those are purely an optimization, which I wanted to leave for later. (Even without them we decrease the size quite a lot, btw, as many lines have 0s in them...) This adds some testing that shows we can load and save fib2.c and fannkuch.cpp properly. The latter includes more than one function and has nontrivial code. To actually emit correct offsets a few minor fixes are done here: * Fix the code section location tracking during reading - the correct offset we care about is the body of the code section, not including the section declaration and size. * Fix wasm-stack debug line emitting. We need to update in BinaryInstWriter::visit(), that is, right before writing bytes for the instruction. That differs from * BinaryenIRWriter::visit which is a recursive function that also calls the children - so the offset there would be of the first child. For some reason that is correct with source maps, I don't understand why, but it's wrong for DWARF... * Print code section offsets in hex, to match other tools. Remove DWARFUpdate pass, which was useful for testing temporarily, but doesn't make sense now (it just updates without writing a binary). cc @yurydelendik
* DWARF parsing and writing support using LLVM (#2520)Alon Zakai2019-12-191-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This imports LLVM code for DWARF handling. That code has the Apache 2 license like us. It's also the same code used to emit DWARF in the common toolchain, so it seems like a safe choice. This adds two passes: --dwarfdump which runs the same code LLVM runs for llvm-dwarfdump. This shows we can parse it ok, and will be useful for debugging. And --dwarfupdate writes out the DWARF sections (unchanged from what we read, so it just roundtrips - for updating we need #2515). This puts LLVM in thirdparty which is added here. All the LLVM code is behind USE_LLVM_DWARF, which is on by default, but off in JS for now, as it increases code size by 20%. This current approach imports the LLVM files directly. This is not how they are intended to be used, so it required a bunch of local changes - more than I expected actually, for the platform-specific stuff. For now this seems to work, so it may be good enough, but in the long term we may want to switch to linking against libllvm. A downside to doing that is that binaryen users would need to have an LLVM build, and even in the waterfall builds we'd have a problem - while we ship LLVM there anyhow, we constantly update it, which means that binaryen would need to be on latest llvm all the time too (which otherwise, given DWARF is quite stable, we might not need to constantly update). An even larger issue is that as I did this work I learned about how DWARF works in LLVM, and while the reading code is easy to reuse, the writing code is trickier. The main code path is heavily integrated with the MC layer, which we don't have - we might want to create a "fake MC layer" for that, but it sounds hard. Instead, there is the YAML path which is used mostly for testing, and which can convert DWARF to and from YAML and from binary. Using the non-YAML parts there, we can convert binary DWARF to the YAML layer's nice Info data, then convert that to binary. This works, however, this is not the path LLVM uses normally, and it supports only some basic DWARF sections - I had to add ranges support, in fact. So if we need more complex things, we may end up needing to use the MC layer approach, or consider some other DWARF library. However, hopefully that should not affect the core binaryen code which just calls a library for DWARF stuff. Helps #2400
* Write wasm/wast files with BINARYEN_PASS_DEBUG=3 (#2527)Heejin Ahn2019-12-131-3/+3
| | | | | Currently `BINARYEN_PASS_DEBUG=3` prints `.wasm` files but they are actually text wast files. This makes `BINARYEN_PASS_DEBUG=3` prints both wasm/wast files, where wasm contains a binary file and wast a text file.
* Add a RoundTrip pass (#2516)Alon Zakai2019-12-091-0/+3
| | | | | | This pass writes and reads the module. This shows the effects of converting to and back from the binary format, and will be useful in testing dwarf debug support (where we'll need to see that writing and reading a module preserves debug info properly).
* Use wat over wast for text format filenames (#2518)Sam Clegg2019-12-081-1/+1
|
* Add a pass to inline __original_main() into main() (#2461)Alon Zakai2019-11-211-0/+2
| | | | | | | | | | | | | | | | | | clang/llvm introduce __original_main as a workaround for the fact that main may have different signatures. A downside to that is that users get it in stack traces, which is confusing. In -O2 and above we normally inline __original_main anyhow, but as this is for debugging, non-optimized builds matter too, so add a pass for this. The implementation is trivial, just call doInling. However we must check some corner cases first. Bonus minor fixes to FindAllPointers, which unnecessarily created an object to get the class Id (which is not valid for all classes), and that it didn't take the input by reference properly, which meant we couldn't get the pointer to the function body's toplevel.
* Add a --strip-dwarf pass (#2454)Alon Zakai2019-11-191-0/+1
| | | | | | | | | | | | | This pass strips DWARF debug sections, but not other debug sections. This is useful when emitting source maps, as we do need the SourceMapURL section, but the DWARF sections are not longer necessary (and we've seen a testcase where they are massively large, so big the wasm can't even be loaded in a browser...). Also contains a trivial one-line fix in --extract-function which was necessary to create the testcase here: that pass extracts a function from a wasm file (like llvm-extract) but it didn't check if an export already existed for the function.
* Add PostAssemblyScript pass (#2407)Daniel Wirtz2019-11-191-0/+6
| | | | | Adds the AssemblyScript-specific passes post-assemblyscript and post-assemblyscript-finalize, eliminating redundant ARC-style retain/release patterns conservatively emitted by the compiler.
* Add ModAsyncify* passes (#2404)Alon Zakai2019-10-231-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | These passes are meant to be run after Asyncify has been run, they modify the output. We can assume that we will always unwind if we reach an import, or that we will never unwind, etc. This is meant to help with lazy code loading, that is, the ability for an initially-downloaded wasm to not contain all the code, and if code not present there is called, we download all the rest and continue with that. That could work something like this: * The wasm is created. It contains calls to a special import for lazy code loading. * Asyncify is run on it. * The initially downloaded wasm is created by running --mod-asyncify-always-and-only-unwind: if the special import for lazy code loading is called, we will definitely unwind, and we won't rewind in this binary. * The lazily downloaded wasm is created by running --mod-asyncify-never-unwind: we will rewind into this binary, but no longer need support for unwinding. (Optionally, there could also be a third wasm, which has not had Asyncify run on it, and which we'd swap to for max speed.) These --mod-asyncify passes allow the optimizer to do a lot of work, especially for the initially downloaded wasm if we have lots of calls to the lazy code loading import. In that case the optimizer will see that those calls unwind, which means the code after them is not reached, potentially making lots of code dead and removable.
* SimplifyGlobals: Apply known constant values in linear traces (#2340)Alon Zakai2019-09-131-1/+9
| | | | | | | | | | | | | This optimizes stuff like (global.set $x (i32.const 123)) (global.get $x) into (global.set $x (i32.const 123)) (i32.const 123) This doesn't help much with LLVM output as it's rare to use globals (except for the stack pointer, and that's already well optimized), but it may help on general wasm. It can also help with Asyncify that does use globals extensively.
* Duplicate Import Elimination (#2292)Alon Zakai2019-08-091-0/+4
| | | | | | | | | | | | | This is both an optimization and a workaround for the problem that emscripten-core/emscripten#7641 uncovered and had to be reverted because of. What's going on there is that wasm-emscripten-finalize turns emscripten_longjmp_jmpbuf into emscripten_longjmp (for some LLVM internal reason - there's a long comment in the source that I didn't fully follow). There are two such imports already, one for each name, and before that PR, we ended up with just one. After that PR, we end up with two. And with two, the minification of import names gets confused - we have two imports with the same name, and the code there ends up ignoring one of them. I'm not sure why that PR changed things - I guess the wasm-emscripten-finalize code looks at the name, and that PR changed what name appears? @sbc100 maybe #2285 is related? Anyhow, it's not trivial to make import minification code support two identical imports, but I don't think we should - we should avoid having such duplication anyhow. And we should add an assert that they don't exist (I'll open a PR for that later when it's possible). This fixes the duplication by adding a useful pass to remove duplicate imports (just functions, for now). Pretty simple, but we didn't do it yet. Even if there is a wasm-emscripten-finalize bug we need to fix with those duplicate imports, I think this pass is still a good thing to add. I confirmed that this fixes the issue caused by that PR.
* Simpify PassRunner.add() and automatically parallelize parallel functions ↵Alon Zakai2019-07-191-19/+14
| | | | | | | | | (#2242) Main change here is in pass.h, everything else is changes to work with the new API. The add("name") remains as before, while the weird variadic add(..) which constructed the pass now just gets a std::unique_ptr of a pass. This also makes the memory management internally fully automatic. And it makes it trivial to parallelize WalkerPass::run on parallel passes. As a benefit, this allows removing a lot of code since in many cases there is no need to create a new pass runner, and running a pass can be just a single line.
* Cleanups after renaming Bysyncify to Asyncify (#2228)Alon Zakai2019-07-161-2/+0
| | | | | * Clarify the difference between old and new Asyncify. * Remove the old --bysyncify pass option.
* Bysyncify => Asyncify (#2226)Alon Zakai2019-07-151-3/+5
| | | | | | | After some discussion this seems like a less confusing name: what the pass does is "asyncify" code, after all. The one downside is the name overlaps with the old emscripten "Asyncify" utility, which we'll need to clarify in the docs there. This keeps the old --bysyncify flag around for now, which is helpful for avoiding temporary breakage on CI as we move the emscripten side as well.
* Clean up loose ends in feature handling (#2203)Thomas Lively2019-07-031-0/+3
| | | | | Fix and test mutable globals support, replace string literals with constants, and add a pass to emit the target features section.
* Bysyncify: async transform for wasm (#2172)Alon Zakai2019-06-151-0/+3
| | | | | | | | | This adds a new pass, Bysyncify, which transforms code to allow unwind and rewinding the call stack and local state. This allows things like coroutines, turning synchronous code asynchronous, etc. The new pass file itself has a large comment on top with docs. So far the tests here seem to show this works, but this hasn't been tested heavily yet. My next step is to hook this up to emscripten as a replacement for asyncify/emterpreter, see emscripten-core/emscripten#8561 Note that this is completely usable by itself, so it could be useful for any language that needs coroutines etc., and not just ones using LLVM and/or emscripten. See docs on the ABI in the pass source.
* Add --print-function-map to print out a map of function index to name (#2155)Alon Zakai2019-05-311-0/+3
| | | | | | | | | | * work * fix * fix * format
* Allow color API to enable and disable colors (#2111)Siddharth2019-05-171-1/+1
| | | | | | This is useful for front-ends which wish to selectively enable or disable coloring. Also expose these APIs from the C API.
* wasm2js: avoid reinterprets (#2094)Alon Zakai2019-05-101-2/+5
| | | | | In JS a reinterpret is especially expensive, as we implement it as a write to a temp buffer and a read using another view. This finds places where we load a value from memory, then reinterpret it later - in that case, we can load it using another view, at the cost of another load and another local. This is helpful on things like Box2D, where there are many reinterprets due to the main 2D vector class being an union over two floats/ints, and LLVM likes to do a single i64 load of them.
* Emit process ID in the filenames of byn* tempfiles (#1916)Alon Zakai2019-05-101-1/+10
| | | Helps to avoid trampling each other when binaryen is called multiple times from emcc, for example.
* Optimize mutable globals (#2066)Alon Zakai2019-05-021-0/+4
| | | | | | | If a global is marked mutable but not assigned to, make it immutable. If an immutable global is a copy of another, use the original, so we can remove the duplicates. Fixes #2011
* Add a pass to lower unaligned loads and stores (#2078)Alon Zakai2019-05-021-0/+3
| | | | | This replaces the wasm2js code that lowered them to pessimistic (1-byte aligned) loads and stores. The new pass will do the optimal thing, keeping 2-byte alignment where possible. This is also nicer as a standalone pass, which has the simple property that after it runs all loads and stores are aligned, instead of some code scattered inside wasm2js.
* clang-tidy braces changes (#2075)Alon Zakai2019-05-011-1/+2
| | | Applies the changes in #2065, and temprarily disables the hook since it's too slow to run on a change this large. We should re-enable it in a later commit.
* Apply format changes from #2048 (#2059)Alon Zakai2019-04-261-112/+279
| | | Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
* Change default feature set to MVP (#1993)Thomas Lively2019-04-161-0/+1
| | | | | In the absence of the target features section or command line flags. When there are command line flags, it is an error if they do not exactly match the target features section, except if --detect-features has been provided. Also adds a --print-features pass to print the command line flags for all enabled options and uses it to make the feature tests more rigorous.
* Move features from passOptions to Module (#2001)Thomas Lively2019-04-121-2/+2
| | | | | This allows us to emit a (potentially modified) target features section and conditionally emit other sections such as the DataCount section based on the presence of features.
* Move segment merging to fit web limits into its own pass (#1980)Thomas Lively2019-04-081-0/+1
| | | | | | It was previously part of writing a binary, but changing the number of segments at such a late stage would not work in the presence of bulk memory's datacount section. Also updates the memory packing pass to respect the web's limits on the number of data segments.
* better location for running directizeAlon Zakai2019-04-021-1/+1
|