path: root/test/passes
Commit message | Author | Age | Files | Lines
* Improve inlining "heavyweight" (#3085) | Max Graey | 2020-09-04 | 4 | -93/+175
  Split that mode into an option to check for loops (which indicate a function is "heavy") and a constant check for having calls. The case of calls is different, as we would need more logic to avoid infinite recursion if we were willing to inline functions with calls. Practically, this renames allowHeavyweight to allowFunctionsWithLoops.
* Remove old stack function from StackCheck (#3100) | Alon Zakai | 2020-09-03 | 1 | -7/+0
* MinifyImportsAndExports: Minify the memory and table as well. (#3089) | Alon Zakai | 2020-09-02 | 3 | -20023/+20029
  We were careful not to minify those, as well as the stack pointer, which makes sense in dynamic linking. But we don't run this pass in dynamic linking anyhow - we need the proper names of symbols in that case. So this was not helping us, and was just a leftover from an early state.

  This is both a useful optimization and also important for #3043, as the wasm backend exports the table as __indirect_function_table - a much longer name than emscripten's table. So just changing to that would regress code size on small projects. Once we land this, the name won't matter as it will be minified anyhow.
* StackCheck: Check both under and overflow (#3091) | Alon Zakai | 2020-09-02 | 2 | -0/+55
  See emscripten-core/emscripten#9039 (comment)

  The valid stack area is a region [A, B] in memory. Previously we just checked that new stack positions S were S >= A, which prevented us from growing too much (the stack grows down). But that only worked if the growth was small enough to not overflow and become a big unsigned value. This PR makes us check the other way too, which requires us to know where the stack starts out at.

  This still supports the old way of just passing in the growth limit. We can remove it after the roll.

  In principle this can all be done on the LLVM side too after emscripten-core/emscripten#12057 but I'm not sure of the details there, and this is easy to fix here and get testing up (which can help with later LLVM work).

  This helps emscripten-core/emscripten#11860 by allowing us to clean up some fastcomp-specific stuff in tests.
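The two-sided check described in the StackCheck entry above can be sketched as follows. This is a minimal illustration, not the pass's actual instrumentation: it assumes 32-bit unsigned addresses, and the names `old_check`, `new_check`, and `grow` are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumption: wasm32, so addresses are unsigned 32-bit


def grow(sp, size):
    """The stack grows down; the subtraction wraps as an unsigned 32-bit value."""
    return (sp - size) & MASK32


def old_check(new_sp, limit):
    """One-sided check: only compares against the lower end A of [A, B]."""
    return new_sp >= limit


def new_check(new_sp, limit, base):
    """Two-sided check: the new position must lie inside [A, B]."""
    return limit <= new_sp <= base


# Hypothetical layout: A = 1024 (growth limit), B = 65536 (where the stack starts).
LIMIT, BASE = 1024, 65536
wrapped = grow(BASE, 100000)  # growth so large that the result wraps around
print(old_check(wrapped, LIMIT), new_check(wrapped, LIMIT, BASE))
```

With a huge growth the wrapped value looks like a large unsigned number, so the one-sided check accepts it while the two-sided check rejects it.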
* Add allowHeavyweight inlining option (#3032) | Max Graey | 2020-08-26 | 2 | -0/+93
  As discussed in #2921, this allows inlining of functions not identified as "lightweight" (that include a loop, for example).
* wasm-emscripten-finalize: Add flags to limit dynCall creation (#3070) | Sam Clegg | 2020-08-26 | 2 | -0/+30
  Two new flags here, one to completely remove dynCalls, and another to limit them to only signatures that contain i64.

  See #3043
* SAFE_HEAP: remove fastcomp, prepare for new emscripten approach (#3078) | Alon Zakai | 2020-08-25 | 2 | -2/+8
  In fastcomp we implemented emscripten_get_sbrk_ptr in wasm, and exported _emscripten_get_sbrk_ptr. We don't need that anymore and can remove it.

  However I want to switch us to implementing emscripten_get_sbrk_ptr in wasm in upstream too, as part of removing DYNAMICTOP_PTR and other silliness that we have around link (#3043). This makes us support an export of emscripten_get_sbrk_ptr (no prefix), and also it makes sure not to instrument that function, which may contain some memory operations itself, but if we SAFE_HEAP-ify them we'd get infinite recursion, as the SAFE_HEAP methods need to call that.
* also drop size for memory.copy(x, x, y) (#3075) | Max Graey | 2020-08-24 | 1 | -0/+3
  This fixes a bug in which a side effect in the calculation of the size could be lost.
* memory.copy: use nop reductions only for ignoreImplicitTraps (#3074) | Max Graey | 2020-08-24 | 4 | -8/+37
  According to changes in the spec:

  WebAssembly/bulk-memory-operations#124
  WebAssembly/bulk-memory-operations#145

  we unfortunately can't fold to nop even for memory.copy(x, y, 0). So this PR reverts all reductions to nop, and performs them only under the ignoreImplicitTraps flag.
* Remove optimization for memory.copy(x, x, C) (#3073) | Max Graey | 2020-08-23 | 2 | -2/+6
  That can trap, so we can only remove it if traps are ignored, which was not handled properly. Revert it as we consider the options.
* OptimizeInstructions on memory.copy: check size for side effect as well (#3072) | Max Graey | 2020-08-23 | 2 | -1/+16
  Fix issue found by fuzzer: #3038 (comment)
* Optimize bulk memory.copy (#3038) | Max Graey | 2020-08-22 | 2 | -0/+140
  Replace it with a load and a store when the size is a small constant and remove it entirely when it would be a nop.
* Skip tests that fail on windows and enable all the rest (#3035) | Alon Zakai | 2020-08-11 | 24 | -0/+0
  This lets us run most tests at least on that platform.

  Add a new function for skipping those tests, skip_if_on_windows, so that it's easy to find which tests are disabled on windows for later fixing efforts.

  This fixes a few minor issues for windows, like comparisons should ignore \r in some cases.

  Rename all passes tests that use --dwarfdump to contain "dwarf" in their name, which makes it easy to skip those (and is clearer anyhow).
* DWARF: Fix debug_info references to the abbreviations section (#2997) | Alon Zakai | 2020-08-07 | 3 | -0/+1
  The previous code assumed that each compile unit had its own abbreviation section, and they are all in order. That's normally how LLVM emits things, but in #2992 there is a testcase in which linking of object files with IR files somehow ends up with a different order.

  The proper fix is to track the binary offsets of abbreviations in the abbreviation section. That section is comprised of null-terminated lists, which each CU has an offset to the beginning of. With those offsets, we can match things properly.

  Add a testcase that crashes without this, to prevent regressions.

  Fixes #2992
  Fixes #3007
* StubUnsupportedJSOps: Remove CallIndirects (#3027) | Alon Zakai | 2020-08-06 | 2 | -0/+47
  wasm2js does not have full call_indirect support as we don't trap if the type is incorrect, which wasm does. Therefore the StubUnsupportedJSOps pass needs to remove those operations so that the fuzzer doesn't find spurious issues.
* Asyncify verbose option (#3022) | Alon Zakai | 2020-08-06 | 2 | -0/+372
  This logs out the decisions made about instrumenting functions, which can help figure out why a function is instrumented, or to get a list of what might need to be.

  As the test shows, it can print things like this:

    [asyncify] import is an import that can change the state
    [asyncify] calls-import can change the state due to import
    [asyncify] calls-calls-import can change the state due to calls-import
    [asyncify] calls-calls-calls-import can change the state due to calls-calls-import

  (the test has calls-calls-calls-import => calls-calls-import => calls-import -> import).
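The way such a log could arise from propagating "can change the state" up a call graph is sketched below. This is an illustration only: the fixed-point loop and the function name `explain_state_changes` are assumptions, not Asyncify's actual analysis, and only the message format is taken from the entry above.

```python
def explain_state_changes(calls, imports):
    """Propagate 'can change the state' from imports to their callers.

    calls maps each function name to the list of functions it calls.
    Returns log lines in the style quoted in the commit message.
    """
    marked = set()
    log = []
    for name in sorted(imports):
        marked.add(name)
        log.append(f"[asyncify] {name} is an import that can change the state")
    changed = True
    while changed:  # iterate until no new function gets marked
        changed = False
        for func, callees in calls.items():
            if func in marked:
                continue
            for callee in callees:
                if callee in marked:
                    marked.add(func)
                    log.append(
                        f"[asyncify] {func} can change the state due to {callee}"
                    )
                    changed = True
                    break
    return log


# The call chain from the test described above.
graph = {
    "calls-calls-calls-import": ["calls-calls-import"],
    "calls-calls-import": ["calls-import"],
    "calls-import": ["import"],
}
for line in explain_state_changes(graph, {"import"}):
    print(line)
```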
* Add StubUnsupportedJSOps to remove operations that JS does not support (#3024) | Alon Zakai | 2020-08-05 | 2 | -0/+38
  This doesn't lower them - it just replaces the unsupported operation with a drop. This will be useful for fuzzing, where to compare JS to the correct semantics we must avoid operations where JS is not always accurate.

  Also fully document the i64 -> f32 conversion issue in JS.
* Move generateDynCallThunks into its own pass. NFC. (#3000) | Sam Clegg | 2020-08-04 | 2 | -0/+37
  The core logic is still living in EmscriptenGlueGenerator because it's also used by fixInvokeFunctionNames. As a followup we can figure out how to make these more independent.
* AlignmentLowering: Handle all possible cases for i64, f32, f64 (#3008) | Alon Zakai | 2020-07-31 | 2 | -9/+1453
  Previously we only handled i32. That was enough for all real-world code people have run through wasm2js apparently (which is the only place the pass is needed - it lowers unaligned loads to individual loads etc., as unaligned operations fail in JS). Apparently it's pretty rare to have unaligned f32 loads for example.

  This will be useful in fuzzing wasm2js, as without this we can't compare results to the interpreter (which does alignment properly).
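Lowering an unaligned load to individual loads, as the entry above describes, can be sketched like this. It is a simplified model assuming alignment 1 and little-endian memory; the helper names are hypothetical, and the real pass may emit a different sequence (for example wider pieces when the alignment is 2).

```python
import struct


def load8(mem, addr):
    """A single byte load, which is always aligned."""
    return mem[addr]


def lowered_load_i32(mem, addr):
    """An unaligned i32 load rebuilt from four byte loads (little-endian)."""
    return (
        load8(mem, addr)
        | load8(mem, addr + 1) << 8
        | load8(mem, addr + 2) << 16
        | load8(mem, addr + 3) << 24
    )


def lowered_load_i64(mem, addr):
    """An i64 load as two lowered i32 loads combined."""
    return lowered_load_i32(mem, addr) | lowered_load_i32(mem, addr + 4) << 32


def lowered_load_f32(mem, addr):
    """An f32 load as a lowered i32 load followed by a reinterpret."""
    bits = lowered_load_i32(mem, addr)
    return struct.unpack("<f", struct.pack("<I", bits))[0]


memory = bytearray(16)
memory[1:5] = struct.pack("<f", 1.5)  # stored at an odd (unaligned) address
print(lowered_load_f32(memory, 1))
```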
* New Dealign pass: reduce load/store alignment to 1 (#3010) | Alon Zakai | 2020-07-31 | 2 | -0/+44
  Pretty trivial, but will be useful in wasm2js testing, where we can't assume an incorrectly-aligned load/store will still work, so we'll need to be pessimistic about alignment there.
* Better const fuzzing (#2972) | Alon Zakai | 2020-07-30 | 2 | -405/+341
  Tweak floating-point numbers with not just a +-1 integer, but also a float in [-1, 1]. Apply a tweak to powers of 2 as well.

  This found bugs in various codebases, see WebAssembly/spec#1224
* Remove dynCall generated from fpcast-emu (#2995) | Sam Clegg | 2020-07-28 | 1 | -184/+0
  This is a precursor to moving dynCall generation into a pass of its own. It seems to be up to the caller if they want to run dynCall generation either before or after fpcast-emu.

  Verified that this change does not affect emscripten's wasm2 or other test suites.
* AvoidReinterprets should not remove code around a reinterpret's value's fallthrough (#2989) | Alon Zakai | 2020-07-28 | 2 | -0/+33
  We can turn a reinterpret of a load into a different load, and so forth, but if the reinterpret has a non-load child with a load fallthrough, that's not good enough - we can't remove the extra code:

    (reinterpret
      (block
        ..extra code..
        (load)
      )
    )

  That can't be turned into a load of the flipped type.
* Fix the side effects of data.drop (#2996) | Alon Zakai | 2020-07-28 | 3 | -6/+47
  We marked it as readsMemory so that it could be reordered with various things, except for memory.init. However, the fuzzer found that's not quite right, as it has a global side effect - memory.inits that run later can notice that. So it can't be reordered with anything that might prevent global side effects from happening, as in the testcase added here (an instruction that may trap cannot be reordered with a data.drop, as it may prevent the data.drop from happening and changing global state).

  There may be a way to optimize this more carefully that would allow more optimizations, but as this is a rare instruction I'm not sure it's worth more work.
* DWARF: Do not reorder locals in binary writing (#2959) | Alon Zakai | 2020-07-23 | 11 | -6133/+6859
  The binary writer reorders locals unconditionally. I forgot about this, and so when I made DWARF disable optimization passes that reorder, this was left active.

  Optimally the writer would not do this, and the ReorderLocals pass would. But it looks like we need special logic for tuple locals anyhow, as they expand into multiple locals, so some amount of local order changes seems unavoidable atm.

  Test changes are mostly just lots of offsets, and can be ignored, but the new test test/passes/dwarf-local-order.* shows the issue. It prints $foo once, then after a roundtrip (showing no reordering), then it strips the DWARF section and prints after another roundtrip (which does show reordering).

  This also makes us avoid the Stack IR writer if DWARF is present, which matches what we do with source maps. This doesn't prevent any known bugs, but it's simpler this way and debugging + Stack IR opts is not an important combination.
* Optimize select with const arms (#2869) | Max Graey | 2020-07-22 | 2 | -3/+379
  x ? 1 : 0 => !!x and so forth.
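The rewrite quoted above can be checked on a small model of wasm's `select` and `i32.eqz`. This is only a sketch: the `x ? 0 : 1 => !x` case is an assumed reading of "and so forth", and the helper names are hypothetical.

```python
def select(if_true, if_false, cond):
    """Model of wasm select: picks if_true when cond is nonzero."""
    return if_true if cond != 0 else if_false


def eqz(x):
    """Model of i32.eqz, i.e. the `!` in the commit message."""
    return 1 if x == 0 else 0


def optimized_select_1_0(x):
    """x ? 1 : 0 rewritten as !!x."""
    return eqz(eqz(x))


def optimized_select_0_1(x):
    """Assumed companion case: x ? 0 : 1 rewritten as !x."""
    return eqz(x)


for value in (0, 1, 2, 0xFFFFFFFF):
    print(value, select(1, 0, value), optimized_select_1_0(value))
```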
* Add v128 support to instrument locals (#2960) | Ng Zhi An | 2020-07-17 | 2 | -0/+24
  In instrument-locals_all-features.wast I added the tests to the end of the file so that the diff of expected output is smaller and easier to read. Otherwise the constants will have to all change since they are order dependent.
* Interpreter: Don't change NaN bits when dividing by 1 (#2958) | Alon Zakai | 2020-07-15 | 2 | -0/+34
  It's valid to change NaN bits in that case per the wasm spec, but if we do so then fuzz testcases will fail on the optimization of

    nan:foo / 1 => nan:foo

  That is, it is ok to leave the bits as they are, and if we do that then we are consistent with the simple and valid optimization of removing a divide by 1.

  Found by the fuzzer - looks like on x64 on some float32 NaNs, the bits will actually change (see the testcase). I've seen this on two machines consistently, so it's normal apparently.

  Disable an old wasm spectest that has been updated in upstream anyhow, but the new test here is even more strict and verifies the interpreter literally changes no bits.
* NoExitRuntime pass: Don't assume arguments have no side effects (#2953) | Alon Zakai | 2020-07-10 | 2 | -5/+171
  This bug was present from the very first version of this pass from 2018, but it went unnoticed until now when a large project broke on it, for some reason after emscripten-core/emscripten#11403

  Nothing wrong in that PR, probably just luck that it started to happen there...
* Optimize booleans when argument is negative integer (#2930) | Max Graey | 2020-07-08 | 2 | -1/+22
  bool(-x) ==> bool(x)
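Why this rewrite is safe can be seen with a small model of 32-bit wrapping negation: `0 - x` is zero exactly when `x` is zero. The helper names below are hypothetical and the sketch assumes i32 semantics.

```python
MASK32 = 0xFFFFFFFF  # assumption: i32 arithmetic wraps modulo 2**32


def neg32(x):
    """Model of (i32.sub (i32.const 0) x) with wraparound."""
    return (0 - x) & MASK32


def as_bool(x):
    """Model of using a value in a boolean context (nonzero is true)."""
    return x != 0


for value in (0, 1, 0x80000000, MASK32):
    print(value, as_bool(neg32(value)), as_bool(value))
```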
* DWARF: Never emit (0, 0) to mean an empty span in debug_loc (#2940) | Alon Zakai | 2020-07-01 | 3 | -0/+723
  After mapping to the new positions, and after relativizing to the base, if we end up with (0, 0) then we must emit something else, as that would be interpreted as the end of a list. As it is an empty span, the actual value doesn't matter, it just has to be != 0.

  This can happen if the very first span in a compile unit is an empty span, in which case relative to the base of the compile unit we would have (0, 0).
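The special case described above can be sketched as follows. The replacement value `(1, 1)` is a hypothetical choice made for this illustration (the entry only requires it to be nonzero), and `relativize_span` is not a real Binaryen function.

```python
END_OF_LIST = (0, 0)  # in .debug_loc a (0, 0) pair terminates a location list


def relativize_span(start, end, base):
    """Make a span relative to the compile unit base, avoiding (0, 0)."""
    span = (start - base, end - base)
    if span == END_OF_LIST:
        # An empty span at the very start of the unit: emit some other empty
        # span so readers do not mistake it for the list terminator.
        return (1, 1)  # hypothetical placeholder value
    return span


print(relativize_span(0x100, 0x100, 0x100))
print(relativize_span(0x110, 0x120, 0x100))
```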
* DWARF: Always update .debug_loc base offsets (#2936) | Alon Zakai | 2020-06-30 | 3 | -268/+268
  .debug_loc entries can have bases: a value that all values after it in the list are relative to. Previously we used to keep the base value as it was, to keep things as similar to the original DWARF as possible. However, if optimizations move code around so that the values after the base are before the base, then the values could no longer be emitted, and we skipped them in effect.

  This PR makes us always pick a new base for each list. This allows the base to always work for the values after it, but does mean we change the lists quite a lot more. If there is any extra meaning to the original bases here we may lose that, but the DWARF spec doesn't seem to indicate anything like that (however, it isn't clear to me why LLVM then doesn't always choose the maximal base as the code here does - LLVM's values seem oddly arbitrary).

  Also properly note the base of each compile unit. Previously we just noted the old value, but didn't look at the new one in the new binary being written.
* DWARF: Track sequences so that we can handle reordering within one (#2932) | Alon Zakai | 2020-06-25 | 1 | -644/+637
  Previously we tracked sequence ends, so if an instruction was marked as the end, we'd keep marking it that way in the output. However, if X, Y, Z form a sequence that is then reordered into Z, Y, X then we need to emit the end on X now.

  To do that, give a "sequence number" to each debug line. Then when emitting, we can tell if two adjacent lines are in a sequence or not, and emit the end properly.

  This fixes a large partner testcase, allowing llvm-dwarfdump --verify --debug-line to pass on it.

  With this change it is easier to remove the hackish handling of prologueEnd that we had before, where we reset it. Instead, just emit it when it is set, and that's all. In particular we can get rid of the "// Reset the state" and resetAfterLine() calls in emitDiff. That function now just emits a diff, with no side effects, and is marked const.

  This refactoring moves the needToEmit() check to an earlier place. Instead of noting lines we'll never emit, don't even note them at all.

  The test diff seems large, but it is all due to one small change that then changes all the later offsets:

    - 0x00000831: 01 DW_LNS_copy
    - 0x000000000000086e 43 4 1 0 0 is_stmt
    + 0x00000831: 00 DW_LNE_end_sequence
    + 0x000000000000086e 43 4 1 0 0 is_stmt end_sequence

  Note how we add end_sequence there. We used to have an entry right after it with line 0 that was marked as the end of the sequence. In the new code, we don't emit that unnecessary line (which was previously only emitted for the end sequence!) and instead emit the end sequence on the last valid line.
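The sequence-number idea above can be sketched in a few lines. This is a simplified model, not the actual emitter: `DebugLine` and `mark_sequence_ends` are hypothetical names, and real line-table entries carry many more fields.

```python
from dataclasses import dataclass


@dataclass
class DebugLine:
    addr: int
    seq: int  # sequence number assigned when the line was read


def mark_sequence_ends(lines):
    """Return (addr, is_end) pairs for lines in their final output order.

    A line ends its sequence when the next emitted line belongs to a
    different sequence, or when it is the last line overall.
    """
    result = []
    for i, line in enumerate(lines):
        is_last = i + 1 == len(lines)
        is_end = is_last or lines[i + 1].seq != line.seq
        result.append((line.addr, is_end))
    return result


# X, Y, Z (sequence 1) reordered into Z, Y, X, followed by another sequence.
x, y, z = DebugLine(0x10, 1), DebugLine(0x20, 1), DebugLine(0x30, 1)
other = DebugLine(0x40, 2)
print(mark_sequence_ends([z, y, x, other]))
```

After reordering, the end marker lands on X rather than on Z, which is the behavior the entry asks for.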
* DWARF: Fix sequence_end emitting (#2929) | Alon Zakai | 2020-06-24 | 1 | -636/+641
  We must emit those, even if otherwise it looks like a line we can omit, as the ends of sequences have important meaning and dwarfdump will warn without them.

  Looks like fannkuch0 in the test suite already had an example of an incorrectly-omitted sequence_end, so no need for a new testcase.

  Verified that without this e.g. wasm2.test_exceptions with -g added will lead to a wasm that warns, but with this PR the debug_line section is reported as valid by dwarfdump.
* wasm2js: Avoid 64-bit scratch memory helpers in wasm-intrinsics (#2926) | Alon Zakai | 2020-06-23 | 1 | -11/+8
  That code originally used memory location 1024 to save 64 bits of data (as that is what rust does apparently). We refactored it manually to instead use a scratch memory helper, which is safer. However, that 64-bit function ends up legalized, which actually changes the interface between the module and the outside, which is confusing and causes problems with optimizations that can remove the getTempRet0 imports, see emscripten-core/emscripten#11456

  Instead, just use a global i64 to stash those bits. This requires adding support for copying globals from the intrinsics module, but otherwise seems simpler overall.
* Asyncify liveness analysis (#2890) | Alon Zakai | 2020-06-23 | 9 | -842/+1391
  This finds out which locals are live at call sites that might pause/resume, which is the set of locals we need to actually save/load. That is, if a local is not alive at any call site in the function, then its value doesn't need to stay alive while sleeping.

  This saves about 10% of locals that are saved/loaded, and about 1.5% in final code size.
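The idea of saving only locals that are live at pausing call sites can be sketched on straight-line code. This is a deliberately reduced model: the instruction tuples and `locals_to_save` are hypothetical, and a real analysis also has to handle branches and loops.

```python
def locals_to_save(body):
    """Backward liveness over straight-line code.

    body is a list of (op, local) tuples where op is "get", "set", or "call"
    (a call that might pause; its local field is ignored). Returns the locals
    live at one or more calls, i.e. those that must be saved and reloaded.
    """
    live = set()
    needed = set()
    for op, local in reversed(body):
        if op == "get":
            live.add(local)  # a later read makes the local live before it
        elif op == "set":
            live.discard(local)  # a write kills the previous value
        elif op == "call":
            needed |= live  # everything live across the call must be saved
    return needed


body = [
    ("set", "a"),
    ("set", "b"),
    ("get", "a"),  # a is last used before the call
    ("call", None),
    ("get", "b"),  # b is read after the call, so it is live across it
    ("set", "c"),
    ("get", "c"),  # c is only defined after the call
]
print(locals_to_save(body))
```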
* More optimizations for pow of two and pos/neg one const on the right (#2870) | Max Graey | 2020-06-22 | 3 | -113/+511
* wasm2js: Bulk memory support (#2923) | Alon Zakai | 2020-06-22 | 1 | -0/+13
  Adds special helper functions for data.drop etc., as unlike most wasm instructions these are too big to emit inline.

  Track passive segments at runtime in var memorySegments whose indexes are the segment indexes.

  Emit var bufferView if the memory exists, even without memory segments, as we do still need the view in order to operate on it.

  Also adds a few constants for atomics that will be useful in future PRs (as this PR updates the constant lists anyhow).
* Asyncify: Instrument indirect calls from functions in add-list or only-list (#2913) | Alon Zakai | 2020-06-17 | 2 | -0/+213
  When doing manual tuning of calls using asyncify lists, we want it to be possible to write out all the functions that can be on the stack when pausing, and for that to work. This did not quite work right with the ignore-indirect option: that would ignore all indirect calls all the time, so that if foo() calls bar() indirectly, that indirect call was not instrumented (we didn't check for a pause around it), even if both foo() and bar() were listed. There was no way to make that work (except for not ignoring indirect calls at all).

  This PR makes the add-list and only-list fully instrument the functions mentioned in them: both themselves, and indirect calls from them. (Note that direct calls need no special handling - we can just add the direct call target to the add-list or only-list.)

  This may add some overhead to existing users, but only in a function that is instrumented anyhow, and also indirect calls are slow anyhow, so it's probably fine. And it is simpler to do it this way instead of adding another list for indirect call handling.
* Asyncify: Add an "add list", rename old lists (#2910) | Alon Zakai | 2020-06-12 | 6 | -1/+176
  Asyncify does a whole-program analysis to figure out the list of functions to instrument. In emscripten-core/emscripten#10746 (comment) we realized that we need another type of list there, an "add list" which is a list of functions to add to the instrumented functions list, that is, that we should definitely instrument.

  The use case in that link is that we disable indirect calls, but there is one special indirect call that we do need to instrument. Being able to add just that one can be much more efficient than assuming all indirect calls in a big codebase need instrumentation. Similar issues can come up if we add a profile-guided option to asyncify, which we've discussed.

  The existing lists were not good enough to allow that, so a new option is needed. I took the opportunity to rename the old ones to something better and more consistent, so after this PR we have 3 lists as follows:

    * The old "remove list" (previously "blacklist") which removes functions from the list of functions to be instrumented.
    * The new "add list" which adds to that list (note how add/remove are clearly parallel).
    * The old "only list" (previously "whitelist") which simply replaces the entire list, and so only those functions are instrumented and no other.

  This PR temporarily still supports the old names in the commandline arguments, to avoid immediate breakage for our CI.
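How the three lists could combine with the analysis result is sketched below. The function `apply_lists` is hypothetical; in particular, applying the remove list before the add list, and letting a non-empty only list override everything, are assumptions made for this illustration rather than documented behavior.

```python
def apply_lists(analyzed, remove_list=(), add_list=(), only_list=()):
    """Compute the final set of functions to instrument.

    analyzed is the set found by the whole-program analysis.
    """
    if only_list:
        # The only list replaces the entire set.
        return set(only_list)
    # Otherwise start from the analysis, drop removals, then force additions.
    return (set(analyzed) - set(remove_list)) | set(add_list)


analysis = {"main", "sleep_caller", "helper"}
print(sorted(apply_lists(analysis, remove_list=["helper"])))
print(sorted(apply_lists(analysis, add_list=["special_indirect_target"])))
print(sorted(apply_lists(analysis, only_list=["main"])))
```

The second call mirrors the use case in the entry above: indirect calls are ignored in general, but one known target is added back.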
* Rename anyref to externref to match proposal change (#2900) | Jay Phelps | 2020-06-10 | 15 | -59/+59
  anyref future semantics were changed to only represent opaque host values, and thus renamed to externref.

  [Chromium](https://bugs.chromium.org/p/v8/issues/detail?id=7748#c360) was just updated today (not yet released). I couldn't find a Mozilla bugzilla ticket mentioning externref so I don't immediately know if they've updated yet.

  https://github.com/WebAssembly/reference-types/pull/87
* Prevent calls from sinking into 'try' (#2899) | Heejin Ahn | 2020-06-07 | 2 | -0/+70
  Expressions that may throw cannot be sunk into 'try'. At the start of 'try', we drop all sinkables that may throw.
* DeNaN improvements (#2888) | Alon Zakai | 2020-06-03 | 2 | -5/+145
  Instead of instrumenting every local.get, instrument parameters on arrival at a function once on entry. After that, every local will always contain a de-naned value (since we would denan on a local.set). This is more efficient and also less confusing I think.

  Also avoid doing anything to values that fall through as they have already been fixed up.
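The invariant described above (parameters fixed once on entry, every later write fixed on `local.set`, reads left alone) can be modeled as follows. This is a sketch: replacing NaN with `0.0` is an assumption about what "de-NaN" substitutes, and the `Locals` class is purely illustrative.

```python
import math


def denan(value):
    """Replace NaN with a fixed non-NaN value (assumed here to be 0.0)."""
    return 0.0 if math.isnan(value) else value


class Locals:
    """Locals of one function call under the improved instrumentation."""

    def __init__(self, params):
        # Parameters are de-NaN'ed once, on arrival at the function.
        self.values = [denan(p) for p in params]

    def set(self, index, value):
        # Every write is de-NaN'ed, so locals never hold a NaN.
        self.values[index] = denan(value)

    def get(self, index):
        # Reads need no instrumentation: the stored value is already clean.
        return self.values[index]


frame = Locals([float("nan"), 1.5])
frame.set(1, float("nan"))
print(frame.get(0), frame.get(1))
```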
* Prevent pops from sinking in SimplifyLocals (#2885) | Heejin Ahn | 2020-06-03 | 2 | -0/+81
  This prevents `exnref.pop`s from being sunk and separated from `catch`. For example,

  ```wast
  (try
    (do)
    (catch
      (local.set $0 (exnref.pop))
      (call $foo
        (i32.const 3)
        (local.get $0)
      )
    )
  )
  ```

  Here, if we sink `exnref.pop` to remove `local.set $0` and `local.get $0`, it becomes this:

  ```wast
  (try
    (do)
    (catch
      (nop)
      (call $foo
        (i32.const 3)
        (exnref.pop)
      )
    )
  )
  ```

  This move was possible because `i32.const 3` does not have any side effects. But this is incorrect because now `exnref.pop` does not follow right after `catch`.

  To prevent this, this patch checks this case in `canSink` in SimplifyLocals. When we encountered a similar case in CodeFolding, we prevented every expression that contains `Pop` anywhere in it from being moved, which was too conservative. This adds a `danglingPop` property in `EffectAnalyzer`, so that only pops that are not enclosed within a `catch` count as "dangling pops" and we only prevent those pops from being moved or sunk.
* Prevent calls from escaping try in CodeFolding (#2883) | Heejin Ahn | 2020-06-01 | 2 | -4/+106
  In CodeFolding, we should not take an expression that may throw out of a `try` scope. This patch adds this restriction in `canMove`.
* DeNaN pass (#2877) | Alon Zakai | 2020-05-27 | 4 | -1441/+54
  This moves the fuzzer de-NaN logic out into a separate pass. This is cleaner and also better since the old way would de-NaN once, but then the reducer could generate code with NaNs. The new way lets us de-NaN while reducing.
* Fix DWARF location list updating with nonzero compilation unit base addr (#2862) | Paolo Severini | 2020-05-27 | 3 | -0/+628
  In the .debug_loc section the Start/End address offsets in a location list are relative to the address of the compilation unit that refers to that location list.

  There is a problem in function wasm::Debug::updateLoc(), which compares these offsets with the actual module addresses of expressions and functions, causing the generation of invalid location lists.

  The fix is not trivial, because the DWARF debug_loc section does not specify which compilation unit is associated with each location list entry. A simple workaround is to store, in LocationUpdater, a map of location list offsets to the base address of the compilation units referencing them, and that can be easily calculated in updateDIE().
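The workaround above amounts to a lookup from location-list offset to compilation-unit base address. The sketch below illustrates that idea only; the dictionary, the function name `absolute_span`, and the sample numbers are hypothetical and do not reflect the real LocationUpdater members.

```python
def absolute_span(loc_list_offset, start, end, base_for_loc_list):
    """Convert a CU-relative (start, end) pair to absolute module addresses.

    base_for_loc_list maps a location list's offset in .debug_loc to the base
    address of the compilation unit that references it.
    """
    base = base_for_loc_list[loc_list_offset]
    return (base + start, base + end)


# Hypothetical data: the list at offset 0x40 is referenced by a compile unit
# whose base address is 0x200, so its entries are relative to 0x200.
bases = {0x00: 0x0, 0x40: 0x200}
print(absolute_span(0x40, 0x10, 0x18, bases))
```

Comparing the absolute pair against expression and function addresses avoids the mismatch described in the entry, which arises when relative offsets are compared directly with module addresses.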
* Flatten fuzz fix with unreachable special-casing (#2876) | Alon Zakai | 2020-05-27 | 8 | -45/+235
  The special-casing of unreachable there could lead to bad behavior, where we did nothing to the unreachable and ended up moving something with side effects before it, see testcase in test/passes/flatten_all-features.wast.

  This emits less efficient code, but only if --dce was not run earlier, so probably not worth optimizing.
* Remove `Push` (#2867) | Thomas Lively | 2020-05-22 | 4 | -51/+3
  Push and Pop have been superseded by tuples for their original intended purpose of supporting multivalue. Pop is still used to represent block arguments for exception handling, but there are no plans to use Push for anything now or in the future.
* Add EH support for SimplifyLocals (#2858) | Heejin Ahn | 2020-05-19 | 2 | -0/+87
  - `br_on_exn`'s target block cannot be optimized to have a separate return value. This handles that in `SimplifyLocals`.
  - `br_on_exn` and `rethrow` can trap (when the arg is null). This handles that in `EffectAnalyzer`.
  - Fix a few nits