path: root/src
Commit message  Author  Age  Files  Lines
...
* Revert "Fix renaming in FixInvokeFunctionNamesWalker (#2513)" (#2541)  Sam Clegg  2019-12-19  1  -13/+8
| This reverts commit f0a2e2c75c7bb3008f10b6edbb8dc4cfd27b7d28.
* DWARF parsing and writing support using LLVM (#2520)  Alon Zakai  2019-12-19  7  -0/+256
| This imports LLVM code for DWARF handling. That code has the Apache 2 license like us. It's also the same code used to emit DWARF in the common toolchain, so it seems like a safe choice.
|
| This adds two passes: --dwarfdump, which runs the same code LLVM runs for llvm-dwarfdump. This shows we can parse it ok, and will be useful for debugging. And --dwarfupdate, which writes out the DWARF sections (unchanged from what we read, so it just roundtrips - for updating we need #2515).
|
| This puts LLVM in thirdparty, which is added here. All the LLVM code is behind USE_LLVM_DWARF, which is on by default, but off in JS for now, as it increases code size by 20%.
|
| This current approach imports the LLVM files directly. This is not how they are intended to be used, so it required a bunch of local changes - more than I expected actually, for the platform-specific stuff. For now this seems to work, so it may be good enough, but in the long term we may want to switch to linking against libllvm. A downside to doing that is that binaryen users would need to have an LLVM build, and even in the waterfall builds we'd have a problem - while we ship LLVM there anyhow, we constantly update it, which means that binaryen would need to be on latest llvm all the time too (which otherwise, given DWARF is quite stable, we might not need to constantly update).
|
| An even larger issue is that as I did this work I learned about how DWARF works in LLVM, and while the reading code is easy to reuse, the writing code is trickier. The main code path is heavily integrated with the MC layer, which we don't have - we might want to create a "fake MC layer" for that, but it sounds hard. Instead, there is the YAML path which is used mostly for testing, and which can convert DWARF to and from YAML and from binary. Using the non-YAML parts there, we can convert binary DWARF to the YAML layer's nice Info data, then convert that to binary.
|
| This works; however, this is not the path LLVM uses normally, and it supports only some basic DWARF sections - I had to add ranges support, in fact. So if we need more complex things, we may end up needing to use the MC layer approach, or consider some other DWARF library. However, hopefully that should not affect the core binaryen code, which just calls a library for DWARF stuff.
|
| Helps #2400
* Fix trapping and dangling insts in memory packing (#2540)  Heejin Ahn  2019-12-19  1  -4/+14
| This does two things:
| - Restore the `visitDataDrop` handler deleted in #2529, but now convert invalid `data.drop`s to `nop` rather than `unreachable`. This conforms to the revised spec that `data.drop` on the active segment can be treated as a nop.
| - Make `visitMemoryInit` trap if offset or size are not equal to 0 or if the dest address is out of bounds. Otherwise drop all its arguments.
|
| Fixes #2535.
* SIMD {i8x16,i16x8}.avgr_u instructions (#2539)  Thomas Lively  2019-12-18  14  -1/+68
| As specified in https://github.com/WebAssembly/simd/pull/126.
* Correctly clear memory / table info in clearModule (#2536)  Heejin Ahn  2019-12-17  2  -2/+17
| Currently `ModuleUtils::clearModule` does not clear the `exists` flags in the memory and table, and running the RoundTrip pass on any module that has a memory or a table fails as a result. This creates a `clear` function in `Memory` and `Table` and makes `clearModule` call them.
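A minimal sketch of the idea, using hypothetical simplified stand-ins rather than Binaryen's real `Memory`, `Table`, and `Module` classes (the field names other than `exists` are assumptions made for illustration): each object gets a `clear` method that resets all of its state, and the module-level clear delegates to it.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; not Binaryen's actual definitions.
struct Memory {
  bool exists = false;
  uint32_t initial = 0;
  std::vector<std::string> segments;

  // Reset every field, including the `exists` flag.
  void clear() {
    exists = false;
    initial = 0;
    segments.clear();
  }
};

struct Table {
  bool exists = false;
  uint32_t initial = 0;

  void clear() {
    exists = false;
    initial = 0;
  }
};

struct Module {
  Memory memory;
  Table table;
};

// Clearing the module delegates to the per-object clear methods, so a
// later re-read of the module starts from a clean state.
inline void clearModule(Module& module) {
  module.memory.clear();
  module.table.clear();
}
```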
* Fix renaming in FixInvokeFunctionNamesWalker (#2513)  Sam Clegg  2019-12-17  1  -8/+13
| This fixes https://github.com/emscripten-core/emscripten/issues/9950.
|
| The issue only shows up when debug names are not present, so most of the changes in this CL come from disabling debug names in the lld tests. We want to make sure that wasm-emscripten-finalize runs fine without debug names, so I think it makes most sense to test in this mode.
|
| The actual bugfix is in wasm-emscripten.cpp as part of the FixInvokeFunctionNamesWalker. The problem was that the name of the function rather than its import name was being added to importRenames. This means that when debug names were present (and the two names were the same) we didn't see the bug.
* Implement 0-len/drop spec changes in bulk memory (#2529)  Heejin Ahn  2019-12-16  2  -19/+22
| This implements recent bulk memory spec changes (WebAssembly/bulk-memory-operations#126) in Binaryen. Now `data.drop` is equivalent to shrinking a segment size to 0, and dropping already dropped segments or active segments (which are thought to be dropped in the beginning) is treated as a no-op. And all bounds checking is performed in advance, so partial copying/filling/initializing does not occur.
|
| I tried to implement `visitDataDrop` in the interpreter as `segment.data.clear();`, which is exactly what the revised spec says. I didn't end up doing that because this also deletes all contents from active segments, and there are cases where we shouldn't do that:
| - `wasm-ctor-eval` shouldn't delete active segments, because it will store the changed contents back into segments
| - When `--fuzz-exec` is given to `wasm-opt`, it runs the module and compares the execution call results before and after transformations. But if running a module will nullify all active segments, applying any transformation to the module or re-running it does not make any sense.
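The two rules described above (drop as shrink-to-zero, and bounds checks done before any write) can be sketched as follows. This is an illustration only: `Segment`, `dataDrop`, and `memoryInit` are hypothetical names, a `false` return stands in for a trap, and none of this is the interpreter's actual code.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a passive data segment.
struct Segment {
  std::vector<uint8_t> data;
};

// data.drop: shrink the segment to size 0. Dropping an already dropped
// segment simply clears an empty vector, so it is a no-op.
inline void dataDrop(Segment& segment) { segment.data.clear(); }

// memory.init: returns false to model a trap. All bounds are checked up
// front, so a failing call never leaves a partial copy in memory.
inline bool memoryInit(std::vector<uint8_t>& memory,
                       const Segment& segment,
                       uint32_t dest,
                       uint32_t offset,
                       uint32_t size) {
  // Widen to 64 bits so the additions cannot wrap around.
  if (uint64_t(offset) + size > segment.data.size() ||
      uint64_t(dest) + size > memory.size()) {
    return false; // trap before writing anything
  }
  for (uint32_t i = 0; i < size; i++) {
    memory[dest + i] = segment.data[offset + i];
  }
  return true;
}
```

Under this model, a `memory.init` on a dropped segment succeeds only when both `offset` and `size` are 0, since the segment's size is then 0.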
* Improve RoundTrip pass: avoid copying (#2531)  Alon Zakai  2019-12-16  1  -5/+3
* Write wasm/wast files with BINARYEN_PASS_DEBUG=3 (#2527)  Heejin Ahn  2019-12-13  1  -3/+3
| Currently `BINARYEN_PASS_DEBUG=3` prints `.wasm` files but they are actually text wast files. This makes `BINARYEN_PASS_DEBUG=3` print both wasm and wast files, where the wasm is a binary file and the wast a text file.
* Remove redundant instructions in Flatten (#2524)  Heejin Ahn  2019-12-12  1  -17/+23
| When the expression type is none, it does not seem to be necessary to make it a prelude and insert a nop. This also results in unnecessary blocks that contain an expression with a nop, which can be reduced to just the expression. This also adds some newlines to improve readability.
* Support stack overflow checks in standalone mode (#2525)  Alon Zakai  2019-12-12  3  -5/+22
| In normal mode we call a JS import, but we can't import from JS in standalone mode. Instead, just trap in that case with an unreachable. (The error reporting is not as good in this case, but at least it catches all errors and halts, and the emitted wasm is valid for standalone mode.)
|
| Helps emscripten-core/emscripten#10019
* Make local.tee's type its local's type (#2511)  Heejin Ahn  2019-12-12  22  -57/+71
| According to the current spec, `local.tee`'s return type should be the same as its local's type. (Discussions on whether we should change this rule are going on in WebAssembly/reference-types#55, but here I will assume this spec does not change. If this changes, we should change many parts of Binaryen transformation anyway...)
|
| But currently in Binaryen `local.tee`'s type is computed from its value's type. This didn't make any difference in the MVP, but after we have the subtype relationship in #2451, this can become a problem. For example:
| ```
| (func $test (result funcref)
|   (local $0 anyref)
|   (local.tee $0
|     (ref.func $test)
|   )
| )
| ```
| This shouldn't validate in the spec, but this will pass Binaryen validation with the current `local.tee` implementation.
|
| This makes `local.tee`'s type computed from the local's type, and makes `LocalSet::makeTee` get a type parameter, to which we should pass its corresponding local's type. We don't embed the local type in the class `LocalSet` because it may increase memory size.
|
| This also fixes the type of `local.get` to be the local type where a `local.get` and `local.set` pair is created from `local.tee`.
* Remove FunctionType (#2510)  Thomas Lively  2019-12-11  62  -1477/+848
| Function signatures were previously redundantly stored on Function objects as well as on FunctionType objects. These two signature representations had to always be kept in sync, which was error-prone and needlessly complex. This PR takes advantage of the new ability of Type to represent multiple value types by consolidating function signatures as a pair of Types (params and results) stored on the Function object.
|
| Since there are no longer module-global named function types, significant changes had to be made to the printing and emitting of function types, as well as their parsing and manipulation in various passes. The C and JS APIs and their tests also had to be updated to remove named function types.
* Fix loop parent computation in DataFlow.Graph (#2522)  Heejin Ahn  2019-12-11  1  -0/+3
| This fixes the parent-child relationship computation in `DataFlow.Graph` when there is a loop. This wasn't discovered until now because this is used in Souperify, and Souperify only runs after the Flatten pass, which produces redundant blocks between the inside and outside of a loop.
* Add a RoundTrip pass (#2516)  Alon Zakai  2019-12-09  5  -2/+92
| This pass writes and reads the module. This shows the effects of converting to and back from the binary format, and will be useful in testing dwarf debug support (where we'll need to see that writing and reading a module preserves debug info properly).
* Fix comparison of none and unreachable types (#2514)  Heejin Ahn  2019-12-09  1  -2/+2
| Currently `none` and `unreachable` types are stored as the same empty `{}` in src/wasm/wasm-type.cpp. This makes `Type::operator<` behave incorrectly when given `none` and `unreachable`, because it expands both given types and lexicographically compares them, when both of the expanded vectors will be empty.
|
| This was found by the fuzzer. This line in `Modder::visitExpression` tries to retrieve candidates of the same type. Because we can't really compare these two types, if you give `unreachable` as the key, candidates of `none` type can be returned. This generates incorrect code that ends up failing in validation in a very weird way.
|
| It was hard to generate a small testcase to trigger this part because it was found by generating fuzzed code from a random data file. But I guess this fix is pretty straightforward.
|
| Fixes #2512.
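The failure mode can be reproduced in miniature. The sketch below is not the code in src/wasm/wasm-type.cpp: `BasicType`, `expand`, `lessByExpansion`, and `lessById` are hypothetical names, and ordering by identity is only one possible fix, assumed here for illustration.

```cpp
#include <vector>

// Hypothetical miniature of the type representation.
enum class BasicType { none, unreachable, i32, i64 };

// Both none and unreachable expand to an empty list of value types.
inline std::vector<BasicType> expand(BasicType type) {
  if (type == BasicType::none || type == BasicType::unreachable) {
    return {};
  }
  return {type};
}

// Problematic ordering: only the expansions are compared, so none and
// unreachable are equivalent (neither is less than the other).
inline bool lessByExpansion(BasicType a, BasicType b) {
  return expand(a) < expand(b);
}

// One possible fix (an assumption; the actual commit may differ):
// order by the type's own identity, which keeps the two types distinct.
inline bool lessById(BasicType a, BasicType b) {
  return static_cast<int>(a) < static_cast<int>(b);
}
```

An ordered container that uses the first comparator treats `none` and `unreachable` as the same key, which matches the symptom described above: a lookup with `unreachable` can return entries stored under `none`.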
* Use wat over wast for text format filenames (#2518)  Sam Clegg  2019-12-08  10  -17/+13
* Don't include `$` with names unless outputting to wat format (#2506)  Sam Clegg  2019-12-06  2  -20/+26
| The `$` is not actually part of the name; it's the marker that starts a name in the wat format. It can be confusing to see it show up when doing `cerr << name`, for example.
|
| This change has Print.cpp add the `$`, which seems like the right place to do this. Plus it revealed a bunch of places where we were not calling printName to escape all the names we were printing.
* Avoid errors in binaryen.js assertions builds, and enable ASSERTIONS in debug builds. (#2507)  Alon Zakai  2019-12-06  1  -0/+6
* Include in minification all imports from modules starting with `wasi_` (#2509)  Sam Clegg  2019-12-05  1  -3/+1
| This allows us to support not just wasi_unstable but also the new wasi_snapshot_preview1 and beyond.
|
| See https://github.com/emscripten-core/emscripten/pull/9956
* Add some tracing to wasm-emscripten-finalize (#2505)  Sam Clegg  2019-12-05  3  -9/+30
| Also fix a bug in splitting the names of the trace channels. Obviously I can't write string.split correctly in C first time around.
* Add string parameter to WASM_UNREACHABLE (#2499)  Sam Clegg  2019-12-05  56  -420/+450
| This works more like llvm's unreachable handler in that it preserves information even in release builds.
* Add BYN_ENABLE_ASSERTSION option to allow assertions to be disabled. (#2500)  Sam Clegg  2019-12-04  9  -6/+29
| We always enable assertions by default, but this option allows for a build without them.
|
| Fix all errors in the ASSERTIONS=OFF build. Even though we don't normally build this, it's good to keep it building.
* Fix metadce debug info after #2497 (#2501)  Sam Clegg  2019-12-04  1  -0/+1
| This line was mistakenly removed as part of the BYN_TRACE conversion.
* Remove 'none' type as a branch target in ReFinalize (#2492)  Alon Zakai  2019-12-04  13  -100/+21
| That was needed for the super-old wasm type system, where we allowed
|
|   (block $x
|     (br_if $x
|       (unreachable)
|       (nop)
|     )
|   )
|
| That is, we differentiated "taken" branches from "named" ones (just referred to by name, but not actually taken as it's in unreachable code).
|
| We don't need to differentiate those any more. Remove the ReFinalize code that considered it, and also remove the named/taken distinction in other places.
* cmake: Convert to using lowercase for and functions/macros (#2495)  Sam Clegg  2019-12-04  7  -14/+14
| This is in line with modern cmake conventions and is much less SHOUTY!
* Convert to using DEBUG macros (#2497)  Sam Clegg  2019-12-04  20  -647/+272
| This means that debugging/tracing can now be enabled and controlled centrally without managing and passing state around the codebase.
* Add BYN_DEBUG/BYN_TRACE macros similar to LLVM's debug system (#2496)  Sam Clegg  2019-12-04  5  -19/+127
| This allows for debug trace messages to be split by channel. So you can pass `--debug` to simply debug everything, or `--debug=opt` to only debug wasm-opt.
|
| This change is the initial introduction, but as a followup I hope to convert all tracing over to this new system so we can more easily control the debug output.
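A rough sketch of channel-based filtering, assuming a comma-separated list of channel names after `--debug=` (the separator is an assumption, not stated above). `DebugChannels`, `parse`, and `enabled` are hypothetical names for illustration and are not the BYN_DEBUG/BYN_TRACE macros themselves.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical sketch; not Binaryen's actual debug support code.
struct DebugChannels {
  bool all = false;            // plain `--debug`: every channel is on
  std::set<std::string> names; // `--debug=opt`: only the listed channels

  // flagGiven: whether --debug appeared at all; value: text after `=`.
  static DebugChannels parse(bool flagGiven, const std::string& value) {
    DebugChannels channels;
    if (!flagGiven) {
      return channels;
    }
    if (value.empty()) {
      channels.all = true;
      return channels;
    }
    std::stringstream stream(value);
    std::string name;
    while (std::getline(stream, name, ',')) {
      if (!name.empty()) {
        channels.names.insert(name);
      }
    }
    return channels;
  }

  bool enabled(const std::string& channel) const {
    return all || names.count(channel) > 0;
  }
};
```

A tracing macro built on this would check `enabled("opt")` before printing, so call sites do not need to carry a debug flag around.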
* Add Emscripten memory helpers for using the C-API (from Wasm) (#2476)  Daniel Wirtz  2019-12-03  1  -1/+55
| We already have exports for _malloc and _free in the Emscripten build, but there is no way yet to initialize the data without resorting to JS. Hence this PR adds a few additional memory helpers to the Emscripten build so it becomes possible to manipulate Binaryen memory without the need for extra glue code, for example when Binaryen is a WebAssembly import, and one is allocating strings to be used by / reading strings returned by Binaryen.
|
| I expect this to be a bit controversial because the use case is relatively specific, but it makes sense for us because we are consuming the C-API directly (from JS and eventually Wasm) and don't rely on binaryen.js-post.js.
* Refactor removing module elements (#2489)  Heejin Ahn  2019-12-02  5  -102/+70
| This creates utility functions for removing module elements: removing one element by name, and removing multiple elements using a predicate function. And makes other parts of the code use them. I think this is a more light-handed approach than calling `Module::updateMaps` after removing only a part of the module elements.
|
| This also fixes a bug in the inlining pass: it didn't call `Module::updateMaps` after removing functions. After this patch callers don't need to additionally call it anyway.
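The shape of such utilities can be sketched as below. The `Module` and `Function` structs here are hypothetical simplifications, and the method names are illustrative rather than Binaryen's exact API. The point is that removal updates the element list and the name map together, so no separate map rebuild is needed afterwards.

```cpp
#include <algorithm>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified module; not Binaryen's actual classes.
struct Function {
  std::string name;
};

struct Module {
  std::vector<std::unique_ptr<Function>> functions;
  std::map<std::string, Function*> functionsMap;

  void addFunction(const std::string& name) {
    functions.push_back(std::unique_ptr<Function>(new Function{name}));
    functionsMap[name] = functions.back().get();
  }

  // Remove every function matching the predicate, keeping the name map
  // in sync with the list.
  void removeFunctions(const std::function<bool(const Function&)>& pred) {
    for (auto& func : functions) {
      if (pred(*func)) {
        functionsMap.erase(func->name);
      }
    }
    functions.erase(
      std::remove_if(functions.begin(),
                     functions.end(),
                     [&](const std::unique_ptr<Function>& func) {
                       return pred(*func);
                     }),
      functions.end());
  }

  // Removing one element by name is the predicate version with a
  // name-equality test.
  void removeFunction(const std::string& name) {
    removeFunctions([&](const Function& func) { return func.name == name; });
  }
};
```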
* Update spec test suite (#2484)  Heejin Ahn  2019-11-29  3  -4/+7
| This updates the spec test suite to the current up-to-date version of the https://github.com/WebAssembly/spec repo.
| - All failing tests are added to `BLACKLIST` in shared.py with reasons.
| - For tests that already existed and were passing and started failing after the update, we add the new test to the blacklist and preserve the old file by renaming it to 'old_[FILENAME].wast' so as not to lose test coverage. When the cause of the error is fixed or the unsupported construct gets support so the new test passes, we can delete the corresponding 'old_[FILENAME].wast' file.
| - Adds support for `spectest.print_[type]` style imports.
* Collect all object files from the object libraries in a CMake variable (#2477)  Immanuel Haffner  2019-11-26  7  -8/+7
| using the `$<TARGET_OBJECTS:objlib>` syntax. Use this variable when adding `libbinaryen` as static or shared library. Additionally, use the variable with the object files to simplify the `TARGET_LINK_LIBRARIES` commands: add the object libraries to the sources of executables and drop the use of our libraries in `TARGET_LINK_LIBRARIES`. (Object libraries cannot be linked but must be used as sources. See https://cmake.org/pipermail/cmake/2018-June/067721.html)
* Refactor and optimize binary writing type collection (#2478)  Alon Zakai  2019-11-26  4  -81/+105
| Create a new ParallelFunctionAnalysis helper, which lets us run in parallel on all functions and collect info from them, without manually handling locks etc.
|
| Use that in the binary writing code's type collection logic, avoiding a lock for each type increment.
|
| Also add Signature printing which was useful to debug this.
* Update type information for em_asm functions (#2480)  Thomas Lively  2019-11-26  1  -1/+3
| We were only updating the imported Function's type name field and failing to update its params and results. This caused the binary writer to start using the wrong types after #2466. This PR fixes the code to update both type representations on the imported function. This double bookkeeping will be removed entirely in an upcoming PR.
* Use opaque types for handle references in C API (#2473)  Ingvar Stepanyan  2019-11-26  2  -15/+34
| This improves typechecking by verifying that the user passes pointers of the correct types.
* Print only literal values when printing literals (#2469)  Heejin Ahn  2019-11-26  3  -4/+6
| The current `<<` operator on `Literal` prints `[type].const` with it. But `[type].const` is an instruction rather than a literal itself, and printing it with the literals makes less sense when we later have literals whose types don't have `const` instructions (such as reference types).
|
| This patch
| - Makes the `<<` operator on `Literal` print only its value
| - Makes wasm-shell's shell interface comply with the spec interpreter's printing format (`value : type`).
| - Prints wasm-shell's `[trap]` message to stderr
|
| These make all `fix_` routines for spec tests in check.py unnecessary.
* Revert "Build libbinaryen as a monolithic statically/shared library (#2463)" (#2474)  Alon Zakai  2019-11-25  7  -7/+7
| This reverts commit bf8f36c31c0b8e6213bce840be66937dd6d0f6af.
* Remove FunctionType from Event (#2466)  Thomas Lively  2019-11-25  22  -223/+252
| This is the start of a larger refactoring to remove FunctionType entirely and store types and signatures directly on the entities that use them. This PR updates BrOnExn and Events to remove their use of FunctionType and makes the BinaryWriter traverse the module and collect types rather than using the global FunctionType list. While we are collecting types, we also sort them by frequency as an optimization. Remaining uses of FunctionType in Function, CallIndirect, and parsing will be removed in a future PR.
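The frequency sort mentioned above can be sketched as follows, with signatures represented as plain strings purely for illustration. `sortByFrequency` is a hypothetical helper, not the BinaryWriter's code. The assumed motivation, not stated in the commit message, is that frequently used types receive small indices, which take fewer bytes in a variable-length encoding.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: given one entry per use of a signature, return the
// distinct signatures ordered from most to least frequently used.
inline std::vector<std::string>
sortByFrequency(const std::vector<std::string>& uses) {
  // Count how many times each signature is used.
  std::map<std::string, size_t> counts;
  for (const auto& signature : uses) {
    counts[signature]++;
  }
  // Sort by descending count; stable_sort keeps ties in map (name) order,
  // which makes the result deterministic.
  std::vector<std::pair<std::string, size_t>> sorted(counts.begin(),
                                                     counts.end());
  std::stable_sort(sorted.begin(),
                   sorted.end(),
                   [](const std::pair<std::string, size_t>& a,
                      const std::pair<std::string, size_t>& b) {
                     return a.second > b.second;
                   });
  // The position in the result is the index the type would be given.
  std::vector<std::string> result;
  for (const auto& entry : sorted) {
    result.push_back(entry.first);
  }
  return result;
}
```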
* Build libbinaryen as a monolithic statically/shared library (#2463)  Immanuel Haffner  2019-11-22  7  -7/+7
| * Transform libraries created in subdirectories from statically linked libraries to CMake object libraries.
| * Link object libraries as `PRIVATE` to `libbinaryen`.
|
| According to CMake documentation: "Libraries and targets following PRIVATE are linked to, but are not made part of the link interface."
|
| This is exactly what we want, as we only want the C API to be part of the interface.
* Multivalue type creation and inspection (#2459)  Thomas Lively  2019-11-22  43  -265/+451
| Adds the ability to create multivalue types from vectors of concrete value types. All types are transparently interned, so their representation is still a single uint32_t. Types can be extracted into vectors of their component parts, and all the single value types expand into vectors containing themselves. Multivalue types are not yet used in the IR, but their creation and inspection functionality is exposed and tested in the C and JS APIs.
|
| Also makes common type predicates methods of Type and improves the ergonomics of type printing.
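Interning in this sense can be illustrated with a small table that maps each distinct vector of value types to a stable 32-bit id. `ValueType` and `TypeStore` below are hypothetical names; this is a sketch of the general technique, not Binaryen's `Type` implementation (which, among other things, would also need to be thread-safe).

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical value types for the sketch.
enum class ValueType { i32, i64, f32, f64 };

class TypeStore {
  std::vector<std::vector<ValueType>> lists;      // id -> component types
  std::map<std::vector<ValueType>, uint32_t> ids; // component types -> id

public:
  // Return the id for this list of value types, creating it on first use.
  // Equal lists always receive the same id, so comparing two types is
  // just comparing two integers.
  uint32_t intern(const std::vector<ValueType>& parts) {
    auto it = ids.find(parts);
    if (it != ids.end()) {
      return it->second;
    }
    uint32_t id = static_cast<uint32_t>(lists.size());
    lists.push_back(parts);
    ids[parts] = id;
    return id;
  }

  // Expand an id back into its component value types.
  const std::vector<ValueType>& expand(uint32_t id) const {
    return lists.at(id);
  }
};
```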
* Add a pass to inline __original_main() into main() (#2461)  Alon Zakai  2019-11-21  4  -3/+44
| clang/llvm introduce __original_main as a workaround for the fact that main may have different signatures. A downside to that is that users get it in stack traces, which is confusing. In -O2 and above we normally inline __original_main anyhow, but as this is for debugging, non-optimized builds matter too, so add a pass for this.
|
| The implementation is trivial, just call doInlining. However we must check some corner cases first.
|
| Bonus minor fixes to FindAllPointers, which unnecessarily created an object to get the class Id (which is not valid for all classes), and which didn't take the input by reference properly, which meant we couldn't get the pointer to the function body's toplevel.
* Add a --strip-dwarf pass (#2454)  Alon Zakai  2019-11-19  4  -6/+16
| This pass strips DWARF debug sections, but not other debug sections. This is useful when emitting source maps, as we do need the SourceMapURL section, but the DWARF sections are no longer necessary (and we've seen a testcase where they are massively large, so big the wasm can't even be loaded in a browser...).
|
| Also contains a trivial one-line fix in --extract-function which was necessary to create the testcase here: that pass extracts a function from a wasm file (like llvm-extract) but it didn't check if an export already existed for the function.
* Add PostAssemblyScript pass (#2407)  Daniel Wirtz  2019-11-19  5  -0/+649
| Adds the AssemblyScript-specific passes post-assemblyscript and post-assemblyscript-finalize, eliminating redundant ARC-style retain/release patterns conservatively emitted by the compiler.
* Optimize away invoke_ calls where possible (#2442)  Alon Zakai  2019-11-19  1  -0/+81
| When we see invoke_ calls in emscripten-generated code, we know they call into JS just to do a try-catch for exceptions. If the target being called cannot throw, which we check in a whole-program manner, then we can simply skip the invoke.
|
| I confirmed that this fixes the regression in emscripten-core/emscripten#9817 (comment) (that is, with this optimization, upstream is around as fast as fastcomp).
|
| When we have native wasm exception handling, this can be extended to optimize that as well.
* Refactor a CallGraphPropertyAnalysis helper [NFC] (#2441)  Alon Zakai  2019-11-18  2  -61/+103
| This moves code out of Asyncify into a general helper class. The class automates scanning the functions for a property, then propagating it to functions that call them.
|
| In Asyncify, the property is "may call something that leads to sleep", and we propagate backwards to callers, to find all those that may sleep.
|
| This will be useful in a future exceptions-optimizing pass I want to write, where the property will be "may throw". We will then be able to remove exceptions overhead in cases that definitely do not throw.
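The backwards propagation described above amounts to a worklist traversal over the inverted call graph. The sketch below uses a hypothetical `CallGraph` alias and `propagateToCallers` function with function names as strings. It illustrates the general technique and is not the helper class's actual interface.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical representation: caller name -> names of functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Given the functions that have a property directly (for example "may
// sleep" or "may throw"), return every function that has it directly or
// through some chain of calls.
inline std::set<std::string>
propagateToCallers(const CallGraph& calls,
                   const std::set<std::string>& initial) {
  // Invert the graph: callee -> callers.
  std::map<std::string, std::vector<std::string>> callers;
  for (const auto& entry : calls) {
    for (const auto& callee : entry.second) {
      callers[callee].push_back(entry.first);
    }
  }
  // Worklist: each function is processed once, when it first gains the
  // property.
  std::set<std::string> result = initial;
  std::vector<std::string> work(initial.begin(), initial.end());
  while (!work.empty()) {
    std::string func = work.back();
    work.pop_back();
    auto it = callers.find(func);
    if (it == callers.end()) {
      continue;
    }
    for (const auto& caller : it->second) {
      if (result.insert(caller).second) {
        work.push_back(caller);
      }
    }
  }
  return result;
}
```

Functions absent from the result definitely lack the property under this model, which is what would let an optimization skip overhead for them.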
* Fix #2430 properly (#2449)  Alon Zakai  2019-11-18  1  -10/+10
* Warning improvements (#2438)  Alon Zakai  2019-11-15  2  -4/+17
| If wasm-opt is run with no passes, warn, as we've gotten reports that people assume a tool called "wasm-opt" should optimize automatically (but we follow llvm's opt convention of not doing so).
|
| Add a --quiet (-q) flag that suppresses this minor warning, and the other minor warning where there is no output file.
* Reuse BINARYEN_API for Emscripten builds (#2435)  Daniel Wirtz  2019-11-13  1  -1/+4
| This is an alternative to #2361 in that it only implements reusing BINARYEN_API so we don't have to list all the functions in build-js.sh. Differs in that it keeps the sh file relatively straightforward without going overboard with bash functionality.
|
| Also adds various quotes in case of whitespace in paths and makes it so that *.sh files always use LF line endings to ease Windows support. For instance, I am pulling the repository in Windows but compile in WSL, which, if Git isn't properly configured to check out line endings as-is, would otherwise break the sh files.
|
| Fixes #2361.
* [NFC] Make Type a class instead of enum (#2433)  Thomas Lively  2019-11-13  2  -14/+52
| The plan is to extend `Type` to represent arbitrary multivalue types, and as a prerequisite for that it is necessary to make it a class instead of an enum. This PR bends over backwards to add all the automatic conversions and constants necessary to allow the rest of the code to compile unmodified, but in the future it should be possible to standardize usage across the code base and remove some of these utilities.
* uint32_t instead of int64_t as return type for GetMemorySegmentByteOffset (#2432)  COFFEETALES  2019-11-12  2  -3/+3
| `uint32_t` instead of `int64_t` as return type for `GetMemorySegmentByteOffset` and minor fixes on tests.