path: root/test/lld
Commit message | Author | Age | Files | Lines
* Use empty blocks instead of nops for empty scopes in IRBuilder (#7080)Thomas Lively2024-11-1420-27/+0
| | | | | | | | | | When IRBuilder builds an empty non-block scope such as a function body, an if arm, a try block, etc, it needs to produce some expression to represent the empty contents. Previously it produced a nop, but change it to produce an empty block instead. The binary writer and printer have special logic to elide empty blocks, so this produces smaller output. Update J2CLOpts to recognize functions containing empty blocks as trivial to avoid regressing one of its tests.
* Don't strip target features in wasm-emscripten-finalize (#7043)Derek Schuff2024-10-301-0/+1
| | | | | This makes the behavior consistent with emcc builds where we don't run finalization, and potentially makes testing and debugging easier. Emscripten still strips the target features section when optimizing.
* Make source parser consistent with binary parser when naming things. NFC (#6813)Sam Clegg2024-08-062-6/+6
| | | | | The `timport$` prefix is already used for tables, so the binary parser currently uses `eimport$` to name tags (I guess because they are normally exception tags?).
* [Parser] Enable the new text parser by default (#6371)Thomas Lively2024-04-258-16/+16
| The new text parser is faster and more standards compliant than the old text parser. Enable it by default in wasm-opt and update the tests to reflect the slightly different results it produces.
|
| Besides following the spec, the new parser differs from the old parser in that it:
|
| - Does not synthesize `loop` and `try` labels unnecessarily
| - Synthesizes different block names in some cases
| - Parses exports in a different order
| - Parses `nop`s instead of empty blocks for empty control flow arms
| - Does not support parsing Poppy IR
| - Produces different error messages
| - Cannot parse `pop` except as the first instruction inside a `catch`
* Do not repeat types names in text output (#6499)Thomas Lively2024-04-161-8/+8
| | | | | | | | | | For types that do not have explicit names, we generate index-based names in the printer. However, we did not previously ensure that the generated types were not already used as explicit names, so it was possible to print the same name for multiple types, which is not valid. Fix the problem by skipping indices that are already used as type names. Fixes #6492.
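The index-skipping fix described above can be sketched roughly as follows. This is a minimal illustration in Python rather than Binaryen's C++; the function name `generate_type_names` and the bare-index name format are assumptions made for the example, not the printer's actual API.

```python
def generate_type_names(explicit_names, unnamed_count):
    """Assign index-based names to unnamed types, skipping taken names.

    `explicit_names` holds names that types already carry explicitly;
    `unnamed_count` is how many types still need a generated name.
    Both the signature and the name format are hypothetical.
    """
    used = set(explicit_names)
    generated = []
    index = 0
    while len(generated) < unnamed_count:
        candidate = str(index)
        index += 1
        if candidate in used:
            # This index is already an explicit type name, so skip it.
            continue
        used.add(candidate)
        generated.append(candidate)
    return generated
```

With explicit names `0` and `2` already in use, three unnamed types would receive `1`, `3` and `4`, so no name is printed twice.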
* Require `then` and `else` with `if` (#6201)Thomas Lively2024-01-044-22/+44
| | | | | | | | | | | | We previously supported (and primarily used) a non-standard text format for conditionals in which the condition, if-true expression, and if-false expression were all simply s-expression children of the `if` expression. The standard text format, however, requires the use of `then` and `else` forms to introduce the if-true and if-false arms of the conditional. Update the legacy text parser to require the standard format and update all tests to match. Update the printer to print the standard format as well. The .wast and .wat test inputs were mechanically updated with this script: https://gist.github.com/tlively/85ae7f01f92f772241ec994c840ccbb1
* Use the standard shared memory text format (#6200)Thomas Lively2024-01-031-1/+1
| | | | | Update the legacy text parser and all tests to use the standard text format for shared memories, e.g. `(memory $m 1 1 shared)` rather than `(memory $m (shared 1 1))`. Also remove support for non-standard in-line "data" or "segment" declarations. This change makes the tests more compatible with the new text parser, which only supports the standard format.
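As a rough illustration of the syntax change above, the rewrite from the legacy shared-memory form to the standard one could look like this in Python. The helper name and the regular expression are assumptions for the sketch; the commit does not say how the tests were actually updated, and the pattern only covers the simple named, two-limit form quoted in the message.

```python
import re

# Matches the legacy form `(memory $name (shared MIN MAX))`.
# Hypothetical pattern: it handles only a named memory with both limits.
_LEGACY_SHARED_MEMORY = re.compile(
    r"\(memory\s+(\$\S+)\s+\(shared\s+(\d+)\s+(\d+)\)\)"
)


def to_standard_shared_memory(text):
    """Rewrite legacy shared memory declarations to the standard format."""
    return _LEGACY_SHARED_MEMORY.sub(r"(memory \1 \2 \3 shared)", text)
```

Text that already uses the standard form does not match the pattern and passes through unchanged.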
* [wasm-emscripten-finalize] Remove --separate-data-segments (#6091)Sam Clegg2023-11-274-95/+0
| | | See #6088
* Simplify and consolidate type printing (#5816)Thomas Lively2023-08-2418-110/+110
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When printing Binaryen IR, we previously generated names for unnamed heap types based on their structure. This was useful for seeing the structure of simple types at a glance without having to separately go look up their definitions, but it also had two problems: 1. The same name could be generated for multiple types. The generated names did not take into account rec group structure or finality, so types that differed only in these properties would have the same name. Also, generated type names were limited in length, so very large types that shared only some structure could also end up with the same names. Using the same name for multiple types produces incorrect and unparsable output. 2. The generated names were not useful beyond the most trivial examples. Even with length limits, names for nontrivial types were extremely long and visually noisy, which made reading disassembled real-world code more challenging. Fix these problems by emitting simple indexed names for unnamed heap types instead. This regresses readability for very simple examples, but the trade off is worth it. This change also reduces the number of type printing systems we have by one. Previously we had the system in Print.cpp, but we had another, more general and extensible system in wasm-type-printing.h and wasm-type.cpp as well. Remove the old type printing system from Print.cpp and replace it with a much smaller use of the new system. This requires significant refactoring of Print.cpp so that PrintExpressionContents object now holds a reference to a parent PrintSExpression object that holds the type name state. This diff is very large because almost every test output changed slightly. To minimize the diff and ease review, change the type printer in wasm-type.cpp to behave the same as the old type printer in Print.cpp except for the differences in name generation. 
These changes will be reverted in much smaller PRs in the future to generally improve how types are printed.
* Use Names instead of indices to identify segments (#5618)Thomas Lively2023-04-0417-35/+35
| | | | | | | | | | All top-level Module elements are identified and referred to by Name, but for historical reasons element and data segments were referred to by index instead. Fix this inconsistency by using Names to refer to segments from expressions that use them. Also parse and print segment names like we do for other elements. The C API is partially converted to use names instead of indices, but there are still many functions that refer to data segments by index. Finishing the conversion can be done in the future once it becomes necessary.
* Correctly handle escapes in string constants (#5070)Thomas Lively2022-09-221-1/+1
| | | | | | | Previously when we parsed `string.const` payloads in the text format we were using the text strings directly instead of un-escaping them. Fix that parsing, and while we're editing the code, also add support for the `\r` escape allowed by the spec. Remove a spurious nested anonymous namespace and spurious `static`s in Print.cpp as well.
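To make the un-escaping step concrete, here is a small Python sketch of decoding the escapes that the WebAssembly text format allows in strings (`\t`, `\n`, `\r`, `\"`, `\'`, `\\`, `\hh` and `\u{...}`). It is not the parser code from this commit; the function name and error handling are assumptions chosen for the example.

```python
import string

_SIMPLE_ESCAPES = {
    "t": b"\t",
    "n": b"\n",
    "r": b"\r",  # the escape this commit adds support for
    '"': b'"',
    "'": b"'",
    "\\": b"\\",
}


def unescape_wat_string(s):
    """Decode the escapes in a text-format string payload into bytes."""
    out = bytearray()
    i = 0
    while i < len(s):
        ch = s[i]
        if ch != "\\":
            out += ch.encode("utf-8")
            i += 1
            continue
        if i + 1 >= len(s):
            raise ValueError("dangling backslash")
        nxt = s[i + 1]
        if nxt in _SIMPLE_ESCAPES:
            out += _SIMPLE_ESCAPES[nxt]
            i += 2
        elif nxt == "u":
            # \u{hex}: a Unicode scalar value, emitted as UTF-8.
            if s[i + 2:i + 3] != "{":
                raise ValueError("malformed \\u escape")
            end = s.index("}", i + 3)
            out += chr(int(s[i + 3:end], 16)).encode("utf-8")
            i = end + 1
        else:
            # \hh: exactly two hex digits naming a single byte.
            digits = s[i + 1:i + 3]
            if len(digits) != 2 or any(c not in string.hexdigits for c in digits):
                raise ValueError("malformed hex escape")
            out.append(int(digits, 16))
            i += 3
    return bytes(out)
```

Using the raw text directly, as the old code did, would keep the backslash and the following character as two separate bytes instead of the single byte they denote.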
* Remove metadata generation from wasm-emscripten-finalize (#4863)Sam Clegg2022-08-0735-975/+0
| | | | This is no longer needed by emscripten as of: https://github.com/emscripten-core/emscripten/pull/16529
* wasm-emscripten-finalize: Remove em_js/em_asm start/stop symbols when stripping data segments. (#4876)Sam Clegg2022-08-051-2/+0
| | | | | | | | This avoids a fatal crash in `--post-emscripten` where it tries to remove data that is no longer part of the file. This fixes a bug introduced by #4871 that causes emscripten tests to fail.
* Cleanup em_asm/em_js strings as part of PostEmscripten (#4871)Sam Clegg2022-08-049-31/+65
| | | | Rather than doing it as a side effect of dumping the metadata in wasm-emscripten-finalize.
* Re-run scripts/test/generate_lld_tests.py. NFC (#4861)Sam Clegg2022-08-0218-104/+132
|
* First class Data Segments (#4733)Ashley Nelson2022-06-217-13/+13
| * Updating wasm.h/cpp for DataSegments
| * Updating wasm-binary.h/cpp for DataSegments
| * Removed link from Memory to DataSegments and updated module-utils, Metrics and wasm-traversal
| * checking isPassive when copying data segments to know whether to construct the data segment with an offset or not
| * Removing memory member var from DataSegment class as there is only one memory rn. Updated wasm-validator.cpp
| * Updated wasm-interpreter
| * First look at updating Passes
| * Updated wasm-s-parser
| * Updated files in src/ir
| * Updating tools files
| * Last pass on src files before building
| * added visitDataSegment
| * Fixing build errors
| * Data segments need a name
| * fixing var name
| * ran clang-format
| * Ensuring a name on DataSegment
| * Ensuring more datasegments have names
| * Adding explicit name support
| * Fix fuzzing name
| * Outputting data name in wasm binary only if explicit
| * Checking temp dataSegments vector to validateBinary because it's the one with the segments before we processNames
| * Pass on when data segment names are explicitly set
| * Ran auto_update_tests.py and check.py, success all around
| * Removed an errant semi-colon and corrected a counter. Everything still passes
| * Linting
| * Fixing processing memory names after parsed from binary
| * Updating the test from the last fix
| * Correcting error comment
| * Impl kripken@ comments
| * Impl tlively@ comments
| * Updated tests that remove data print when == 0
| * Ran clang format
| * Impl tlively@ comments
| * Ran clang-format
* wasm-emscripten-finalize: Improve detection of mainReadsParams (#4701)Sam Clegg2022-05-312-2/+2
| | | | | | The first way we should detect this is if the main function doesn't actually take any params. Then we fall back to looking deeper. In preparation for https://reviews.llvm.org/D75277
* Update StackCheck for memory64 (#4636)Sam Clegg2022-05-041-1/+1
|
* StackCheck: Add argument stack-check-handler call (#4471)Sam Clegg2022-01-212-8/+21
| | | | | | | This function call now takes the address (which by definition is outside of the stack range) that the program was attempting to set SP to. This allows emscripten to provide a more useful error message on stack overflow or underflow.
* Add --no-emit-metadata option to wasm-emscripten-finalize (#4450)Sam Clegg2022-01-192-0/+50
| | | | | | This is useful for the case where we might want to finalize without extracting metadata. See: https://github.com/emscripten-core/emscripten/pull/15918
* Escape \t as well as \n when writing JSON output. (#4437)Sam Clegg2022-01-104-17/+17
| | | | | | | | As it happens, this doesn't (normally) break the resulting EM_ASM or EM_JS strings because (IIUC) JS supports the tab literal inside of strings as well as "\t". However, it's better to preserve the original text so that it looks the same in the JS file as it did in the original source.
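The escaping behaviour described above can be sketched as a small Python helper. This is not the code from the commit: the function name is hypothetical, and escaping backslashes and double quotes is an assumption added so the sketch yields a well-formed JSON string body; the commit message itself only mentions `\n` and `\t`.

```python
def escape_json_string(text):
    """Escape a JS snippet so it can sit inside a JSON string literal."""
    replacements = [
        ("\\", "\\\\"),  # assumption: backslashes are escaped first
        ('"', '\\"'),    # assumption: double quotes are escaped too
        ("\n", "\\n"),   # newline, already escaped before this change
        ("\t", "\\t"),   # tab, the case this change adds
    ]
    for raw, escaped in replacements:
        text = text.replace(raw, escaped)
    return text
```

Backslashes are handled first so that the backslashes introduced by the later replacements are not escaped a second time.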
* Auto-regenerate lld tests and expectations (#4434)Sam Clegg2022-01-108-133/+134
| | | | | | | | | This change was generated by running: ./scripts/test/generate_lld_tests.py and ./auto_update_tests.py lld
* Remove tableSize from emscripten metadata (#4415)Sam Clegg2021-12-2833-33/+0
| | | See https://github.com/emscripten-core/emscripten/pull/15855
* Clang-format c/cpp files in test directory (#4192)Heejin Ahn2021-09-294-20/+11
| | | | | | | | | This clang-formats c/cpp files in test/ directory, and updates clang-format-diff.sh so that it does not ignore test/ directory anymore. bigswitch.cpp is excluded from formatting, because there are big commented-out code blocks, and apparently clang-format messes up formatting in them. Also to make matters worse, different clang-format versions do different things on those commented-out code blocks.
* Remove Type ordering (#3793)Thomas Lively2021-05-1820-47/+47
| | | | | | | | | As found in #3682, the current implementation of type ordering is not correct, and although the immediate issue would be easy to fix, I don't think the current intended comparison algorithm is correct in the first place. Rather than try to switch to using a correct algorithm (which I am not sure I know how to implement, although I have an idea) this PR removes Type ordering entirely. In places that used Type ordering with std::set or std::map because they require deterministic iteration order, this PR uses InsertOrdered{Set,Map} instead.
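The idea behind replacing an ordered container with an insertion-ordered one can be illustrated with a short Python sketch. The class below only mirrors the name `InsertOrderedSet` from the message; its methods are assumptions for the example and do not reproduce Binaryen's C++ interface.

```python
class InsertOrderedSet:
    """A set that iterates in insertion order, so no element ordering is needed."""

    def __init__(self):
        # Python dicts preserve insertion order; the keys act as the set.
        self._items = {}

    def insert(self, item):
        """Add `item`; return True if it was not already present."""
        if item in self._items:
            return False
        self._items[item] = None
        return True

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)
```

Iteration order depends only on the sequence of insertions, which stays deterministic across runs without requiring a comparison operator on the element type.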
* Fix element segment ordering in Print (#3818)Abbas Mashayekh2021-04-203-1/+3
| | | | | | | | | | | We used to print active element segments right after corresponding tables, and passive segments came after those. We didn't print internal segment names, and empty segments weren't being printed at all. This meant that there was no way for instructions to refer to those table segments after round tripping. This will fix those issues by printing segments in the order they were defined, including segment names when necessary and not omitting empty segments anymore.
* LegalizeJSInterface: Remove illegal imports once they are no longer used (#3815)Sam Clegg2021-04-161-2/+1
| | | | | | | | | This prevents unused imports which also happen to have duplicate names and therefore cannot be provided by wasm (JS is happy to fill these in with polymorphic JS functions). I noticed this when working on emscripten and directly hooking modules together. I was seeing failures, but not in release builds (because wasm-opt would mop these up in release builds).
* Rename emscripten metadata key to reflect new unmangled names (#3813)Sam Clegg2021-04-1533-33/+33
| | | | | | Turns out just removing the mangling wasn't enough for emscripten to support both before and after versions. See https://github.com/WebAssembly/binaryen/pull/3785
* Remove renaming of __wasm_call_ctors (#3811)Sam Clegg2021-04-154-8/+8
| | | See https://github.com/emscripten-core/emscripten/issues/13893
* Remove final remnants of name mangling from wasm-emscripten (#3785)Sam Clegg2021-04-1510-41/+41
| | | See https://github.com/emscripten-core/emscripten/pull/13847
* Reorder global definitions in Print pass (#3770)Abbas Mashayekh2021-04-0222-44/+44
| | | | This is needed to make sure globals are printed before element segments, where `global.get` can appear both as offset and an expression.
* Fix LegalizeJSInterface with RefFuncs (#3749)Alon Zakai2021-03-301-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | This code used to remove functions it no longer thinks are needed. That is, if it adds a legalized version of an import, it would remove the illegal one which is no longer needed. To avoid removing an illegal import that is still used it checked for ref.func appearances. But this was bad in two ways: We need to legalize the ref.funcs too. We can't call an illegal import in any way, not using a direct call, indirect call, or call by reference of a ref.func. It's silly to remove unneeded functions here. We have a pass for that. This removes the removal of functions, and adds proper updating of ref.calls, which means to call the stub function that looks like the original import, but that calls the legalized one and connects things up properly, exactly the same way as other calls. Also remove code that checked if we were in the stub/thunk and to not replace the call there. That code is not needed: no one will ever call the illegal import, so we do not need to be careful about preserving such calls.
* Remove passive keyword from data segment parser (#3757)Abbas Mashayekh2021-03-303-7/+7
| | | | | | | | The passive keyword has been removed from the spec's text format, and now any data segment that doesn't have an offset is considered passive. This PR removes the keyword from both the parser and the Print pass, and updates all tests that used that syntax. Fixes #2339
* wasm-emscripten-finalize: Do not skip the start function body (#3714)Alon Zakai2021-03-223-0/+12935
| | | | | | When we can skip function bodies, we still need to parse the start function for the pthreads case, see details in the comments. This still gives us 99% of the speedup as the start function is just 1 function and it's not that big, so with this we return to full speed after the reversion in #3705
* Remove old AsmConstWalker code (#3685)Sam Clegg2021-03-122-1/+5
|
* Regenerate lld tests (#3684)Sam Clegg2021-03-1225-183/+212
| This change was automatically generated by:
|
|   $ ./scripts/test/generate_lld_tests.py
|   $ ./auto_update_tests.py --binaryen-bin=../binaryen-out/bin lld
|
| The changes here are mostly due to:
|
| - llvm now emits names for globals and segments
| - emscripten now packs EM_ASM consts into a single contiguous segment
* Properly use text format type names in printing (#3591)Alon Zakai2021-02-237-27/+27
| | | | | | | | | | | | | | | | | | | This adds a TypeNames entry to modules, which can store names for types. So far this PR uses that to store type names from text format. Future PRs will add support for field names and for the binary format. (Field names are added to wasm.h here to see if we agree on this direction.) Most of the work here is threading a module through the various functions in Print.cpp. This keeps the module optional, so that we can still print an expression independently of a module, which has always been the case, and which I think we should keep (but, if a module was mandatory perhaps this would be a little simpler, and could be refactored into a form that depends on that). 99% of this diff are test updates, since almost all our tests use the text format, and many of them specify a type name but we used to ignore it. This is a step towards a proper solution for #3589
* Simplify asmConst handling. NFC. (#3558)Sam Clegg2021-02-096-18/+18
| | | | | | | | Support for multiple signatures per JS code string was removed in #2422. emscripten now only needs to know the address and the body of the JS function. See https://github.com/emscripten-core/emscripten/pull/13452.
* finalize: remove initializers from metadata output (#3479)Sam Clegg2021-01-1117-51/+0
| | | See https://github.com/emscripten-core/emscripten/pull/13208
* Fixed wasm-emscripten-finalize AsmConstWalker not handling 64-bit pointers (#3431)Wouter van Oortmerssen2020-12-143-0/+175
| | | | | Also improved the LLD test scripts to accommodate 64-bit tests.
* Intern HeapTypes and clean up types code (#3428)Thomas Lively2020-12-071-1/+1
| | | | | | | | | Interns HeapTypes using the same patterns and utilities already used to intern Types. This allows HeapTypes to efficiently be compared for equality and hashed, which may be important for very large struct types in the future. This change also has the benefit of increasing symmetry between the APIs of Type and HeapType, which will make the developer experience more consistent. Finally, this change will make TypeBuilder (#3418) much simpler because it will no longer have to introduce TypeInfo variants to refer to HeapTypes indirectly.
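Interning, as described above, means storing each distinct structure once and handing out a cheap canonical handle for it. The following Python sketch shows the general pattern only; the class name, the use of integer ids, and the representation of a type as a tuple of field names are all assumptions and do not reflect Binaryen's implementation.

```python
class HeapTypeInterner:
    """Map structurally equal type descriptions to one canonical id."""

    def __init__(self):
        self._ids = {}         # structural description -> canonical id
        self._structures = []  # canonical id -> structural description

    def intern(self, fields):
        """Return the canonical id for a type with the given fields."""
        key = tuple(fields)
        existing = self._ids.get(key)
        if existing is not None:
            return existing
        new_id = len(self._structures)
        self._ids[key] = new_id
        self._structures.append(key)
        return new_id

    def structure(self, type_id):
        """Look up the structural description behind a canonical id."""
        return self._structures[type_id]
```

Once two types have been interned, comparing or hashing them is a comparison of small integers, regardless of how large the underlying struct description is.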
* Introduce lit/FileCheck tests (#3367)Thomas Lively2020-11-182-51/+0
| | | | | | | | | | | | | | | lit and FileCheck are the tools used to run the majority of tests in LLVM. Each lit test file contains the commands to be run for that test, so lit tests are much more flexible and can be more precise than our current ad hoc testing system. FileCheck reads expected test output from comments, so it allows test output to be written alongside and interspersed with test input, making tests more readable and precise than in our current system. This PR adds a new suite to check.py that runs lit tests in the test/lit directory. A few tests have been ported to demonstrate the features of the new test runner. This change is motivated by a need for greater flexibility in testing wasm-split. See #3359.
* wasm-emscripten-finalize: Remove staticBump from metadata (#3300)Sam Clegg2020-10-2943-188/+110
| | | | | | Emscripten no longer needs this information as of https://github.com/emscripten-core/emscripten/pull/12643. This also removes the need to export __data_end.
* Remove support for emscripten legacy PIC ABI (#3299)Sam Clegg2020-10-2910-229/+76
|
* Remove now-redundant stack pointer manipulation passes (#3251)Sam Clegg2020-10-188-21/+14
| | | | The use of these passes was removed on the emscripten side in https://github.com/emscripten-core/emscripten/pull/12536.
* finalize: remove legacy support for "table" import (#3249)Sam Clegg2020-10-165-5/+5
| | | | | These days we always export the table, except in the case of dynamic linking, and even then we use the name `__indirect_function_table`.
* Assign import names consistently between text and binary reader (#3238)Sam Clegg2020-10-141-15/+15
| | | | | | | | | | | The s-parser was assigning numbered names per type, whereas the binary reader was using the global import count as the number to append. This change switches to using a per-element count, which I think is preferable as it increases the stability of the auto-generated names. e.g. memory is now always named `$mimport0`.
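The difference between numbering imports by a global count and by a per-element count can be sketched in Python as below. The sketch reads "per-element" as "per kind of import", which is an interpretation of the message; apart from `mimport`, the prefixes in `KIND_PREFIX` and both function names are hypothetical.

```python
from collections import defaultdict

# Only `mimport` appears in the commit message; the other prefixes are
# assumed by analogy for the purpose of this example.
KIND_PREFIX = {
    "function": "fimport",
    "table": "timport",
    "memory": "mimport",
    "global": "gimport",
}


def name_imports_global_count(kinds):
    """Number each import by its position among all imports."""
    return [f"{KIND_PREFIX[kind]}{i}" for i, kind in enumerate(kinds)]


def name_imports_per_kind(kinds):
    """Number each import by its position among imports of the same kind."""
    counters = defaultdict(int)
    names = []
    for kind in kinds:
        names.append(f"{KIND_PREFIX[kind]}{counters[kind]}")
        counters[kind] += 1
    return names
```

With a per-kind count, adding or removing a function import does not change the generated name of the memory import, which is the stability benefit the message refers to.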
* EmscriptenPIC: Remove internalization of GOT entries (#3211)Sam Clegg2020-10-138-184/+39
| | | | | | | wasm-ld now does this better than binaryen and does it by default when linking an executable and optionally with `-Bsymbolic` when linking a shared library. See https://reviews.llvm.org/D89152
* Re-generate lld test inputs (#3212)Sam Clegg2020-10-094-42/+36
| | | | | | Generated by running: ./scripts/test/generate_lld_tests.py ./auto_update_tests.py
* Let GenerateDynCalls generate dynCalls for invokes (#3192)Heejin Ahn2020-10-022-10/+30
| | | | | | This moves dynCall generating functionality for invokes from `EmscriptenGlueGenerator` to the `GenerateDynCalls` pass. So now the `GenerateDynCalls` pass will take care of all cases where we need dynCalls: functions in tables and invokes.