summaryrefslogtreecommitdiff
path: root/test/example/module-splitting.txt
Commit message (Collapse)AuthorAgeFilesLines
* Use empty blocks instead of nops for empty scopes in IRBuilder (#7080)Thomas Lively2024-11-141-4/+0
| | | | | | | | | | When IRBuilder builds an empty non-block scope such as a function body, an if arm, a try block, etc, it needs to produce some expression to represent the empty contents. Previously it produced a nop, but change it to produce an empty block instead. The binary writer and printer have special logic to elide empty blocks, so this produces smaller output. Update J2CLOpts to recognize functions containing empty blocks as trivial to avoid regressing one of its tests.
* [wasm-split] Run RemoveUnusedElements on secondary modules (#6945)Thomas Lively2024-09-171-93/+12
| | | | | | | | | Rather than analyze what module elements from the primary module a secondary module will need, the splitting logic conservatively imports all module elements from the primary module into the secondary module. Run RemoveUnusedElements on the secondary module to remove any of these imports that happen to be unnecessary. Leave a TODO mentioning the possibility of being more selective about which module elements get exported to reduce code size in the primary module, too.
* Add a --preserve-type-order option (#6916)Thomas Lively2024-09-101-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unlike other module elements, types are not stored on the `Module`. Instead, they are collected by traversing the IR before printing and binary writing. The code that collects the types tries to optimize the order of rec groups based on the number of times each type is used. As a result, the output order of types generally has no relation to the input order of types. In addition, most type optimizations rewrite the types into a single large rec group, and the order of types in that group is essentially arbitrary. Changes to the code for counting type uses, sorting types, or sorting rec groups can yield very large changes in the output order of types, producing test diffs that are hard to review and potentially harming the readability of tests by moving output types away from the corresponding input types. To help make test output more stable and readable, introduce a tool option that causes the order of output types to match the order of input types as closely as possible. It is implemented by having the parsers record the indices of the input types on the `Module` just like they already record the type names. The `GlobalTypeRewriter` infrastructure used by type optimizations associates the new types with the old indices just like it already does for names and also respects the input order when rewriting types into a large recursion group. By default, wasm-opt and other tools clear the recorded type indices after parsing the module, so their default behavior is not modified by this change. Follow-on PRs will use the new flag in more tests, which will generate large diffs but leave the tests in stable, more readable states that will no longer change due to other changes to the optimizing type sorting logic.
* [wasm-split] Use a fresh table when reference types are enabled (#6726)Thomas Lively2024-07-111-58/+183
| | | | | | | Rather than trying to trampoline primary-to-secondary calls through an existing table, just create a fresh table for this purpose. This ensures that modifications to the existing tables cannot interfere with primary-to-secondary calls and conversely that loading the secondary module cannot overwrite modifications to the tables.
* Use the standard shared memory text format (#6200)Thomas Lively2024-01-031-9/+9
| | | | | Update the legacy text parser and all tests to use the standard text format for shared memories, e.g. `(memory $m 1 1 shared)` rather than `(memory $m (shared 1 1))`. Also remove support for non-standard in-line "data" or "segment" declarations. This change makes the tests more compatible with the new text parser, which only supports the standard format.
* Simplify and consolidate type printing (#5816)Thomas Lively2023-08-241-255/+255
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When printing Binaryen IR, we previously generated names for unnamed heap types based on their structure. This was useful for seeing the structure of simple types at a glance without having to separately go look up their definitions, but it also had two problems: 1. The same name could be generated for multiple types. The generated names did not take into account rec group structure or finality, so types that differed only in these properties would have the same name. Also, generated type names were limited in length, so very large types that shared only some structure could also end up with the same names. Using the same name for multiple types produces incorrect and unparsable output. 2. The generated names were not useful beyond the most trivial examples. Even with length limits, names for nontrivial types were extremely long and visually noisy, which made reading disassembled real-world code more challenging. Fix these problems by emitting simple indexed names for unnamed heap types instead. This regresses readability for very simple examples, but the trade off is worth it. This change also reduces the number of type printing systems we have by one. Previously we had the system in Print.cpp, but we had another, more general and extensible system in wasm-type-printing.h and wasm-type.cpp as well. Remove the old type printing system from Print.cpp and replace it with a much smaller use of the new system. This requires significant refactoring of Print.cpp so that PrintExpressionContents object now holds a reference to a parent PrintSExpression object that holds the type name state. This diff is very large because almost every test output changed slightly. To minimize the diff and ease review, change the type printer in wasm-type.cpp to behave the same as the old type printer in Print.cpp except for the differences in name generation. These changes will be reverted in much smaller PRs in the future to generally improve how types are printed.
* Print function types on function imports in the text format (#5727)Alon Zakai2023-05-171-47/+47
| | | | The function type should be printed there just like for non-imported functions.
* Add a name hint to getValidName() (#5653)Alon Zakai2023-04-111-2/+2
| | | | | | | Without the hint, we always look for a valid name using name$0, $1, $2, etc., starting from 0, and in some cases that can lead to quadratic behavior. Noticed on a testcase in the fuzzer that runs for over 24 seconds (I gave up at that point) but takes only 2 seconds with this.
* Use Names instead of indices to identify segments (#5618)Thomas Lively2023-04-041-54/+54
| | | | | | | | | | All top-level Module elements are identified and referred to by Name, but for historical reasons element and data segments were referred to by index instead. Fix this inconsistency by using Names to refer to segments from expressions that use them. Also parse and print segment names like we do for other elements. The C API is partially converted to use names instead of indices, but there are still many functions that refer to data segments by index. Finishing the conversion can be done in the future once it becomes necessary.
* Change the default type system to isorecursive (#5239)Thomas Lively2022-11-231-110/+110
| | | | | | | | | | This makes Binaryen's default type system match the WasmGC spec. Update the way type definitions without supertypes are printed to reduce the output diff for MVP tests that do not involve WasmGC. Also port some type-builder.cpp tests from test/example to test/gtest since they needed to be rewritten to work with isorecursive type anyway. A follow-on PR will remove equirecursive types completely.
* Remove (attr 0) from tag text format (#3946)Heejin Ahn2021-06-191-9/+9
| | | | | | | | This attribute is always 0 and reserved for future use. In Binayren's unofficial text format we were writing this field as `(attr 0)`, but we have recently come to the conclusion that this is not necessary. Relevant discussion: https://github.com/WebAssembly/exception-handling/pull/160#discussion_r653254680
* [EH] Replace event with tag (#3937)Heejin Ahn2021-06-181-13/+13
| | | | | | | | | | | We recently decided to change 'event' to 'tag', and to 'event section' to 'tag section', out of the rationale that the section contains a generalized tag that references a type, which may be used for something other than exceptions, and the name 'event' can be confusing in the web context. See - https://github.com/WebAssembly/exception-handling/issues/159#issuecomment-857910130 - https://github.com/WebAssembly/exception-handling/pull/161
* [wasm-split] Minimize names of newly created exports (#3905)Thomas Lively2021-06-011-0/+72
| | | | | | | | | wasm-split would previously use internal function names to create the external names of the functions that are newly exported from the primary module to be imported into the secondary module. When the input module contains full function names (as is commonly the case when emitting symbol maps), this caused the function names to be preserved as the export names, even when names are otherwise being stripped. To save on code size and properly anonymize functions, generate minimal export names when debuginfo is disabled instead.
* Remove Type ordering (#3793)Thomas Lively2021-05-181-1/+1
| | | | | | | | | As found in #3682, the current implementation of type ordering is not correct, and although the immediate issue would be easy to fix, I don't think the current intended comparison algorithm is correct in the first place. Rather than try to switch to using a correct algorithm (which I am not sure I know how to implement, although I have an idea) this PR removes Type ordering entirely. In places that used Type ordering with std::set or std::map because they require deterministic iteration order, this PR uses InsertOrdered{Set,Map} instead.
* Update test expectations after colliding landings (#3827)Alon Zakai2021-04-201-1/+1
|
* Fix element segment ordering in Print (#3818)Abbas Mashayekh2021-04-201-11/+11
| | | | | | | | | | | We used to print active element segments right after corresponding tables, and passive segments came after those. We didn't print internal segment names, and empty segments weren't being printed at all. This meant that there was no way for instructions to refer to those table segments after round tripping. This will fix those issues by printing segments in the order they were defined, including segment names when necessary and not omitting empty segments anymore.
* Minor wasm-split improvements (#3825)Thomas Lively2021-04-201-0/+128
| | | | | | | | | | | - Support functions appearing more than once in the table. It turns out we were assuming and asserting that functions would appear at most once, but we weren't making use of that assumption in any way. - Make TableSlotManager::getSlot take a function Name rather than a RefFunc expression to avoid allocating and leaking unnecessary expressions. - Add and use a Builder interface for building TableElementSegments to make them more similar to other module-level items.
* Reorder global definitions in Print pass (#3770)Abbas Mashayekh2021-04-021-4/+4
| | | | This is needed to make sure globals are printed before element segments, where `global.get` can appear both as offset and an expression.
* [reference-types] remove single table restriction in IR (#3517)Abbas Mashayekh2021-02-091-8/+8
| | | Adds support for modules with multiple tables. Adds a field for the table name to `CallIndirect` and updates the C/JS APIs accordingly.
* [module-splitting] Fix a crash when a function is exported twice (#3455)Thomas Lively2020-12-171-0/+36
| | | | | | | | | `ModuleSplitter::thunkExportedSecondaryFunctions` creates a thunk for each secondary function that needs to be exported from the main module. Previously, if a secondary function was exported twice, this code would try to create two thunks for it rather than just making one thunk and exporting it twice. This caused a fatal error because the second thunk had the same name as the first thunk and therefore could not be added to the module. This PR fixes the issue by creating no more than one thunk per function.
* [module-splitting] Allow splitting with non-const table offsets (#3408)Thomas Lively2020-12-011-0/+304
| | | | | | | | | | Extend the splitting logic to handle splitting modules with a single table segment with a non-const offset. In this situation the placeholder function names are interpreted as offsets from the table base global rather than absolute indices into the table. Since addition is not allowed in segment offset expressions, the secondary module's segment must start at the same place as the first table's segment. That means that some primary functions must be duplicated in the secondary segment to fill any gaps. They are exported and imported as necessary.
* Module splitting (#3317)Thomas Lively2020-11-121-0/+641
Adds the capability to programatically split a module into a primary and secondary module such that the primary module can be compiled and run before the secondary module has been instantiated. All calls to secondary functions (i.e. functions that have been split out into the secondary module) in the primary module are rewritten to be indirect calls through the table. Initially, the table slots of all secondary functions contain references to imported placeholder functions. When the secondary module is instantiated, it will automatically patch the table to insert references to the original functions. The process of module splitting involves these steps: 1. Create the new secondary module. 2. Export globals, events, tables, and memories from the primary module and import them in the secondary module. 3. Move the deferred functions from the primary to the secondary module. 4. For any secondary function exported from the primary module, export in its place a trampoline function that makes an indirect call to its placeholder function (and eventually to the original secondary function), allocating a new table slot for the placeholder if necessary. 5. Rewrite direct calls from primary functions to secondary functions to be indirect calls to their placeholder functions (and eventually to their original secondary functions), allocating new table slots for the placeholders if necessary. 6. For each primary function directly called from a secondary function, export the primary function if it is not already exported and import it into the secondary module. 7. Replace all references to secondary functions in the primary module's table segments with references to imported placeholder functions. 8. Create new active table segments in the secondary module that will replace all the placeholder function references in the table with references to their corresponding secondary functions upon instantiation. Functions can be used or referenced three ways in a WebAssembly module: they can be exported, called, or placed in a table. The above procedure introduces a layer of indirection to each of those mechanisms that removes all references to secondary functions from the primary module but restores the original program's semantics once the secondary module is instantiated. As more mechanisms that reference functions are added in the future, such as ref.func instructions, they will have to be modified to use a similar layer of indirection. The code as currently written makes a few assumptions about the module that is being split: 1. It assumes that mutable-globals is allowed. This could be worked around by introducing wrapper functions for globals and rewriting secondary code that accesses them, but now that mutable-globals is shipped on all browsers, hopefully that extra complexity won't be necessary. 2. It assumes that all table segment offsets are constants. This simplifies the generation of segments to actively patch in the secondary functions without overwriting any other table slots. This assumption could be relaxed by 1) having secondary segments re-write primary function slots as well, 2) allowing addition in segment offsets, or 3) synthesizing a start function to modify the table instead of using segments. 3. It assumes that each function appears in the table at most once. This isn't necessarily true in general or even for LLVM output after function deduplication. Relaxing this assumption would just require slightly more complex code, so it is a good candidate for a follow up PR. Future Binaryen work for this feature includes providing a command line tool exposing this functionality as well as C API, JS API, and fuzzer support. We will also want to provide a simple instrumentation pass for finding dead or late-executing functions that would be good candidates for splitting out. It would also be good to integrate that instrumentation with future function outlining work so that dead or exceptional basic blocks could be split out into a separate module.