path: root/test/example/module-splitting.cpp
Commit message (author, date, files changed, lines -/+)
* Use the standard shared memory text format (#6200) (Thomas Lively, 2024-01-03, 1 file, -3/+3)
  Update the legacy text parser and all tests to use the standard text format for shared memories, e.g. `(memory $m 1 1 shared)` rather than `(memory $m (shared 1 1))`. Also remove support for non-standard in-line "data" or "segment" declarations. This change makes the tests more compatible with the new text parser, which only supports the standard format.
* Preserve Function HeapTypes (#3952) (Thomas Lively, 2021-06-30, 1 file, -2/+4)
  When using nominal types, `ref.func` of two functions with identical signatures but different HeapTypes will yield different types. To preserve these semantics, Functions need to track their HeapTypes, not just their Signatures. This PR replaces the Signature field in Function with a HeapType field and adds new utility methods to make it almost as simple to update and query the Function HeapType as it was to update and query the Function Signature.
* Remove (attr 0) from tag text format (#3946) (Heejin Ahn, 2021-06-19, 1 file, -3/+3)
  This attribute is always 0 and reserved for future use. In Binaryen's unofficial text format we were writing this field as `(attr 0)`, but we have recently come to the conclusion that this is not necessary. Relevant discussion: https://github.com/WebAssembly/exception-handling/pull/160#discussion_r653254680
* [EH] Replace event with tag (#3937) (Heejin Ahn, 2021-06-18, 1 file, -4/+4)
  We recently decided to change 'event' to 'tag', and 'event section' to 'tag section', on the rationale that the section contains a generalized tag that references a type, which may be used for something other than exceptions, and that the name 'event' can be confusing in the web context. See:
  - https://github.com/WebAssembly/exception-handling/issues/159#issuecomment-857910130
  - https://github.com/WebAssembly/exception-handling/pull/161
* [wasm-split] Add an option to emit a placeholder map (#3931) (Thomas Lively, 2021-06-12, 1 file, -2/+2)
  The new option emits a file containing a map between each placeholder index and the name of the split-out function that the placeholder is replacing in the table. This map is intended to be useful for debugging, as discussed in https://github.com/emscripten-core/emscripten/issues/14330.
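  As a rough illustration of what such a map file could contain, the sketch below serializes a slot-to-name mapping. The one-entry-per-line `index:name` layout, the `PlaceholderMap` alias, and `formatPlaceholderMap` are assumptions made for this example and are not taken from the wasm-split sources.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Toy model: table slot index -> name of the split-out function whose
// placeholder occupies that slot.
using PlaceholderMap = std::map<std::size_t, std::string>;

// Assumed layout: one "index:name" entry per line, in ascending index order.
std::string formatPlaceholderMap(const PlaceholderMap& placeholders) {
  std::ostringstream out;
  for (const auto& [index, name] : placeholders) {
    // A name containing a newline would break the line-based layout.
    assert(name.find('\n') == std::string::npos);
    out << index << ':' << name << '\n';
  }
  return out.str();
}
```

  With entries for slots 0 and 3, the result has two lines, one per placeholder, which is enough to look up which function a trapping placeholder call stood for.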
* [wasm-split] Minimize names of newly created exports (#3905) (Thomas Lively, 2021-06-01, 1 file, -0/+44)
  wasm-split would previously use internal function names to create the external names of the functions that are newly exported from the primary module to be imported into the secondary module. When the input module contains full function names (as is commonly the case when emitting symbol maps), this caused the function names to be preserved as the export names, even when names are otherwise being stripped. To save on code size and properly anonymize functions, generate minimal export names when debuginfo is disabled instead.
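  The idea of choosing export names based on whether debuginfo is kept can be sketched as follows. The naming scheme (`a`, `b`, ..., `z`, `aa`, ...) and both helper functions are hypothetical; Binaryen's real name generator may produce different names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: maps 0, 1, 2, ... to "a", "b", ..., "z", "aa", "ab", ...
std::string minimalName(std::size_t index) {
  std::string name;
  std::size_t n = index + 1; // work 1-based so that index 25 yields "z"
  while (n > 0) {
    name.insert(name.begin(), static_cast<char>('a' + (n - 1) % 26));
    n = (n - 1) / 26;
  }
  return name;
}

// Picks an export name for each internal function name: the internal name
// itself when debuginfo is kept, otherwise a short generated name.
std::map<std::string, std::string>
chooseExportNames(const std::vector<std::string>& internalNames,
                  bool debugInfo) {
  std::map<std::string, std::string> exportNames;
  for (std::size_t i = 0; i < internalNames.size(); ++i) {
    const std::string& internal = internalNames[i];
    bool inserted =
      exportNames.emplace(internal, debugInfo ? internal : minimalName(i))
        .second;
    assert(inserted && "internal names must be unique");
    (void)inserted;
  }
  return exportNames;
}
```

  Because the generated names depend only on position, a long demangled symbol costs the same few bytes in the export section as any other function.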
* Minor wasm-split improvements (#3825) (Thomas Lively, 2021-04-20, 1 file, -0/+43)
  - Support functions appearing more than once in the table. It turns out we were assuming and asserting that functions would appear at most once, but we weren't making use of that assumption in any way.
  - Make TableSlotManager::getSlot take a function Name rather than a RefFunc expression to avoid allocating and leaking unnecessary expressions.
  - Add and use a Builder interface for building TableElementSegments to make them more similar to other module-level items.
* [module-splitting] Fix a crash when a function is exported twice (#3455) (Thomas Lively, 2020-12-17, 1 file, -0/+8)
  `ModuleSplitter::thunkExportedSecondaryFunctions` creates a thunk for each secondary function that needs to be exported from the main module. Previously, if a secondary function was exported twice, this code would try to create two thunks for it rather than just making one thunk and exporting it twice. This caused a fatal error because the second thunk had the same name as the first thunk and therefore could not be added to the module. This PR fixes the issue by creating no more than one thunk per function.
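  The "one thunk per function" rule can be illustrated with a small stand-alone model. This is not the code from the PR: `Export`, `thunkExports`, and the `thunk$` prefix are invented for the example, and exports are reduced to pairs of strings.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model of an export: an external name bound to an internal function.
struct Export {
  std::string externalName;
  std::string functionName;
};

// Redirects every export of a secondary function to a thunk, creating at
// most one thunk per function. Returns the names of the thunks created.
std::vector<std::string>
thunkExports(std::vector<Export>& exports,
             const std::set<std::string>& secondaryFuncs) {
  std::map<std::string, std::string> thunkFor; // function -> its thunk
  std::vector<std::string> created;
  for (auto& ex : exports) {
    if (secondaryFuncs.count(ex.functionName) == 0) {
      continue; // primary functions stay exported directly
    }
    auto [it, inserted] =
      thunkFor.emplace(ex.functionName, "thunk$" + ex.functionName);
    if (inserted) {
      created.push_back(it->second); // first export of this function
    }
    ex.functionName = it->second; // later exports reuse the same thunk
  }
  // Invariant described in the commit: one thunk per exported function.
  assert(created.size() == thunkFor.size());
  return created;
}
```

  Looking the function up before creating anything is what prevents the duplicate-name failure: the second export of the same function finds the existing thunk and simply points at it.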
* Refactor printing code so that printing Expressions always works (#3450) (Alon Zakai, 2020-12-17, 1 file, -4/+3)
  This avoids needing to add an include of wasm-printing to a file that doesn't already have one. To achieve that, add the std::ostream hooks in wasm.h, and use them where possible, removing the need for the special WasmPrinter object. Also stop printing in "full" mode (types printed on each line) in error messages by default. The user can still get that behavior, as always, by setting BINARYEN_PRINT_FULL=1 in the environment.
* [module-splitting] Allow splitting with non-const table offsets (#3408) (Thomas Lively, 2020-12-01, 1 file, -0/+101)
  Extend the splitting logic to handle splitting modules with a single table segment with a non-const offset. In this situation the placeholder function names are interpreted as offsets from the table base global rather than absolute indices into the table. Since addition is not allowed in segment offset expressions, the secondary module's segment must start at the same place as the first table's segment. That means that some primary functions must be duplicated in the secondary segment to fill any gaps. They are exported and imported as necessary.
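  The gap-filling described above can be sketched on a toy table. The model assumes the secondary segment runs from the table base through the last slot that holds a secondary function, which is one reading of the commit message rather than something it states; `SecondarySegment` and `buildSecondarySegment` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

struct SecondarySegment {
  std::vector<std::string> functions;      // entries, starting at the table base
  std::set<std::string> importedPrimaries; // primary functions used to fill gaps
};

// `slots` lists the functions of the primary segment in order, so index i is
// an offset from the (non-constant) table base.
SecondarySegment
buildSecondarySegment(const std::vector<std::string>& slots,
                      const std::set<std::string>& secondaryFuncs) {
  // Find one past the last slot that holds a secondary function.
  std::size_t end = 0;
  for (std::size_t i = 0; i < slots.size(); ++i) {
    if (secondaryFuncs.count(slots[i])) {
      end = i + 1;
    }
  }
  SecondarySegment segment;
  for (std::size_t i = 0; i < end; ++i) {
    // The segment cannot skip slots, so primary functions are written back
    // unchanged and must be imported into the secondary module.
    segment.functions.push_back(slots[i]);
    if (!secondaryFuncs.count(slots[i])) {
      segment.importedPrimaries.insert(slots[i]);
    }
  }
  assert(segment.functions.size() <= slots.size());
  return segment;
}
```

  For a table holding `p0, s0, p1, s1, p2` with `s0` and `s1` split out, the sketch yields a four-entry segment and marks `p0` and `p1` as needing export from the primary module and import into the secondary one.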
* Module splitting (#3317) (Thomas Lively, 2020-11-12, 1 file, -0/+289)
  Adds the capability to programmatically split a module into a primary and secondary module such that the primary module can be compiled and run before the secondary module has been instantiated. All calls to secondary functions (i.e. functions that have been split out into the secondary module) in the primary module are rewritten to be indirect calls through the table. Initially, the table slots of all secondary functions contain references to imported placeholder functions. When the secondary module is instantiated, it will automatically patch the table to insert references to the original functions.
  The process of module splitting involves these steps:
  1. Create the new secondary module.
  2. Export globals, events, tables, and memories from the primary module and import them in the secondary module.
  3. Move the deferred functions from the primary to the secondary module.
  4. For any secondary function exported from the primary module, export in its place a trampoline function that makes an indirect call to its placeholder function (and eventually to the original secondary function), allocating a new table slot for the placeholder if necessary.
  5. Rewrite direct calls from primary functions to secondary functions to be indirect calls to their placeholder functions (and eventually to their original secondary functions), allocating new table slots for the placeholders if necessary.
  6. For each primary function directly called from a secondary function, export the primary function if it is not already exported and import it into the secondary module.
  7. Replace all references to secondary functions in the primary module's table segments with references to imported placeholder functions.
  8. Create new active table segments in the secondary module that will replace all the placeholder function references in the table with references to their corresponding secondary functions upon instantiation.
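  Steps 5 and 8 above can be modeled in a few lines. The sketch is a toy, not Binaryen's IR: calls are plain structs, the table is a vector of names, and `Call`, `SlotManager`, `rewriteCalls`, `instantiateSecondary`, and the `placeholder$` prefix are all invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy call: either a direct call by name or an indirect call through a slot.
struct Call {
  bool indirect;
  std::string target; // callee name (direct calls only)
  std::size_t slot;   // table index (indirect calls only)
};

// Hands out table slots, reusing the slot of a function that already has one.
struct SlotManager {
  std::vector<std::string> table;            // slot -> function placed there
  std::map<std::string, std::size_t> slotOf; // secondary function -> its slot

  std::size_t getSlot(const std::string& func) {
    auto [it, inserted] = slotOf.emplace(func, table.size());
    if (inserted) {
      table.push_back("placeholder$" + func); // stand-in until patched
    }
    return it->second;
  }
};

// Step 5: rewrite the calls made by one primary function.
std::vector<Call> rewriteCalls(const std::vector<std::string>& callees,
                               const std::set<std::string>& secondaryFuncs,
                               SlotManager& slots) {
  std::vector<Call> out;
  for (const auto& callee : callees) {
    if (secondaryFuncs.count(callee)) {
      out.push_back(Call{true, "", slots.getSlot(callee)});
    } else {
      out.push_back(Call{false, callee, 0});
    }
  }
  return out;
}

// Step 8: instantiating the secondary module overwrites each placeholder
// with the real function.
void instantiateSecondary(SlotManager& slots) {
  for (const auto& [func, slot] : slots.slotOf) {
    assert(slot < slots.table.size());
    slots.table[slot] = func;
  }
}
```

  The rewritten calls never change after the split. Only the table contents do, which is why the primary module can run before the secondary module arrives and then reach the real functions through the same slots afterwards.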

  Functions can be used or referenced three ways in a WebAssembly module: they can be exported, called, or placed in a table. The above procedure introduces a layer of indirection to each of those mechanisms that removes all references to secondary functions from the primary module but restores the original program's semantics once the secondary module is instantiated. As more mechanisms that reference functions are added in the future, such as ref.func instructions, they will have to be modified to use a similar layer of indirection.
  The code as currently written makes a few assumptions about the module that is being split:
  1. It assumes that mutable-globals is allowed. This could be worked around by introducing wrapper functions for globals and rewriting secondary code that accesses them, but now that mutable-globals is shipped on all browsers, hopefully that extra complexity won't be necessary.
  2. It assumes that all table segment offsets are constants. This simplifies the generation of segments to actively patch in the secondary functions without overwriting any other table slots. This assumption could be relaxed by 1) having secondary segments re-write primary function slots as well, 2) allowing addition in segment offsets, or 3) synthesizing a start function to modify the table instead of using segments.
  3. It assumes that each function appears in the table at most once. This isn't necessarily true in general or even for LLVM output after function deduplication. Relaxing this assumption would just require slightly more complex code, so it is a good candidate for a follow-up PR.
  Future Binaryen work for this feature includes providing a command line tool exposing this functionality as well as C API, JS API, and fuzzer support. We will also want to provide a simple instrumentation pass for finding dead or late-executing functions that would be good candidates for splitting out. It would also be good to integrate that instrumentation with future function outlining work so that dead or exceptional basic blocks could be split out into a separate module.