summaryrefslogtreecommitdiff
path: root/src/tools/wasm-split/wasm-split.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [wasm-split] Configure split functions rather than kept functions (#6949)Thomas Lively2024-09-171-8/+3
| | | | | | | | The configuration for the module splitting utility previous took a set of functions to keep in the primary module. Change it to take a list of functions to split into the secondary module instead. This improves the code quality in multi-split mode because it keeps stub functions generated by previous splits from being moved into secondary modules during later splits.
* [wasm-split] Simplify handling of --keep-funcs and --split-funcs (#6948)Thomas Lively2024-09-171-67/+63
| | | | | | | | | | | | Maintain the invariant that every defined functions belongs to either the set of kept functions or the set of split functions. Functions are kept by default except when --keep-funcs is specified without --split-funcs on the command line. This is mostly NFC except that it changes the default behavior when no arguments are specified on the command line to keep all functions. This will simplify a follow-on PR that switches from passing the kept functions to the module splitting utility to passing the split functions.
* [wasm-split] Run RemoveUnusedElements on secondary modules (#6945)Thomas Lively2024-09-171-5/+6
| | | | | | | | | Rather than analyze what module elements from the primary module a secondary module will need, the splitting logic conservatively imports all module elements from the primary module into the secondary module. Run RemoveUnusedElements on the secondary module to remove any of these imports that happen to be unnecessary. Leave a TODO mentioning the possibility of being more selective about which module elements get exported to reduce code size in the primary module, too.
* [wasm-split] Add a multi-split mode (#6943)Thomas Lively2024-09-161-0/+84
| | | | | | | Add a mode that splits a module into arbitrarily many parts based on a simple manifest file. This is currently implemented by splitting out one module at a time in a loop, but this could change in the future if splitting out all the modules at once would improve the quality of the output.
* [wasm-split] Add an option to skip importing placeholders (#6942)Thomas Lively2024-09-161-0/+1
| | | | | | | | | | | | | | Wasm-split generally assumes that calls to secondary functions made before the secondary module has been loaded and instantiated should go to imported placeholder functions that can be responsible for loading the secondary module and forwarding the call to the loaded function. That scheme makes the loading entirely transparent from the application's point of view, which is not always a good thing. Other schemes would make it impossible for a secondary function to be called before the secondary module has been explicitly loaded, in which case the placeholder functions would never be called. To improve code size and simplify instantiation under these schemes, add a new `--no-placeholders` option that skips adding imported placeholder functions.
* Add a --preserve-type-order option (#6916)Thomas Lively2024-09-101-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unlike other module elements, types are not stored on the `Module`. Instead, they are collected by traversing the IR before printing and binary writing. The code that collects the types tries to optimize the order of rec groups based on the number of times each type is used. As a result, the output order of types generally has no relation to the input order of types. In addition, most type optimizations rewrite the types into a single large rec group, and the order of types in that group is essentially arbitrary. Changes to the code for counting type uses, sorting types, or sorting rec groups can yield very large changes in the output order of types, producing test diffs that are hard to review and potentially harming the readability of tests by moving output types away from the corresponding input types. To help make test output more stable and readable, introduce a tool option that causes the order of output types to match the order of input types as closely as possible. It is implemented by having the parsers record the indices of the input types on the `Module` just like they already record the type names. The `GlobalTypeRewriter` infrastructure used by type optimizations associates the new types with the old indices just like it already does for names and also respects the input order when rewriting types into a large recursion group. By default, wasm-opt and other tools clear the recorded type indices after parsing the module, so their default behavior is not modified by this change. Follow-on PRs will use the new flag in more tests, which will generate large diffs but leave the tests in stable, more readable states that will no longer change due to other changes to the optimizing type sorting logic.
* Allow different arguments for multiple instances of a pass (#6687)Christian Speckner2024-07-151-2/+1
| | | | | | | | | | | | Each pass instance can now store an argument for it, which can be different. This may be a breaking change for the corner case of running a pass multiple times and setting the pass's argument multiple times as well (before, the last pass argument affected them all; now, it affects the last instance only). This only affects arguments with the name of a pass; others remain global, as before (and multiple passes can read them, in fact). See the CHANGELOG for details. Fixes #6646
* Allow --keepfuncs and --splitfuncs to be use alongside a profile data (#6322)Benjamin Ling2024-07-101-10/+34
| | | | | | | | | There are times after collecting a profile, we wish to manually include specific functions into the primary module. It could be due to non-deterministic profiling or functions for error scenarios (e.g. _trap). This PR helps to unlock this workflow by honoring both the `--keep-funcs` flag as well as the `--profile` flag
* [StackIR] Run StackIR during binary writing and not as a pass (#6568)Alon Zakai2024-05-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Previously we had passes --generate-stack-ir, --optimize-stack-ir, --print-stack-ir that could be run like any other passes. After generating StackIR it was stashed on the function and invalidated if we modified BinaryenIR. If it wasn't invalidated then it was used during binary writing. This PR switches things so that we optionally generate, optimize, and print StackIR only during binary writing. It also removes all traces of StackIR from wasm.h - after this, StackIR is a feature of binary writing (and printing) logic only. This is almost NFC, but there are some minor noticeable differences: 1. We no longer print has StackIR in the text format when we see it is there. It will not be there during normal printing, as it is only present during binary writing. (but --print-stack-ir still works as before; as mentioned above it runs during writing). 2. --generate/optimize/print-stack-ir change from being passes to being flags that control that behavior instead. As passes, their order on the commandline mattered, while now it does not, and they only "globally" affect things during writing. 3. The C API changes slightly, as there is no need to pass it an option "optimize" to the StackIR APIs. Whether we optimize is handled by --optimize-stack-ir which is set like other optimization flags on the PassOptions object, so we don't need the old option to those C APIs. The main benefit here is simplifying the code, so we don't need to think about StackIR in more places than just binary writing. That may also allow future improvements to our usage of StackIR.
* Support using JSPI to load the secondary wasm split module. (#5431)Brendan Dahl2023-01-201-10/+6
| | | | | | | | | | When using JSPI with wasm-split, any calls to secondary module functions will now first check a global to see if the module is loaded. If not loaded it will call a JSPI'ed function that will handle loading module. The setup is split into the JSPI pass and wasm-split tool since the JSPI pass is first run by emscripten and we need to JSPI'ify the load secondary module function. wasm-split then injects all the checks and calls to the load function.
* [wasm-split] Improve the error message for bad checksums (#5268)Thomas Lively2022-11-161-2/+2
| | | | The previous error message was ambiguous and could easily be interpreted to mean the opposite of what it meant.
* Make `Name` a pointer, length pair (#5122)Thomas Lively2022-10-111-1/+2
| | | | | | | | | | | | | | | | | | | | | | | With the goal of supporting null characters (i.e. zero bytes) in strings. Rewrite the underlying interned `IString` to store a `std::string_view` rather than a `const char*`, reduce the number of map lookups necessary to intern a string, and present a more immutable interface. Most importantly, replace the `c_str()` method that returned a `const char*` with a `toString()` method that returns a `std::string`. This new method can correctly handle strings containing null characters. A `const char*` can still be had by calling `data()` on the `std::string_view`, although this usage should be discouraged. This change is NFC in spirit, although not in practice. It does not intend to support any particular new functionality, but it is probably now possible to use strings containing null characters in at least some cases. At least one parser bug is also incidentally fixed. Follow-on PRs will explicitly support and test strings containing nulls for particular use cases. The C API still uses `const char*` to represent strings. As strings containing nulls become better supported by the rest of Binaryen, this will no longer be sufficient. Updating the C and JS APIs to use pointer, length pairs is left as future work.
* Refactor interaction between Pass and PassRunner (#5093)Thomas Lively2022-09-301-2/+4
| | | | | | | | | | | | | | Previously only WalkerPasses had access to the `getPassRunner` and `getPassOptions` methods. Move those methods to `Pass` so all passes can use them. As a result, the `PassRunner` passed to `Pass::run` and `Pass::runOnFunction` is no longer necessary, so remove it. Also update `Pass::create` to return a unique_ptr, which is more efficient than having it return a raw pointer only to have the `PassRunner` wrap that raw pointer in a `unique_ptr`. Delete the unused template `PassRunner::getLast()`, which looks like it was intended to enable retrieving previous analyses and has been in the code base since 2015 but is not implemented anywhere.
* Multi-Memories wasm-split (#4977)Ashley Nelson2022-09-151-1/+10
| | | Adds an --in-secondary-memory switch to the wasm-split tool that allows profile data to be stored in a separate memory from module main memory. With this option, users do not need to reserve the initial memory region for profile data and the data can be shared between multiple threads.
* [wasm-split] Add --print-profile option (#4771)sps-gold2022-07-251-18/+91
| | | | | | | | | | | | | | | | | | | | | | | There are several reasons why a function may not be trained in deterministically. So to perform quick validation we need to inspect profile.data (another ways requires split to be performed). However as profile.data is a binary file and is not self sufficient, so we cannot currently use it to perform such validation. Therefore to allow quick check on whether a particular function has been trained in, we need to dump profile.data in a more readable format. This PR, allows us to output, the list of functions to be kept (in main wasm) and those split functions (to be moved to deferred.wasm) in a readable format, to console. Added a new option `--print-profile` - input path to orig.wasm (its the original wasm file that will be used later during split) - input path to profile.data that we need to output optionally pass `--unescape` to unescape the function names Usage: ``` binaryen\build>bin\wasm-split.exe test\profile_data\MY.orig.wasm --print-profile=test\profile_data\profile.data > test\profile_data\out.log ``` note: meaning of prefixes `+` => fn to be kept in main wasm `-` => fn to be split and moved to deferred wasm
* [wasm-split] Add an --asyncify option (#4513)Thomas Lively2022-02-091-0/+10
| | | | | | | Add an option for running the asyncify transformation on the primary module emitted by wasm-split. The idea is that the placeholder functions should be able to unwind the stack while the secondary module is asynchronously loaded, then once the placeholder functions have been patched out by the secondary module the stack should be rewound and end up in the correct secondary function.
* Modernize code to C++17 (#3104)Max Graey2021-11-221-2/+2
|
* [wasm-split] Disallow mixing --profile, --keep-funcs, and --split-funcs (#4187)Thomas Lively2021-09-241-21/+24
| | | | | | | | | | | | | Previously the set of functions to keep was initially empty, then the profile added new functions to keep, then the --keep-funcs functions were added, then the --split-funcs functions were removed. This method of composing these different options was arbitrary and not necessarily intuitive, and it prevented reasonable workflows from working. For example, providing only a --split-funcs list would result in all functions being split out not matter which functions were listed. To make the behavior of these options, and --split-funcs in particular, more intuitive, disallow mixing them and when --split-funcs is used, split out only the listed functions.
* [wasm-split] Add an option for recording profile data in memory (#4120)Thomas Lively2021-09-031-1/+1
| | | | | | | | | | | | | | | | To avoid requiring a static memory allocation, wasm-split's instrumentation defaults to recording profile data in Wasm globals. This causes problems for multithreaded applications because the globals are thread-local, but it is not always feasible to arrange for a separate profile to be dumped on each thread. To simplify the profiling of such multithreaded applications, add a new instrumentation mode that stores the profiling data in shared memory instead of in globals. This allows a single profile to be written that correctly reflects the called functions on all threads. This new mode is not on by default because it requires users to ensure that the program will not trample the in-memory profiling data. The data is stored beginning at address zero and occupies one byte per declared function in the instrumented module. Emscripten can be told to leave this memory free using the GLOBAL_BASE option.
* [NFC] Split wasm-split into multiple files (#4119)Thomas Lively2021-09-031-0/+396
As wasm-split has gained new functionality, its implementation file has become large. In preparation for adding even more functionality, split the existing implementation across multiple files in a new tools/wasm-split subdirectory.