| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
When IRBuilder builds an empty non-block scope such as a function body,
an if arm, a try block, etc, it needs to produce some expression to
represent the empty contents. Previously it produced a nop, but change
it to produce an empty block instead. The binary writer and printer have
special logic to elide empty blocks, so this produces smaller output.
Update J2CLOpts to recognize functions containing empty blocks as
trivial to avoid regressing one of its tests.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new text parser is faster and more standards compliant than the old text
parser. Enable it by default in wasm-opt and update the tests to reflect the
slightly different results it produces. Besides following the spec, the new
parser differs from the old parser in that it:
- Does not synthesize `loop` and `try` labels unnecessarily
- Synthesizes different block names in some cases
- Parses exports in a different order
- Parses `nop`s instead of empty blocks for empty control flow arms
- Does not support parsing Poppy IR
- Produces different error messages
- Cannot parse `pop` except as the first instruction inside a `catch`
|
|
|
|
|
|
|
|
| |
The new parser enforces the rule that imports must come before declarations
(except for type declarations). The old parser does not enforce this rule, so
many of our tests did not follow it. Fix them to follow that rule and fix other
invalid syntax. Also add missing finalization of Load expressions in
wasm-builder.h that was causing a test to fail under the new parser and guard
against an error case in wasm-ir-builder.cpp that used to cause a segfault.
|
|
|
|
|
|
|
|
|
| |
The new wat parser is much more strict than the legacy wat parser; the latter
accepts all sorts of things that the spec does not allow. To ease an eventual
transition to using the new wat parser by default, update the tests to use the
standard text format in many places where they previously did not. We do not yet
have a way to prevent new errors from being introduced into the test suite, but
at least there will now be many fewer errors when it comes time to make the
switch.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove hardcoded paths for globals/functions/etc. in favor of general code
paths that support all the module elements uniformly. As a result of that, we
now support all parts of wasm, such as tables and element segments, that
we didn't before.
This refactoring is NFC aside from adding functionality. Note that this reduces
the size of wasm-metadce by 10% while increasing its functionality - the
benefits of writing generic code.
To support this, add some trivial generic helpers to get or iterate over module
elements using their kind in a dynamic manner. Using them might make
wasm-metadce slightly slower, but I can't measure any difference.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Avoid adding suffixes when we don't need them to keep names unique.
As background, the suffixes are not used by emcc at all, so they are just
for internal use in the tool. How that works is that metadce gets as input
the list of things the user cares about, with names for them, so it knows
the proper names to give imports and exports, and makes up names for
other things. Those made up names will not be read by the user, so we
can make them prettier as this PR does without breaking anything.
The main benefit of this PR is to make debugging easier.
|
|
|
| |
That optimization uncovered some LLVM and Binaryen bugs.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we export a function that just calls another function, we can export that one
instead. Then the one in the middle may be unused,
function foo() {
return bar();
}
export foo; // can be an export of bar
This saves a few bytes in rare cases, but probably more important is that it saves
the trampoline, so if this is on a hot path, we save a call.
Context: emscripten-core/emscripten#20478 (comment)
In general this is not needed as inlining helps us out by inlining foo() into the
caller (since foo is tiny, that always ends up happening). But exports are a case
the inliner cannot handle, so we do it here.
|
|
|
|
|
|
|
|
|
|
|
| |
This PR is part of a series that adds basic support for the typed continuations proposal.
This PR relaxes the restriction that tags must not have results , only params. Tags with
results must not be used for exception handling and are only allowed if the typed
continuations feature is enabled.
As a minor point, this PR also changes the printing of tags without params: To make the
presentation consistent, (param) is omitted when printing a tag.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When printing Binaryen IR, we previously generated names for unnamed heap types
based on their structure. This was useful for seeing the structure of simple
types at a glance without having to separately go look up their definitions, but
it also had two problems:
1. The same name could be generated for multiple types. The generated names did
not take into account rec group structure or finality, so types that differed
only in these properties would have the same name. Also, generated type names
were limited in length, so very large types that shared only some structure
could also end up with the same names. Using the same name for multiple types
produces incorrect and unparsable output.
2. The generated names were not useful beyond the most trivial examples. Even
with length limits, names for nontrivial types were extremely long and visually
noisy, which made reading disassembled real-world code more challenging.
Fix these problems by emitting simple indexed names for unnamed heap types
instead. This regresses readability for very simple examples, but the trade off
is worth it.
This change also reduces the number of type printing systems we have by one.
Previously we had the system in Print.cpp, but we had another, more general and
extensible system in wasm-type-printing.h and wasm-type.cpp as well. Remove the
old type printing system from Print.cpp and replace it with a much smaller use
of the new system. This requires significant refactoring of Print.cpp so that
PrintExpressionContents object now holds a reference to a parent
PrintSExpression object that holds the type name state.
This diff is very large because almost every test output changed slightly. To
minimize the diff and ease review, change the type printer in wasm-type.cpp to
behave the same as the old type printer in Print.cpp except for the differences
in name generation. These changes will be reverted in much smaller PRs in the
future to generally improve how types are printed.
|
|
|
|
| |
(#5791)
|
|
|
|
| |
The function type should be printed there just like for non-imported
functions.
|
|
|
|
|
|
|
|
|
|
| |
All top-level Module elements are identified and referred to by Name, but for
historical reasons element and data segments were referred to by index instead.
Fix this inconsistency by using Names to refer to segments from expressions that
use them. Also parse and print segment names like we do for other elements.
The C API is partially converted to use names instead of indices, but there are
still many functions that refer to data segments by index. Finishing the
conversion can be done in the future once it becomes necessary.
|
|
|
|
|
|
|
|
|
|
| |
Previously we treated global.get as a constant expression and only
additionally verified that the target globals were immutable in some cases. But
global.get of a mutable global is never a constant expression, and further,
only imported globals are available in constant expressions unless GC is
enabled.
Fix constant expression validation to only allow global.get of immutable,
imported globals, and fix all the invalid tests.
|
|
|
|
|
|
|
|
|
|
| |
This makes Binaryen's default type system match the WasmGC spec.
Update the way type definitions without supertypes are printed to reduce the
output diff for MVP tests that do not involve WasmGC. Also port some
type-builder.cpp tests from test/example to test/gtest since they needed to be
rewritten to work with isorecursive type anyway.
A follow-on PR will remove equirecursive types completely.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Updating wasm.h/cpp for DataSegments
* Updating wasm-binary.h/cpp for DataSegments
* Removed link from Memory to DataSegments and updated module-utils, Metrics and wasm-traversal
* checking isPassive when copying data segments to know whether to construct the data segment with an offset or not
* Removing memory member var from DataSegment class as there is only one memory rn. Updated wasm-validator.cpp
* Updated wasm-interpreter
* First look at updating Passes
* Updated wasm-s-parser
* Updated files in src/ir
* Updating tools files
* Last pass on src files before building
* added visitDataSegment
* Fixing build errors
* Data segments need a name
* fixing var name
* ran clang-format
* Ensuring a name on DataSegment
* Ensuring more datasegments have names
* Adding explicit name support
* Fix fuzzing name
* Outputting data name in wasm binary only if explicit
* Checking temp dataSegments vector to validateBinary because it's the one with the segments before we processNames
* Pass on when data segment names are explicitly set
* Ran auto_update_tests.py and check.py, success all around
* Removed an errant semi-colon and corrected a counter. Everything still passes
* Linting
* Fixing processing memory names after parsed from binary
* Updating the test from the last fix
* Correcting error comment
* Impl kripken@ comments
* Impl tlively@ comments
* Updated tests that remove data print when == 0
* Ran clang format
* Impl tlively@ comments
* Ran clang-format
|
|
|
|
|
|
| |
This adds support for tag-using instructions (`throw` and `catch`) to
wasm-metadce. We had to use a hacky workaround in
emscripten-core/emscripten#15266 because of the lack of this support;
after this lands we can remove it.
|
|
|
|
|
|
|
|
| |
This attribute is always 0 and reserved for future use. In Binayren's
unofficial text format we were writing this field as `(attr 0)`, but we
have recently come to the conclusion that this is not necessary.
Relevant discussion:
https://github.com/WebAssembly/exception-handling/pull/160#discussion_r653254680
|
|
|
|
|
|
|
|
|
|
|
| |
We recently decided to change 'event' to 'tag', and to 'event section'
to 'tag section', out of the rationale that the section contains a
generalized tag that references a type, which may be used for something
other than exceptions, and the name 'event' can be confusing in the web
context.
See
- https://github.com/WebAssembly/exception-handling/issues/159#issuecomment-857910130
- https://github.com/WebAssembly/exception-handling/pull/161
|
|
|
|
|
|
|
|
|
| |
As found in #3682, the current implementation of type ordering is not correct,
and although the immediate issue would be easy to fix, I don't think the current
intended comparison algorithm is correct in the first place. Rather than try to
switch to using a correct algorithm (which I am not sure I know how to
implement, although I have an idea) this PR removes Type ordering entirely. In
places that used Type ordering with std::set or std::map because they require
deterministic iteration order, this PR uses InsertOrdered{Set,Map} instead.
|
|
|
|
|
|
|
|
|
|
|
| |
We used to print active element segments right after corresponding
tables, and passive segments came after those. We didn't print internal
segment names, and empty segments weren't being printed at all. This
meant that there was no way for instructions to refer to those table
segments after round tripping.
This will fix those issues by printing segments in the order they were
defined, including segment names when necessary and not omitting
empty segments anymore.
|
|
|
|
|
|
|
|
| |
The passive keyword has been removed from spec's text format, and now
any data segment that doesn't have an offset is considered as passive.
This PR remove that from both parser and the Print pass, plus all tests
that used that syntax.
Fixes #2339
|
| |
|
|
|
| |
SExpressionWasmBuilder was not applying default memory and table import names on the memory and table, unlike on functions, globals and events, where it applies them. Also aligns default import names to use the same shorter forms as in binary parsing.
|
|
|
|
|
|
|
|
| |
`BinaryIndexes` was only used in two places (Print.cpp and
wasm-binary.h), so it didn't seem to be a great fit for
module-utils.h. This change moves it to wasm-binary.h and removes its
usage in Print.cpp. This means that function indexes are no longer
printed, but those were of limited utility and were the source of
annoying noise when updating tests, anyway.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Function signatures were previously redundantly stored on Function
objects as well as on FunctionType objects. These two signature
representations had to always be kept in sync, which was error-prone
and needlessly complex. This PR takes advantage of the new ability of
Type to represent multiple value types by consolidating function
signatures as a pair of Types (params and results) stored on the
Function object.
Since there are no longer module-global named function types,
significant changes had to be made to the printing and emitting of
function types, as well as their parsing and manipulation in various
passes.
The C and JS APIs and their tests also had to be updated to remove
named function types.
|
|
|
|
|
|
|
|
|
| |
This is the start of a larger refactoring to remove FunctionType entirely and
store types and signatures directly on the entities that use them. This PR
updates BrOnExn and Events to remove their use of FunctionType and makes the
BinaryWriter traverse the module and collect types rather than using the global
FunctionType list. While we are collecting types, we also sort them by frequency
as an optimization. Remaining uses of FunctionType in Function, CallIndirect,
and parsing will be removed in a future PR.
|
|
|
|
|
|
| |
Sometimes wasm-metadce is the last tool to run over a binary in
Emscripten, and in that case it needs to know what features are
enabled in order to emit a valid binary. For example it needs to know
whether to emit a data count section.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for the event and the event section, as specified in
https://github.com/WebAssembly/exception-handling/blob/master/proposals/Exceptions.md#changes-to-the-binary-model.
Wasm events are features that suspend the current execution and transfer
the control flow to a corresponding handler. Currently the only
supported event kind is exceptions.
For events, this includes support for
- Binary file reading/writing
- Wast file reading/writing
- Binaryen.js API
- Fuzzer
- Validation
- Metadce
- Passes: metrics, minify-imports-and-exports,
remove-unused-module-elements
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Refactored & fixed typeuse parsing rules so now the rules more closely
follow the spec. There have been multiple parsing rules that were
different in subtle ways, which are supposed to be the same according
to the spec.
- Duplicate types, i.e., types with the same signature, in the type
section are allowed as long as they don't have the same given name.
If a name is given, we use it; if type name is not given, we
generate one in the form of `$FUNCSIG$` + signature string. If the
same generated name already exists in the type section, we append
`_` at the end. This causes most of the changes in the autogenerated
type names in test outputs.
- A typeuse has to be in the order of (type) -> (param) -> (result),
if more than one of them exist. In case of function definitions,
(local) has to be after all of these. Fixed some test cases that
violate this rule.
- When only (param)/(result) are given, its type will be the type with
the smallest existing type index whose parameter and result are the
same. If there's no such type, a new type will be created and
inserted.
- Added a test case `duplicate_types.wast` to test type namings for
duplicate types.
- Refactored `parseFunction` function.
- Add more overrides to helper functions: `getSig` and
`ensureFunctionType`.
|
|
|
|
|
|
| |
Automated renaming according to
https://github.com/WebAssembly/spec/issues/884#issuecomment-426433329.
|
|
|
|
|
| |
That is the correct order in the text format, wabt errors otherwise.
See AssemblyScript/assemblyscript#310
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes #1649
This moves us to a single object for functions, which can be imported or nor, and likewise for globals (as a result, GetGlobals do not need to check if the global is imported or not, etc.). All imported things now inherit from Importable, which has the module and base of the import, and if they are set then it is an import.
For convenient iteration, there are a few helpers like
ModuleUtils::iterDefinedGlobals(wasm, [&](Global* global) {
.. use global ..
});
as often iteration only cares about imported or defined (non-imported) things.
|
|
|
| |
We can remove the memory/table (itself, or an import if imported) if they are not used. This is pretty minor on a large wasm file, but when reading small wasts it's very noticeable to have an unused memory and table all the time.
|
|
|
|
|
|
|
|
|
|
|
|
| |
* ignore missing imports (the wasm may have already had them optimized out)
* handle segments that hold on to globals (root them, for now, as we can't remove segments)
* run reorder-functions, as the optimal order may have changed after we dce
* fix global, global init, and segment offset reachability
* fix import rooting and processing - imports may be imported more than once
|
|
This adds a new tool for better dead code elimination. The problem this helps overcome is when the wasm module is part of something larger, like a wasm+JS combination, and therefore doing DCE in either one is not sufficient as it can't remove a cycle spanning the wasm and JS worlds. Concretely, when binaryen performs DCE by itself, it can never remove an export, because it considers those roots - but in the larger ("meta") space outside, they may actually be removable.
To solve that, this tool receives a description of the outside graph (in very abstract form), including which nodes are roots. It then adds to that graph nodes from the wasm, so that we have a single graph representing the entire space (the outside + wasm + connections between them). It then performs DCE, finding what is not reachable from the roots, and cleaning it up from the wasm. It of course can't clean up things from the outside, since all it has is the abstract representation of those things in the graph, but it prints out the ids of the removable nodes, which an outside tool can use.
This tool is written in as general a way as possible, hopefully it can have multiple uses. The use I have in mind is to write something in emscripten that uses this to DCE the JS+wasm combination that we emit.
|