| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Update Builder and IRBuilder makeStructGet and makeStructSet functions
to require the memory order to be explicitly supplied. This is slightly
more verbose, but will reduce the chances that we forget to properly
consider synchronization when implementing new features in the future.
|
|
|
|
|
|
|
|
|
|
| |
The UBSan builder started failing with an error about a misaligned store
in wasm-ctor-eval.cpp. The store was already done via `memcpy` to avoid
alignment issues, but apparently this is no longer enough. Use `void*`
as the destination type to further avoid giving the impression of
guaranteed alignment.
Also fix UB when executing std::abs on minimum negative integers in
literal.cpp.
|
|
|
| |
See https://github.com/WebAssembly/memory64/pull/92
|
|
|
| |
This allows 64-bit bounds checking to work properly.
|
|
|
|
| |
Some places assumed a 32-bit index.
|
|
|
|
|
|
|
| |
The old code manually managed it for no good reason that I can see.
After this, there is no difference between callFunction and callFunctionInternal,
so fold them together.
|
|
|
|
|
|
|
|
|
| |
To avoid having two separate topological sort utilities in the code
base, replace remaining uses of the old DFS-based, CRTP topological sort
with the newer Kahn's algorithm implementation.
This would be NFC, except that the new topological sort produces a
different order than the old topological sort, so the output of some
passes is reordered.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unlike other module elements, types are not stored on the `Module`.
Instead, they are collected by traversing the IR before printing and
binary writing. The code that collects the types tries to optimize the
order of rec groups based on the number of times each type is used. As a
result, the output order of types generally has no relation to the input
order of types. In addition, most type optimizations rewrite the types
into a single large rec group, and the order of types in that group is
essentially arbitrary. Changes to the code for counting type uses,
sorting types, or sorting rec groups can yield very large changes in the
output order of types, producing test diffs that are hard to review and
potentially harming the readability of tests by moving output types away
from the corresponding input types.
To help make test output more stable and readable, introduce a tool
option that causes the order of output types to match the order of input
types as closely as possible. It is implemented by having the parsers
record the indices of the input types on the `Module` just like they
already record the type names. The `GlobalTypeRewriter` infrastructure
used by type optimizations associates the new types with the old indices
just like it already does for names and also respects the input order
when rewriting types into a large recursion group.
By default, wasm-opt and other tools clear the recorded type indices
after parsing the module, so their default behavior is not modified by
this change.
Follow-on PRs will use the new flag in more tests, which will generate
large diffs but leave the tests in stable, more readable states that
will no longer change due to other changes to the optimizing type
sorting logic.
|
|
|
|
| |
This will allow both the old and new topological sort utilities to be
included into the same .cpp file while we phase out the old utility.
|
|
|
|
|
| |
Audit the remaining ocurrences of `== HeapType::` and fix those that did
not handle shared types correctly. Add tests for some of the fixes;
others are NFC but clarify the code.
|
|
|
|
|
| |
Also use TableInit in the interpreter to initialize module's table
state, which will now handle traps properly, fixing #6431
|
|
|
|
|
|
|
|
|
| |
PR ##6803 proposed removing Type::isString and HeapType::isString in
favor of more explicit, verbose callsites. There was no consensus to
make this change, but it was accidentally committed as part of #6804.
Revert the accidental change, except for the useful, noncontroversial
parts, such as fixing the `isString` implementation and a few other
locations to correctly handle shared types.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The HeapType API has functions like `isBasic()`, `isStruct()`,
`isSignature()`, etc. to test the classification of a heap type. Many
users have to call these functions in sequence and handle all or most of
the possible classifications. When we add a new kind of heap type,
finding and updating all these sites is a manual and error-prone
process.
To make adding new heap type kinds easier, introduce a new API that
returns an enum classifying the heap type. The enum can be used in
switch statements and the compiler's exhaustiveness checker will flag
use sites that need to be updated when we add a new kind of heap type.
This commit uses the new enum internally in the type system, but
follow-on commits will add new uses and convert uses of the existing
APIs to use `getKind` instead.
|
|
|
|
|
|
|
|
|
| |
Rename instructions `extern.internalize` into `any.convert_extern` and
`extern.externalize` into `extern.convert_any` to follow more closely
the spec. This was changed in
https://github.com/WebAssembly/gc/issues/432.
The legacy name is still accepted in text inputs and in the C and JS
APIs.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The stringview types from the stringref proposal have three irregularities that
break common invariants and require pervasive special casing to handle properly:
they are supertypes of `none` but not subtypes of `any`, they cannot be the
targets of casts, and they cannot be used to construct nullable references. At
the same time, the stringref proposal has been superseded by the imported
strings proposal, which does not have these irregularities. The cost of
maintaing and improving our support for stringview types is no longer worth the
benefit of supporting them.
Simplify the code base by entirely removing the stringview types and related
instructions that do not have analogues in the imported strings proposal and do
not make sense in the absense of stringviews.
Three remaining instructions, `stringview_wtf16.get_codeunit`,
`stringview_wtf16.slice`, and `stringview_wtf16.length` take stringview operands
in the stringref proposal but cannot be removed because they lower to operations
from the imported strings proposal. These instructions are changed to take
stringref operands in Binaryen IR, and to allow a graceful upgrade path for
users of these instructions, the text and binary parsers still accept but ignore
`string.as_wtf16`, which is the instruction used to convert stringrefs to
stringviews. The binary writer emits code sequences that use scratch locals and `string.as_wtf16` to keep the output valid.
Future PRs will further align binaryen with the imported strings proposal
instead of the stringref proposal, for example by making `string` a subtype of
`extern` instead of a subtype of `any` and by removing additional instructions
that do not have analogues in the imported strings proposal.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we had passes --generate-stack-ir, --optimize-stack-ir, --print-stack-ir
that could be run like any other passes. After generating StackIR it was stashed on
the function and invalidated if we modified BinaryenIR. If it wasn't invalidated then
it was used during binary writing. This PR switches things so that we optionally
generate, optimize, and print StackIR only during binary writing. It also removes
all traces of StackIR from wasm.h - after this, StackIR is a feature of binary writing
(and printing) logic only.
This is almost NFC, but there are some minor noticeable differences:
1. We no longer print has StackIR in the text format when we see it is there. It
will not be there during normal printing, as it is only present during binary writing.
(but --print-stack-ir still works as before; as mentioned above it runs during writing).
2. --generate/optimize/print-stack-ir change from being passes to being flags that
control that behavior instead. As passes, their order on the commandline mattered,
while now it does not, and they only "globally" affect things during writing.
3. The C API changes slightly, as there is no need to pass it an option "optimize" to
the StackIR APIs. Whether we optimize is handled by --optimize-stack-ir which is
set like other optimization flags on the PassOptions object, so we don't need the
old option to those C APIs.
The main benefit here is simplifying the code, so we don't need to think about
StackIR in more places than just binary writing. That may also allow future
improvements to our usage of StackIR.
|
|
|
|
| |
precompute (#6561)
|
|
|
|
|
|
|
|
| |
The new wat parser currently considers itself to be at the end of the file
whenever it cannot lex another token. This is not quite right, but fixing it
causes parser errors because of the extra null character we were appending to
files when we read them. This null character is not useful since we can already
read files as `std::string`, which always has an implicit null character, so
remove it. Clean up some users of `read_file` while we're at it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a combined commit covering multiple PRs fixing the handling of return
calls in different areas. The PRs are all landed as a single commit to ensure
internal consistency and avoid problems with bisection.
Original PR descriptions follow:
* Fix inlining of `return_call*` (#6448)
Previously we transformed return calls in inlined function bodies into normal
calls followed by branches out to the caller code. Similarly, when inlining a
`return_call` callsite, we simply added a `return` after the body inlined at the
callsite. These transformations would have been correct if the semantics of
return calls were to call and then return, but they are not correct for the
actual semantics of returning and then calling.
The previous implementation is observably incorrect for return calls inside try
blocks, where the previous implementation would run the inlined body within the
try block, but the proper semantics would be to run the inlined body outside the
try block.
Fix the problem by transforming inlined return calls to branches followed by
calls rather than as calls followed by branches. For the case of inlined return
call callsites, insert branches out of the original body of the caller and
inline the body of the callee as a sibling of the original caller body. For the
other case of return calls appearing in inlined bodies, translate the return
calls to branches out to calls inserted as siblings of the original inlined
body.
In both cases, it would have been convenient to use multivalue block return to
send call parameters along the branches to the calls, but unfortunately in our
IR that would have required tuple-typed scratch locals to unpack the tuple of
operands at the call sites. It is simpler to just use locals to propagate the
operands in the first place.
* Fix interpretation of `return_call*` (#6451)
We previously interpreted return calls as calls followed by returns, but that is
not correct both because it grows the size of the execution stack and because it
runs the called functions in the wrong context, which can be observable in the
case of exception handling.
Update the interpreter to handle return calls correctly by adding a new
`RETURN_CALL_FLOW` that behaves like a return, but carries the arguments and
reference to the return-callee rather than normal return values.
`callFunctionInternal` is updated to intercept this flow and call return-called
functions in a loop until a function returns with some other kind of flow.
Pull in the upstream spec tests return_call.wast, return_call_indirect.wast, and
return_call_ref.wast with light editing so that we parse and validate them
successfully.
* Handle return calls in wasm-ctor-eval (#6464)
When an evaluated export ends in a return call, continue evaluating the
return-called function. This requires propagating the parameters, handling the
case that the return-called function might be an import, and fixing up local
indices in case the final function has different parameters than the original
function.
* Update effects.h to handle return calls correctly (#6470)
As far as their surrounding code is concerned return calls are no different from
normal returns. It's only from a caller's perspective that a function containing
a return call also has the effects of the return-callee. To model this more
precisely in EffectAnalyzer, stash the throw effect of return-callees on the
side and only merge it in at the end when analyzing the effects of a full
function body.
|
|
|
|
|
|
|
| |
#6244 tried to do this but was not quite right. It treated a string like an array
or a struct, which means create a global for it. But just creating a global isn't
enough, as it needs to also be sorted in the right place etc. which requires
changes in other places. But there is a much simpler solution here: string
constants are just constants, which we can emit in-line, so do that.
|
| |
|
|
|
|
|
|
|
|
| |
Two trivial places did not handle that case, and assumed an exported function
was actually defined (and not imported).
Also add some const stuff to fix compilation after this change.
This was discovered by #6026
|
|
|
|
|
|
| |
In practice we don't need high addresses, and when they happen the current
implementation can OOM, so exit early on them instead.
Fixes #5893
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A cycle of data is something we can't just naively emit as wasm globals. If
at runtime we end up, for example, with an object A that refers to itself,
then we can't just emit
(global $A
(struct.new $A
(global.get $A)))
The struct.get is of this very global, and such a self-reference is invalid. So
we need to break such cycles as we emit them. The simple idea used here
is to find paths in the cycle that are nullable and mutable, and replace the
initial value with a null that is fixed up later in the start function:
(global $A
(struct.new $A
(ref.null $A)))
(func $start
(struct.set
(global.get $A)
(global.get $A)))
)
This is not optimal in terms of breaking cycles, but it is fast (linear time)
and simple, and does well in practice on j2wasm (where cycles in fact
occur).
|
|
|
|
|
|
|
|
|
|
| |
This fixes wasm-ctor-eval on evalling a GC data structure that contains a
field initialized with an externalized value.
Per the spec this is a constant instruction and I verified that V8 allows this.
Also add missing validation in wasm-ctor-eval of the output (which makes
debugging this kind of thing a little easier).
|
| |
|
|
|
|
|
|
|
|
| |
To match the standard instruction name, rename the expression class without
changing any parsing or printing behavior. A follow-on PR will take care of the
functional side of this change while keeping support for parsing the old name.
This change will allow `ArrayInit` to be used as the expression class for the
upcoming `array.init_data` and `array.init_elem` instructions.
|
|
|
|
|
|
|
|
|
|
| |
Before, a single ctor with GC worked, but any subsequent ones simply dropped
the globals from the previous ones, because we were missing an addGlobal in
an important place.
Also, we can get confused about which global names are in use in the module, so fix
that as well by storing them directly (we keep removing and re-adding globals, so
we can't use the normal module mechanism to find which names are in use).
|
|
|
|
| |
Until we get full support for serializing table changes, stop evalling so we do
not break things.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
(#5510)
Simply loop over the values and use tuple.make.
This also adds a lit test for ctor-eval. I found that the problem blocking us
before was the logging, which confuses the update script. As this test at
least does not require that logging, this PR adds a --quiet flag that
disables the logging, and then a lit test just works.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the goal of supporting null characters (i.e. zero bytes) in strings.
Rewrite the underlying interned `IString` to store a `std::string_view` rather
than a `const char*`, reduce the number of map lookups necessary to intern a
string, and present a more immutable interface.
Most importantly, replace the `c_str()` method that returned a `const char*`
with a `toString()` method that returns a `std::string`. This new method can
correctly handle strings containing null characters. A `const char*` can still
be had by calling `data()` on the `std::string_view`, although this usage should
be discouraged.
This change is NFC in spirit, although not in practice. It does not intend to
support any particular new functionality, but it is probably now possible to use
strings containing null characters in at least some cases. At least one parser
bug is also incidentally fixed. Follow-on PRs will explicitly support and test
strings containing nulls for particular use cases.
The C API still uses `const char*` to represent strings. As strings containing
nulls become better supported by the rest of Binaryen, this will no longer be
sufficient. Updating the C and JS APIs to use pointer, length pairs is left as
future work.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These types, `none`, `nofunc`, and `noextern` are uninhabited, so references to
them can only possibly be null. To simplify the IR and increase type precision,
introduce new invariants that all `ref.null` instructions must be typed with one
of these new bottom types and that `Literals` have a bottom type iff they
represent null values. These new invariants requires several additional changes.
First, it is now possible that the `ref` or `target` child of a `StructGet`,
`StructSet`, `ArrayGet`, `ArraySet`, or `CallRef` instruction has a bottom
reference type, so it is not possible to determine what heap type annotation to
emit in the binary or text formats. (The bottom types are not valid type
annotations since they do not have indices in the type section.)
To fix that problem, update the printer and binary emitter to emit unreachables
instead of the instruction with undetermined type annotation. This is a valid
transformation because the only possible value that could flow into those
instructions in that case is null, and all of those instructions trap on nulls.
That fix uncovered a latent bug in the binary parser in which new unreachables
within unreachable code were handled incorrectly. This bug was not previously
found by the fuzzer because we generally stop emitting code once we encounter an
instruction with type `unreachable`. Now, however, it is possible to emit an
`unreachable` for instructions that do not have type `unreachable` (but are
known to trap at runtime), so we will continue emitting code. See the new
test/lit/parse-double-unreachable.wast for details.
Update other miscellaneous code that creates `RefNull` expressions and null
`Literals` to maintain the new invariants as well.
|
|
|
| |
Replacing Fatal() call sites in src/shell-interface.h & src/tools/wasm-ctor-eval.cpp that were added in the Multi-Memories PR with assert()
|
|
|
|
|
|
|
| |
This PR removes the single memory restriction in IR, adding support for a single module to reference multiple memories. To support this change, a new memory name field was added to 13 memory instructions in order to identify the memory for the instruction.
It is a goal of this PR to maintain backwards compatibility with existing text and binary wasm modules, so memory indexes remain optional for memory instructions. Similarly, the JS API makes assumptions about which memory is intended when only one memory is present in the module. Another goal of this PR is that existing tests behavior be unaffected. That said, tests must now explicitly define a memory before invoking memory instructions or exporting a memory, and memory names are now printed for each memory instruction in the text format.
There remain quite a few places where a hardcoded reference to the first memory persist (memory flattening, for example, will return early if more than one memory is present in the module). Many of these call-sites, particularly within passes, will require us to rethink how the optimization works in a multi-memories world. Other call-sites may necessitate more invasive code restructuring to fully convert away from relying on a globally available, single memory pointer.
|
|
|
|
|
|
|
| |
RTTs were removed from the GC spec and if they are added back in in the future,
they will be heap types rather than value types as in our implementation.
Updating our implementation to have RTTs be heap types would have been more work
than deleting them for questionable benefit since we don't know how long it will
be before they are specced again.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Updating wasm.h/cpp for DataSegments
* Updating wasm-binary.h/cpp for DataSegments
* Removed link from Memory to DataSegments and updated module-utils, Metrics and wasm-traversal
* checking isPassive when copying data segments to know whether to construct the data segment with an offset or not
* Removing memory member var from DataSegment class as there is only one memory rn. Updated wasm-validator.cpp
* Updated wasm-interpreter
* First look at updating Passes
* Updated wasm-s-parser
* Updated files in src/ir
* Updating tools files
* Last pass on src files before building
* added visitDataSegment
* Fixing build errors
* Data segments need a name
* fixing var name
* ran clang-format
* Ensuring a name on DataSegment
* Ensuring more datasegments have names
* Adding explicit name support
* Fix fuzzing name
* Outputting data name in wasm binary only if explicit
* Checking temp dataSegments vector to validateBinary because it's the one with the segments before we processNames
* Pass on when data segment names are explicitly set
* Ran auto_update_tests.py and check.py, success all around
* Removed an errant semi-colon and corrected a counter. Everything still passes
* Linting
* Fixing processing memory names after parsed from binary
* Updating the test from the last fix
* Correcting error comment
* Impl kripken@ comments
* Impl tlively@ comments
* Updated tests that remove data print when == 0
* Ran clang format
* Impl tlively@ comments
* Ran clang-format
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This ended up simpler than I thought. We can simply emit global and
local data as we go, creating globals as necessary to contain GC data,
and referring to them using global.get later. That will ensure that
data identity works (things referring to the same object in the interpreter
will refer to the same object when the wasm is loaded). In more detail,
each live GC item is created in a "defining global", a global that is
immutable and of the precise type of that data. Then we just read from
that location in any place that wants to refer to that data. That is,
something like
function foo() {
var x = Bar(10);
var y = Bar(20);
var z = x;
z.value++; // first object now contains 11
...
}
will be evalled into something like
var define$0 = Bar(11); // note the ++ has taken effect here
var define$1 = Bar(20);
function foo() {
var x = define$0;
var y = define$1;
var z = define$0;
...
}
This PR should handle everything but "cycles", that is, GC data that at
runtime ends up forming a loop. Leaving that for later work (not sure
how urgent it is to fix).
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GlobalManager is another class that added complexity in the interpreter logic,
and did not help. In fact it hurts extensibility, as when one wants to extend the
interpreter one has another class to customize, and it is templated on the main
runner, so again as #4479 we end up with annoying template cycles.
This simply removes that class. That makes the interpreter code strictly
simpler. Applying that change to wasm-ctor-eval also ends up fixing a
pre-existing bug, so this PR gets testing through that.
The ctor-eval issue was that we did not extend the GlobalManager properly
in the past: we checked for accesses on imported globals there, but not in
the main class, i.e., not on global.get operations. Needing to do things in
two places is an example of the previous complexity. The fix is simply to
implement visitGlobalGet in one place, and remove all the GlobalManager
logic added in ctor-eval, which then gets a lot simpler as well.
The new imported-global-2.wast checks for that bug (a global.get of an
import should stop us from evalling). Existing tests cover the other cases,
like it being ok to read a non-imported global, etc. The existing test
indirect-call3.wast required a slight change: There was a global.get of
an imported global, which was ignored in the place it happened (an init
of an elem segment); the new code checks all global.gets, so it now
catches that.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
class (#4479)
As recently discussed, the interpreter code is way too complex. Trying to add
ctor-eval stuff I need, I got stuck and ended up spending some time to get rid
of some of the complexity.
We had a ModuleInstanceBase class which was basically an instance of a
module, that is, an execution of it. And internally we have RuntimeExpressionRunner
which is a runner that integrates with the ModuleInstanceBase - basically, it uses
the runtime info to execute code. For example, the MIB has globals info, and the
RER would read it from there.
But these two classes are really just one functionality - an execution of a module.
We get rid of some complexity by removing the separation between them, ending
up with a class that can run a module.
One set of problems we avoid is that we can now extend the single class in a
simple way. Before, we would need to extend both - and inform each other of
those changes. That gets "fun" with CRTP which we use everywhere. In other
words, each of the two classes depended on the other / would need to be
templated on the other. Specifically, MIB.callFunction would need to be given
the RER to run with, and so that would need to be templated on it. This ends up
leading to a bunch more templating all around - all complexity that we just
don't need. See the simplification to the wasm-ctor-eval for some of that (and
even worse complexity would have been needed without this PR in the next
steps for that tool to eval GC stuff).
The final single class is now called ModuleRunner.
Also fixes a pre-existing issue uncovered by this PR. We had the delegate
target on the runner, but it should be tied to a function scope. This happened
to not be a problem if one always created a new runner for each scope, but
this PR makes the runner longer-lived, so the stale data ended up mattering.
The PR moves that data to the proper place.
Note: Diff without whitespace is far, far smaller.
|
|
|
|
|
|
|
| |
LiteralList overlaps with Literals, but is less efficient as it is not a
SmallVector.
Add reserve/capacity methods to SmallVector which are now
necessary to compile.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When ignoring external input, assume params have a value of 0. This
makes it possible to eval main(argc, argv) if one is careful and does
not actually use those values.
This is basically a workaround for main always receiving argc/argv,
even if the C code has no args (in that case the compiler emits
__original_main for the user's main, and wraps it with a main
that adds the args, hence the problem).
This is similar to the existing support for handling wasi_args_get
when ignoring external input, although it just sets values of zeros for
the params. Perhaps it could check for main() specifically and return
1 for argc and a proper buffer for argv somehow, but I think if a program
wants to use --ignore-external-input it can avoid actually reading
argc/argv.
|
|
|
|
| |
(#4448)
|
|
|
| |
This is necessary for e.g. main() which returns an i32.
|
|
|
|
|
|
|
|
| |
This tool depends (atm) on flattening memory segments. That is not compatible
with memory.init which cares about segment identities.
This changes flatten() only by adding the check for MemoryInit. The rest is
unchanged, although I saw the other two params are not needed and I removed
them while I was there.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By default wasm-ctor-eval removes exports that it manages to completely
eval (if it just partially evals then the export remains, but points to a function
with partially-evalled contents). However, in some cases we do want to keep
the export around even so, for example during fuzzing (as the fuzzer wants
to call the same exports before and after wasm-ctor-eval runs) and also
if there is an ABI we need to preserve (like if we manage to eval all of
main()), or if the function returns a value (which we don't support yet, but
this is a PR to prepare for that).
Specifically, there is now a new option:
--kept-exports foo,bar
That is a list of exports to keep around.
Note that when we keep around an export after evalling the ctor we
make the export point to a new function. That new function just
contains a nop, so that nothing happens when it is called. But the
original function is kept around as it may have other callers, who we
do not want to modify.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This lets us eval part of a function but not all, which is necessary to handle
real-world things like __wasm_call_ctors in LLVM output, as that is the
single ctor that is exported and it has calls to the actual ctors.
To do so, we look for a toplevel block and execute its items one by one, in
a FunctionScope. If we stop in the middle, then we are performing a partial
eval. In that case, we only remove the parts of the function that we removed,
and we also serialize the locals whose values we read from the
FunctionScope.
For example, consider this:
function foo() {
return 10;
}
function __wasm_call_ctors() {
var x;
x = foo();
x++;
// We stop evalling here.
import1();
import2(x);
}
We can eval x = foo() and x++, but we must stop evalling when
we reach the first of those imports. The partially-evalled function
then looks like this:
function __wasm_call_ctors() {
var x;
x = 11;
import1();
import2(x);
}
That is, we evalled two lines of executing code and simply removed
them, and then we wrote out the value of the local at that point, and then
the rest of the code in the function is as it used to be.
|
|
|
|
|
| |
This logging is central to what this tool does, and not optional, so stdout
makes more sense I think. Also, as I'm re-integrating this on the emscripten
side, this makes it simpler.
|