path: root/src
Commit message  Author  Age  Files  Lines
...
* [Strings] Fix StringSlice end computation (#6414)  Alon Zakai  2024-03-21  1  -3/+2
| | | | | Like JS string slicing, if the end index is out of bounds that is fine, we clamp to the end. This also matches the behavior in V8 and the spec.
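A minimal sketch of the clamping described above, not Binaryen's actual implementation. `sliceCodeUnits` and `checkSliceExamples` are hypothetical names, and the string is assumed to be held as a `std::u16string` of WTF-16 code units. Both indices are clamped to the string length, so an out-of-bounds end simply means "up to the end".

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper, for illustration only (not Binaryen's API).
// Returns the code units in [start, end), clamping both indices to the
// length of the string instead of treating out-of-bounds as an error.
inline std::u16string sliceCodeUnits(const std::u16string& str,
                                     uint32_t start,
                                     uint32_t end) {
  const uint32_t len = static_cast<uint32_t>(str.size());
  const uint32_t clampedStart = std::min(start, len);
  const uint32_t clampedEnd = std::min(end, len);
  if (clampedStart >= clampedEnd) {
    // Empty or inverted range: the result is the empty string.
    return {};
  }
  return str.substr(clampedStart, clampedEnd - clampedStart);
}

// Usage: an end index past the string is clamped rather than rejected.
inline void checkSliceExamples() {
  assert(sliceCodeUnits(u"hello", 1, 100) == u"ello");
  assert(sliceCodeUnits(u"hello", 10, 20).empty());
}
```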
* Revert "Strings: Disable precomputing for now (#6412)" (#6413)  Alon Zakai  2024-03-20  1  -30/+0
|
| This reverts commit 70ac213fce134840609190a5d3a18118a089ba8a.
|
| Reverts #6412
|
| On second thought we found a way to make fixing this less urgent, and the code size downsides of this are worrying, so let's revert it.
* Strings: Disable precomputing for now (#6412)  Alon Zakai  2024-03-20  1  -0/+30
|
| Our UTF implementation still does not seem fully stable, as we have reports of issues. Disable precomputing of strings for now.
* [Strings] Avoid mishandling unicode in StringConcat (#6411)  Roberto Lublinerman  2024-03-19  1  -0/+5
|
* Atomics: Handle timeouts in waits in the (single-threaded) interpreter (#6408)  Alon Zakai  2024-03-19  1  -3/+9
| | | | | | | | | The interpreter does not run multiple threads, and it was returning 0 from atomic.wait, which means it was woken up. But it is more correct for it to return 2, which means it timed out - which is actually the case, as no other thread exists that can wake it up. However, even that is not good for fuzzing as the timeout may be infinite or large, so just emit a host limit error on any timeout for now, until we actually implement threads.
* [Strings] Implement stringview_wtf16.slice (#6404)  Alon Zakai  2024-03-19  1  -4/+49
|
* Typed continuations: suspend instructions (#6393)  Frank Emrich  2024-03-19  25  -6/+202
| | | | | | | | | | | | | | | | | | | | | This PR is part of a series that adds basic support for the [typed continuations/wasmfx proposal](https://github.com/wasmfx/specfx). This particular PR adds support for the `suspend` instruction for suspending with a given tag, documented [here](https://github.com/wasmfx/specfx/blob/main/proposals/continuations/Overview.md#instructions). These instructions are of the form `(suspend $tag)`. Assuming that `$tag` is defined with _n_ `param` types `t_1` to `t_n`, the instruction consumes _n_ arguments of types `t_1` to `t_n`. Its result type is the same as the `result` type of the tag. Thus, the folded textual representation looks like `(suspend $tag arg1 ... argn)`. Support for the instruction is implemented in both the old and the new wat parser. Note that this PR does not implement validation of the new instruction. This PR also fixes finalization of `cont.new`, `cont.bind` and `resume` nodes in those cases where any of their children are unreachable.
* [Strings] Avoid mishandling unicode in interpreter (#6405)  Thomas Lively  2024-03-18  1  -0/+34
| | | | | | | Our interpreter implementations of `stringview_wtf16.length`, `stringview_wtf16.get_codeunit`, and `string.encode_wtf16_array` are not unicode-aware, so they were previously incorrect in the face of multi-byte code units. As a fix, bail out of the interpretation if there is a non-ascii code point that would make our naive implementation incorrect.
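The bail-out idea can be sketched as follows. This is an illustration under stated assumptions, not the interpreter's real code: `asciiOnlyLength` is a hypothetical helper, and the string data is assumed to be stored as bytes in which every ASCII byte corresponds to exactly one WTF-16 code unit. When any non-ASCII byte is present, the helper declines to answer instead of returning a possibly wrong length.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper, for illustration only.
// Assumption: each ASCII byte (< 0x80) maps to exactly one WTF-16 code
// unit, so a byte count is only a valid length for all-ASCII data.
inline std::optional<std::size_t> asciiOnlyLength(std::string_view bytes) {
  for (unsigned char c : bytes) {
    if (c >= 0x80) {
      // Non-ASCII data: a naive byte count would be wrong, so bail out.
      return std::nullopt;
    }
  }
  return bytes.size();
}

// Usage: the caller treats std::nullopt as "do not interpret this".
inline void checkAsciiExamples() {
  assert(asciiOnlyLength("abc").has_value());
  assert(!asciiOnlyLength("caf\xC3\xA9").has_value());
}
```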
* [NFC] Fix build error on RISC-V 64 (#6410)  moui0  2024-03-18  1  -1/+11
|
| Similar issue as: #6330
|
| FAILED: src/passes/CMakeFiles/passes.dir/Precompute.cpp.o
| /usr/bin/c++ -I/build/binaryen/src/binaryen-version_117/src -I/build/binaryen/src/binaryen-version_117/third_party/llvm-project/include -I/build/binaryen/src/binaryen-version_117/build -march=rv64gc -mabi=lp64d -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -Wp,-D_GLIBCXX_ASSERTIONS -g -ffile-prefix-map=/build/binaryen/src=/usr/src/debug/binaryen -DBUILD_LLVM_DWARF -Wall -Werror -Wextra -Wno-unused-parameter -Wno-dangling-pointer -fno-omit-frame-pointer -fno-rtti -Wno-implicit-int-float-conversion -Wno-unknown-warning-option -Wswitch -Wimplicit-fallthrough -Wnon-virtual-dtor -fPIC -fdiagnostics-color=always -O3 -DNDEBUG -UNDEBUG -std=c++17 -MD -MT src/passes/CMakeFiles/passes.dir/Precompute.cpp.o -MF src/passes/CMakeFiles/passes.dir/Precompute.cpp.o.d -o src/passes/CMakeFiles/passes.dir/Precompute.cpp.o -c /build/binaryen/src/binaryen-version_117/src/passes/Precompute.cpp
| In file included from /build/binaryen/src/binaryen-version_117/src/wasm-traversal.h:30,
|                  from /build/binaryen/src/binaryen-version_117/src/pass.h:24,
|                  from /build/binaryen/src/binaryen-version_117/src/ir/intrinsics.h:20,
|                  from /build/binaryen/src/binaryen-version_117/src/ir/effects.h:20,
|                  from /build/binaryen/src/binaryen-version_117/src/passes/Precompute.cpp:30:
| In copy constructor ‘wasm::SmallVector<wasm::Expression*, 10>::SmallVector(const wasm::SmallVector<wasm::Expression*, 10>&)’,
|     inlined from ‘constexpr std::pair<_T1, _T2>::pair(const _T1&, const _T2&) [with _U1 = wasm::Select* const; _U2 = wasm::SmallVector<wasm::Expression*, 10>; typename std::enable_if<(std::_PCC<true, _T1, _T2>::_ConstructiblePair<_U1, _U2>() && std::_PCC<true, _T1, _T2>::_ImplicitlyConvertiblePair<_U1, _U2>()), bool>::type <anonymous> = true; _T1 = wasm::Select* const; _T2 = wasm::SmallVector<wasm::Expression*, 10>]’ at /usr/include/c++/13.2.1/bits/stl_pair.h:559:21,
|     inlined from ‘T& wasm::InsertOrderedMap<Key, T>::operator[](const Key&) [with Key = wasm::Select*; T = wasm::SmallVector<wasm::Expression*, 10>]’ at /build/binaryen/src/binaryen-version_117/src/support/insert_ordered.h:112:29:
| /build/binaryen/src/binaryen-version_117/src/support/small_vector.h:42:38: error: ‘<unnamed>.wasm::SmallVector<wasm::Expression*, 10>::fixed’ is used uninitialized [-Werror=uninitialized]
|    42 | template<typename T, size_t N> class SmallVector {
|       |                                      ^~~~~~~~~~~
| In file included from /build/binaryen/src/binaryen-version_117/src/passes/Precompute.cpp:38:
| /build/binaryen/src/binaryen-version_117/src/support/insert_ordered.h: In function ‘T& wasm::InsertOrderedMap<Key, T>::operator[](const Key&) [with Key = wasm::Select*; T = wasm::SmallVector<wasm::Expression*, 10>]’:
| /build/binaryen/src/binaryen-version_117/src/support/insert_ordered.h:112:29: note: ‘<anonymous>’ declared here
|   112 |     std::pair<const Key, T> kv = {k, {}};
|       |                             ^~
* DeadArgumentElimination/SignaturePruning: Prune params even if called with effects (#6395)  Alon Zakai  2024-03-18  4  -62/+277
|
| Before this PR, when we saw a param was unused we sometimes could not remove it. For example, if there was one call like this:
|
|   (call $target
|     (call $other)
|   )
|
| That nested call has effects, so we can't just remove it from the outer call - we'd need to move it first. That motion was hard to integrate, which is why it was left out, but it turns out that is sometimes very important. E.g. in Java it is common to have such calls that send the `this` parameter as the result of another call; not being able to remove such params meant we kept those nested calls alive, creating empty structs just to have something to send there.
|
| To fix this, this builds on top of #6394 which makes it easier to move all children out of a parent, leaving only nested things that can be easily moved around and removed.
|
| In more detail, DeadArgumentElimination/SignaturePruning track whether we run into effects that prevent removing a field. If we do, then we queue an operation to move the children out, which we do using a new utility ParamUtils::localizeCallsTo. The pass then does another iteration after that operation.
|
| Alternatively we could try to move things around immediately, but that is quite hard: those passes already track a lot of state. It is simpler to do the fixup in an entirely separate utility. That does come at the cost of the utility doing another pass on the module and the pass itself running another iteration, but this situation is not the most common.
* [Strings] Implement string.concat in the interpreter (#6403)  Roberto Lublinerman  2024-03-15  1  -1/+30
|
* [Strings] Implement string.encode_wtf16_array (#6402)  Alon Zakai  2024-03-14  1  -1/+37
|
* [Strings] Fix precomputing of StringEq (#6401)  Alon Zakai  2024-03-14  1  -35/+20
| | | | | | | | We incorrectly overrode the string operations in the interpreter's subclasses. But string operations can be implemented in the topmost class there (as they depend on no module state), so just implement them there, once, in a proper way. This fixes StringEq by removing its override, and moves the others to the right place.
* [NFC] Refactor ChildLocalizer to handle unreachable code better (#6394)  Alon Zakai  2024-03-14  2  -24/+87
| | | | | | | | | | | | | | | | This is NFC in the current users, but is necessary functionality for a later PR. ChildLocalizer moves children into locals as needed. It used to stop when it saw the first unreachable. After this change we move such unreachable children out of the parent as well, making this more uniform: all interacting effects are moved out, and all that is left nested in the parent can be moved around and removed as desired. Also add a getReplacement helper that makes using this easier. This cannot be tested comprehensively with the current user as that user will not call this code path on an unreachable parent at all, so this just adds what can be tested. The later PR will have tests for all corner cases.
* DCE: Fix old EH on a pop that gets moved in a catch body (#6400)  Alon Zakai  2024-03-14  1  -7/+25
|
* Fix a build error when assertions are disabled (#6397)  Thomas Lively  2024-03-13  1  -2/+4
| | | | | Add `[[maybe_unused]]` to variables that are only used in assertions. In builds without assertions enabled, these were causing compiler errors about unused variables.
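A small illustration of the pattern, using a hypothetical function `sumAll` rather than any code from the PR. The variable `sizeBefore` is only read inside `assert`, which expands to nothing when `NDEBUG` is defined, so builds using `-Wall -Werror` could otherwise fail on an unused variable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example function, for illustration only.
inline int sumAll(const std::vector<int>& values) {
  // Only used in the assertion below. [[maybe_unused]] (C++17) silences
  // the unused-variable warning in builds where assert() compiles away.
  [[maybe_unused]] const std::size_t sizeBefore = values.size();
  int total = 0;
  for (int v : values) {
    total += v;
  }
  assert(values.size() == sizeBefore);
  return total;
}
```

The attribute changes no behavior; it only tells the compiler that the variable may legitimately go unused in some configurations.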
* Remove legacy GC encodings (#5874)  Thomas Lively  2024-03-12  1  -151/+30
| | | | | It was previously possible to opt in to using the legacy GC opcodes with a build time flag. Now that WasmGC has shipped and users have migrated to the standard opcodes, remove the option to use the legacy encodings.
* Check for unreachable in `Select::finalize(Type)` (#6389)  Thomas Lively  2024-03-08  1  -1/+9
| | | | Previously selects finalized with explicit types would never be marked unreachable, even when they should have been.
* [NFC] Clean up the unreachable replacement code in Print.cpp (#6388)  Thomas Lively  2024-03-08  1  -108/+56
| | | | | | | When instructions cannot be printed because the children from which they are supposed to get their type immediates are unreachable or null, we print blocks of their dropped children followed by unreachables. But the logic for making this happen was more complicated than necessary and in fact included dead code. Clean it up.
* Fix printing of bulk array ops (#6387)  Thomas Lively  2024-03-08  1  -0/+19
| | | | | | | | | When the bulk array ops had unreachable or null array types, they were replaced with blocks, but not using the correct code that also prints all their children as dropped followed by an unreachable. This meant that the text output in those cases did not parse as a valid module. Fix the bug. A follow-up PR will simplify the code to prevent similar bugs from occurring in the future.
* [IRBuilder] Validate tuple arities (#6384)  Thomas Lively  2024-03-07  1  -0/+12
| | | | Throw errors if tuple arity immediates are less than 2 or if tuple index immediates are out of bounds.
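The two checks named above could look roughly like this. It is a sketch only: `checkTupleArity` and `checkTupleIndex` are hypothetical names, and returning an optional error message is an assumption made for the example, not IRBuilder's real interface or error type.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical checks, for illustration only (not IRBuilder's API).
// Each returns an error message on failure and std::nullopt on success.
inline std::optional<std::string> checkTupleArity(std::size_t arity) {
  if (arity < 2) {
    return std::string("tuple arity must be at least 2");
  }
  return std::nullopt;
}

inline std::optional<std::string> checkTupleIndex(std::size_t arity,
                                                  std::size_t index) {
  // An index is only meaningful for a tuple with a valid arity.
  if (auto err = checkTupleArity(arity)) {
    return err;
  }
  if (index >= arity) {
    return std::string("tuple index out of bounds");
  }
  return std::nullopt;
}

// Usage: arity 1 is rejected; index 3 is out of bounds for arity 3.
inline void checkTupleExamples() {
  assert(checkTupleArity(1).has_value());
  assert(checkTupleIndex(3, 3).has_value());
}
```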
* Expose features option in C API binary reading (#6380)  Surma  2024-03-07  2  -4/+13
| | | | This allows reading a module that requires a particular feature set. The old API assumed only MVP features.
* Handle extended const segment offsets in the fuzzer (#6382)  Thomas Lively  2024-03-07  1  -13/+13
| | | | | | The fuzzer already had logic to remove all references to non-imported globals from global initializers and data segment offsets, but it was missing for element segment offsets. Add it, and also add a missing check line for the new test that uncovered this bug as initial fuzzer input.
* Fix EH fuzz bugs (#6381)  Thomas Lively  2024-03-07  2  -2/+2
| | | | | Due to a typo, the fuzzer was making externrefs when it should have been making exnrefs. Fix that and also let eh-utils.cpp know that TryTable exists to avoid an assertion failure.
* Print `(offset ...)` in data and element segments (#6379)  Thomas Lively  2024-03-06  1  -0/+15
| | | | | | | Previously we just printed the offset instruction(s) directly, which is a valid shorthand only when there is a single instruction. In the case of extended constant instructions, there can potentially be multiple instructions, in which case the explicit `offset` clause is required. Print the full clause when necessary.
* Add sourcemap support to wasm-metadce and wasm-merge (#6372)  Jérôme Vouillon  2024-03-06  6  -12/+148
|
* [Parser] Improve parsed IR for multivalue returns (#6378)  Thomas Lively  2024-03-05  1  -10/+2
| | | | | Rather than reassembling a tuple from multiple pops, let the pop implementation assemble the tuple. This produces less code in cases where there is already a tuple of the proper size on top of the stack. It also simplifies the code.
* [Parser] Propagate debug locations like the old parser (#6377)  Thomas Lively  2024-03-05  1  -0/+55
| | | | | | | | | Add a pass that propagates debug locations to unannotated child and sibling expressions after parsing. The new parser on its own only attaches debug locations to directly annotated instructions, but this pass, which we run unconditionally, emulates the behavior of the previous parser for compatibility with existing programs. It does unintuitive things to programs using the non-nested format because it runs on nested Binaryen IR, so we may want to rethink this at some point.
* [Parser] Support prologue and epilogue sourcemap annotations (#6370)  Thomas Lively  2024-03-04  5  -32/+76
| | | | | | | and fix a bug with sourcemap annotations on folded `if` conditions. Update IRBuilder to apply prologue and epilogue source locations when beginning and ending a function scope. Add basic support in the parser for explicitly tracking annotations on module fields, although only do anything with them in the case of prologue source location annotations.
* OptimizeAddedConstants: Replace an assert with a proper error (#6375)  Alon Zakai  2024-03-04  1  -2/+5
| | | See #6373
* Typed continuations: cont.bind instructions (#6365)  Frank Emrich  2024-03-04  26  -9/+264
| | | | | | | | | | | | | | | | | | | | | | | | This PR is part of a series that adds basic support for the [typed continuations/wasmfx proposal](https://github.com/wasmfx/specfx). This particular PR adds support for the `cont.bind` instruction for partially applying continuations, documented [here](https://github.com/wasmfx/specfx/blob/main/proposals/continuations/Overview.md#instructions). In short, these instructions are of the form `(cont.bind $ct_before $ct_after)` where `$ct_before` and `$ct_after` are related continuation types. They must only differ in the number of arguments, where `$ct_before` has _n_ additional parameters as compared to `$ct_after`, for some _n_ ≥ 0. The idea is that `(cont.bind $ct_before $ct_after)` then takes a reference to a continuation of type `$ct_before` as well as _n_ operands and returns a (reference to a) continuation of type `$ct_after`. Thus, the folded textual representation looks like `(cont.bind $ct_before $ct_after arg1 ... argn c)`. Support for the instruction is implemented in both the old and the new wat parser. Note that this PR does not implement validation of the new instruction.
* Fuzzer: Mark Roundtrip pass as adding effects (#6366)  Alon Zakai  2024-02-29  1  -0/+6
|
* [Parser] Support inline data in 64-bit memory declarations (#6364)  Thomas Lively  2024-02-29  1  -7/+24
| | | | This new form of the abbreviated memory declaration with inline data is introduced in the memory64 proposal.
* [Parser] Do not require a memory for GC string ops (#6363)  Thomas Lively  2024-02-29  2  -12/+54
| | | | | We previously required a memory to exist while parsing all `StringNew` and `StringEncode` instructions, even though some variants of the instructions use GC arrays instead. Require a memory only for those instructions that use one.
* [NFC] Add the type of the Expression when eliding it (#6362)  Alon Zakai  2024-02-28  1  -1/+2
| | | | | | | | | | In some cases we don't print an Expression in full if it is unreachable, so we print something instead as a placeholder. This happens in unreachable code when the children don't provide enough info to print the parent (e.g. a StructGet with an unreachable reference doesn't know what struct type to use). This PR prints out the name of the Expression type of such things, which can help debugging sometimes.
* C API: Support adding data segments individually (#6346)  Lingming Zhang  2024-02-28  2  -0/+26
| | | Fixes #6314.
* [NFC] Add some comments about flow in SubtypingDiscoverer and Unsubtyping (#6359)  Alon Zakai  2024-02-28  2  -0/+18
|
| I audited all of SubtypingDiscoverer for flow/non-flow constraints and added some comments to clarify things for our future selves if we ever need to generalize it.
* [Outlining] Fixes break reconstruction (#6352)  Ashley Nelson  2024-02-27  3  -6/+74
| | | Adds new visitBreakWithType and visitSwitchWithType functions to the IRBuilder API. These functions work around an assumption in IRBuilder that the module is being traversed in the fully nested format, i.e., that the destination scope of a break or switch has been visited before visiting the break or switch. Instead, the type of the destination scope is passed to IRBuilder.
* SubtypingDiscoverer: Differentiate non-flow subtyping constraints (#6344)  Alon Zakai  2024-02-27  3  -2/+38
|
| When we do a local.set of a value into a local then we have both a subtyping constraint - for the value to be valid to put in that local - and also a flow of a value, which can then reach more places. Such flow then interacts with casts in Unsubtyping, since it needs to know what can flow where in order to know how casts force us to keep subtyping relations.
|
| That regressed in the not-actually-NFC #6323 in which I added the innocuous lines to add subtyping constraints in ref.eq. It seems fine to require that the arms of a RefEq must be of type eqref, but Unsubtyping then assumed those arms flowed into a location of type eqref, which means casts might force us to not optimize some things.
|
| To fix this, differentiate the rare case of non-flowing subtyping constraints, which is basically only RefEq. There are perhaps a few more cases (like i31 operations) but they do not matter in practice for Unsubtyping anyhow; I suggest we land this first to undo the regression and then at our leisure investigate the other instructions.
* [NFC] Use ifdef-else in threads.cpp (#6355)  Alon Zakai  2024-02-27  1  -2/+2
|
* [StringLowering] Lower `stringview_wtf16.get_codeunit` to `charCodeAt` (#6353)  Thomas Lively  2024-02-26  1  -4/+4
| | | | Previously we lowered this to `getCodePointAt`, which has different semantics around surrogate pairs.
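To make the surrogate-pair difference concrete, here is a sketch that mimics the two behaviors over a `std::u16string`. The functions `codeUnitAt` and `codePointAt` are hypothetical stand-ins written for this example, not Binaryen or JS engine code, and both assume the index is in bounds. The first returns a single 16-bit code unit, the second combines a valid lead/trail surrogate pair into one code point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for charCodeAt-like behavior: one code unit.
// Precondition: i < s.size().
inline uint32_t codeUnitAt(const std::u16string& s, std::size_t i) {
  return s[i];
}

// Hypothetical stand-in for code-point-style behavior: a lead surrogate
// followed by a trail surrogate is combined into one code point.
// Precondition: i < s.size().
inline uint32_t codePointAt(const std::u16string& s, std::size_t i) {
  const uint32_t lead = s[i];
  if (lead >= 0xD800 && lead <= 0xDBFF && i + 1 < s.size()) {
    const uint32_t trail = s[i + 1];
    if (trail >= 0xDC00 && trail <= 0xDFFF) {
      return 0x10000 + ((lead - 0xD800) << 10) + (trail - 0xDC00);
    }
  }
  return lead;
}

// Usage: U+1F600 is stored as the surrogate pair 0xD83D 0xDE00.
inline void checkCodeUnitExamples() {
  const std::u16string s{char16_t(0xD83D), char16_t(0xDE00)};
  assert(codeUnitAt(s, 0) == 0xD83D);
  assert(codePointAt(s, 0) == 0x1F600);
}
```

Only the code-unit version matches what a `get_codeunit` operation is expected to return at index 0 of such a string.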
* [Parser] Parse annotations, including source map comments (#6345)  Thomas Lively  2024-02-26  7  -1186/+2332
| | | | | | | | | | Parse annotations using the standards-track `(@annotation ...)` format as well as the `;;@ source-map:0:1` format. Have the lexer implicitly collect annotations while it skips whitespace and add lexer APIs to access the annotations since the last token was parsed. Collect annotations before parsing each instruction and pass the annotations explicitly to the parser and parser context functions for instructions. Add an API to `IRBuilder` to set a debug location to be attached to the next visited or created instruction and use it from the parser.
* [Emscripten port] Fix core count logic for Emscripten+pthreads (#6350)  Alon Zakai  2024-02-26  1  -3/+5
| | | | Before this all Emscripten builds would use 1 core, but it is important to allow pthreads builds there to use more.
* Implement dropping of active Element Segments (#6343)  Alon Zakai  2024-02-23  1  -10/+17
| | | | Also rename the existing droppedSegments to droppedDataSegments for clarity.
* [Parser] Condense redundant pop values (#6339)  Ashley Nelson  2024-02-22  1  -13/+1
| | | A bit of clean-up, changes getBranchValue to use pop().
* Typed continuations: cont.new instructions (#6308)  Frank Emrich  2024-02-22  26  -28/+187
|
| This PR is part of a series that adds basic support for the [typed continuations/wasmfx proposal](https://github.com/wasmfx/specfx).
|
| This particular PR adds support for the `cont.new` instruction for creating continuations, documented [here](https://github.com/wasmfx/specfx/blob/main/proposals/continuations/Overview.md#instructions).
|
| In short, these instructions are of the form `(cont.new $ct)` where `$ct` must be a continuation type. The instruction takes a single (nullable) function reference as its argument, which means that the folded representation of the instruction is of the form `(cont.new $ct (foo ...))`.
|
| Support for the instruction is implemented in both the old and the new wat parser.
|
| Note that this PR does not implement validation of the new instruction.
* Fuzzer: Allow using initial content with V8 (#6327)  Alon Zakai  2024-02-22  1  -0/+9
|
| One problem was that spec testcases had exports with names that are not valid to write as JS `exports.name`. For example, an export with a `-` in the name would end up as `exports.foo-bar` etc. Since #6310 that is fixed as we do not emit such JS (we use the generic fuzz_shell.js script which iterates over the keys in exports with `exports[name]`).
|
| Also fix a few trivial fuzzer issues that initial content uncovered:
|
| - Ignore a wat file with invalid utf-8.
| - Print string literals in the same way from JS as from C++.
| - Enable the stringref flag in V8.
| - Remove tag imports (the same as we do for global and function and other imports).
* Fuzzer: Match the logging of i31ref between JS and C++ (#6335)  Alon Zakai  2024-02-22  1  -21/+35
|
| JS engines print i31ref as just a number, so we need a small regex to standardize the representation (similar to what we do for funcrefs in the code above).
|
| On the C++ side, make it actually print the i31ref rather than treat it like a generic reference (for which we only print "object"). To do that we must unwrap an externalized i31 as necessary, and add a case for i31 in the printing logic.
|
| Also move that printing logic to its own function, as it was starting to get quite long.
* [Parser][NFC] Remove `Token` from lexer interface (#6333)  Thomas Lively  2024-02-22  2  -44/+46
| | | | | | Replace the general `peek` method that returned a `Token` with specific peek methods that look for (but do not consume) specific kinds of tokens. This change is a prerequisite for simplifying the lexer implementation by removing `Token` entirely.
* [Parser][NFC] Remove parser/input.h (#6332)  Thomas Lively  2024-02-22  6  -111/+31
| | | | Remove the layer of abstraction sitting between the parser and the lexer now that the lexer has an interface the parser can use directly.