forks/binaryen.git -

	Commit message	Author	Age	Files	Lines
*	Add support for debug printing of functions (#5828)	Alon Zakai	2023-07-20	1	-0/+1
\|
*	[Strings] Adopt new instruction binary encoding (#5714)	Jérôme Vouillon	2023-05-12	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \|	See WebAssembly/stringref#46. This format is already adopted by V8: https://chromium-review.googlesource.com/c/v8/v8/+/3892695. The text format is left unchanged (see #5607 for a discussion on the subject). I have also added support for string.encode_lossy_utf8 and string.encode_lossy_utf8_array (by allowing the replace policy for Binaryen's string.encode_wtf8 instruction).
*	[analysis] Add a new iterable CFG utility (#5712)	Thomas Lively	2023-05-12	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a new "analysis" source directory that will contain the source for a new static program analysis framework. To start the framework, add a CFG utility that provides convenient iterators for iterating through the basic blocks of the CFG as well as the predecessors, successors, and contents of each block. The new CFGs are constructed using the existing CFGWalker, but they are different in that the new utility is meant to provide a usable representation of a CFG whereas CFGWalker is meant to allow collecting arbitrary information about each basic block in a CFG. For testing and debugging purposes, add `print` methods to CFGs and basic blocks. This requires exposing the ability to print expression contents excluding children, which was something we previously did only for StackIR. Also add a new gtest file with a test for constructing and printing a CFG. The test reveals some strange properties of the current CFG construction, including empty blocks and strange placement of `loop` instructions, but fixing these problems is left as future work.
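The iterable CFG described above can be pictured with a minimal sketch. `BasicBlockSketch` and `CFGSketch` are hypothetical stand-ins written for illustration, not Binaryen's actual classes, and the printed format is an assumption rather than the output of the real `print` methods.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a basic block: its contents plus edges to
// predecessor and successor blocks, stored as indices into the CFG.
struct BasicBlockSketch {
  std::size_t index = 0;
  std::vector<std::string> insts; // instruction text, children excluded
  std::vector<std::size_t> preds;
  std::vector<std::size_t> succs;

  // Allow range-based for loops over the block's contents.
  std::vector<std::string>::const_iterator begin() const { return insts.begin(); }
  std::vector<std::string>::const_iterator end() const { return insts.end(); }
};

// Hypothetical stand-in for the CFG: an iterable list of basic blocks.
struct CFGSketch {
  std::vector<BasicBlockSketch> blocks;

  std::size_t addBlock(std::vector<std::string> insts) {
    BasicBlockSketch block;
    block.index = blocks.size();
    block.insts = std::move(insts);
    blocks.push_back(std::move(block));
    return blocks.back().index;
  }

  // Record the edge on both ends so preds and succs stay consistent.
  void addEdge(std::size_t from, std::size_t to) {
    blocks[from].succs.push_back(to);
    blocks[to].preds.push_back(from);
  }

  std::vector<BasicBlockSketch>::const_iterator begin() const { return blocks.begin(); }
  std::vector<BasicBlockSketch>::const_iterator end() const { return blocks.end(); }

  // Print every block with its edges, then its contents.
  void print(std::ostream& os) const {
    for (const auto& block : *this) {
      os << ";; block " << block.index << "\n;; preds:";
      for (auto p : block.preds) {
        os << " " << p;
      }
      os << "\n;; succs:";
      for (auto s : block.succs) {
        os << " " << s;
      }
      os << "\n";
      for (const auto& inst : block) {
        os << inst << "\n";
      }
    }
  }
};
```

Storing edges as indices keeps the blocks copyable and makes a printed CFG easy to compare in a test.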
*	[NFC] Track the kinds of items that names refer to in wasm-delegations-fields (#5690)	Alon Zakai	2023-05-05	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \|	This makes delegations-fields track Kinds. That is, rather than say a field is just a Name, we can say it is a name of kind Function. This allows users to track references to functions, tables, memories, etc., in a simple and generic way, avoiding duplicated code which we have atm. (In particular this will help wasm-merge in the future.) This also uses that functionality in two small places to show the benefits (see memory-utils.cpp and MemoryPacking.cpp).
*	[NFC] Refactor each of ArrayNewSeg and ArrayInit into subclasses for Data/Elem (#5692)	Alon Zakai	2023-05-04	1	-19/+32
\| \| \| \| \| \| \| \| \| \| \|	ArrayNewSeg => ArrayNewSegData, ArrayNewSegElem; ArrayInit => ArrayInitData, ArrayInitElem. Basically we remove the opcode and use the class type to differentiate them. This adds some code but it makes the representation simpler and more compact in memory, and it will help with #5690.
*	Implement array.fill, array.init_data, and array.init_elem (#5637)	Thomas Lively	2023-04-06	1	-0/+34
\| \| \| \| \|	These complement array.copy, which we already supported, as an initial complete set of bulk array operations. Replace the WIP spec tests with the upstream spec tests, lightly edited for compatibility with Binaryen.
*	Only update functions in optimizeAfterInlining() (#5624)	Alon Zakai	2023-04-05	1	-0/+1
\| \| \| \| \|	This saves the work of freeing and allocating for all the other maps. This is a code path that is used by several passes so it showed up in profiling for #5561
*	Use Names instead of indices to identify segments (#5618)	Thomas Lively	2023-04-04	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	All top-level Module elements are identified and referred to by Name, but for historical reasons element and data segments were referred to by index instead. Fix this inconsistency by using Names to refer to segments from expressions that use them. Also parse and print segment names like we do for other elements. The C API is partially converted to use names instead of indices, but there are still many functions that refer to data segments by index. Finishing the conversion can be done in the future once it becomes necessary.
*	[NFC] Internally rename `ArrayInit` to `ArrayNewFixed` (#5526)	Thomas Lively	2023-02-28	1	-3/+3
\| \| \| \| \| \| \| \|	To match the standard instruction name, rename the expression class without changing any parsing or printing behavior. A follow-on PR will take care of the functional side of this change while keeping support for parsing the old name. This change will allow `ArrayInit` to be used as the expression class for the upcoming `array.init_data` and `array.init_elem` instructions.
*	[Strings] Add experimental string.hash instruction (#5480)	Alon Zakai	2023-02-03	1	-0/+1
\| \| \|	See WebAssembly/stringref#60
*	[Wasm GC] Add AbstractTypeRefining pass (#5461)	Alon Zakai	2023-02-03	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a type hierarchy has abstract classes in the middle, that is, types that are never instantiated, then we can optimize casts and other operations to them. Say in Java that we have `AbstractList`, and it only has one subclass `IntList` that is ever created, then any place we have an `AbstractList` we must actually have an `IntList`, or a null. (Or, if no subtype is instantiated, then the value must definitely be a null.) The actual implementation does a type mapping, that is, it finds all places using an abstract type and makes them refer to the single instantiated subtype (or null). After that change, no references to the abstract type remain in the program, so this both refines types and also cleans up the type section.
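The type mapping described above can be illustrated with a small sketch. `HierarchySketch`, its string type names, and the "exactly one instantiated descendant" rule are simplifying assumptions made for illustration; the actual pass works on Binaryen's heap types and may use different rules.

```cpp
#include <map>
#include <optional>
#include <set>
#include <string>
#include <vector>

using TypeName = std::string;

// Hypothetical model of a type hierarchy: direct subtypes per type, plus the
// set of types that are actually instantiated somewhere in the program.
struct HierarchySketch {
  std::map<TypeName, std::vector<TypeName>> subtypes;
  std::set<TypeName> instantiated;

  // Collect every instantiated type at or below `type`.
  void collectCreated(const TypeName& type, std::vector<TypeName>& out) const {
    if (instantiated.count(type)) {
      out.push_back(type);
    }
    auto it = subtypes.find(type);
    if (it != subtypes.end()) {
      for (const auto& sub : it->second) {
        collectCreated(sub, out);
      }
    }
  }

  // Returns the type that uses of `type` could be rewritten to:
  //  - `type` itself if it is instantiated or cannot be refined safely,
  //  - the single instantiated descendant if there is exactly one,
  //  - std::nullopt if nothing is instantiated (the value must be null).
  std::optional<TypeName> refine(const TypeName& type) const {
    if (instantiated.count(type)) {
      return type;
    }
    std::vector<TypeName> created;
    collectCreated(type, created);
    if (created.empty()) {
      return std::nullopt;
    }
    if (created.size() == 1) {
      return created[0];
    }
    return type;
  }
};
```

With `AbstractList` having the single created subtype `IntList`, `refine("AbstractList")` yields `IntList`, matching the Java example in the commit message.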
*	[Strings] Add experimental StringNew variants (#5459)	Alon Zakai	2023-01-26	1	-1/+7
\| \| \| \| \| \|	string.from_code_point makes a string from an int code point. string.new_utf8*_try makes a utf8 string and returns null on a UTF8 encoding error rather than trap.
*	[Strings] Add string.compare (#5453)	Alon Zakai	2023-01-25	1	-0/+7
\| \| \|	See WebAssembly/stringref#58
*	Represent ref.as_{func,data,i31} with RefCast (#5413)	Thomas Lively	2023-01-10	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	These operations are deprecated and directly representable as casts, so remove their opcodes in the internal IR and parse them as casts instead. For now, add logic to the printing and binary writing of RefCast to continue emitting the legacy instructions to minimize test changes. The few test changes necessary are because it is no longer valid to perform a ref.as_func on values outside the func type hierarchy now that ref.as_func is subject to the ref.cast validation rules. RefAsExternInternalize, RefAsExternExternalize, and RefAsNonNull are left unmodified. A future PR may remove RefAsNonNull as well, since it is also expressible with casts.
*	Replace `RefIs` with `RefIsNull` (#5401)	Thomas Lively	2023-01-09	1	-13/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The other `ref.is` instructions are deprecated and expressible in terms of `ref.test`. Update binary and text parsing to parse those instructions as `RefTest` expressions. Also update the printing and emitting of `RefTest` expressions to emit the legacy instructions for now to minimize test changes and make this a mostly non-functional change. Since `ref.is_null` is the only `RefIs` instruction left, remove the `RefIsOp` field and rename the expression class to `RefIsNull`. The few test changes are due to the fact that `ref.is` instructions are now subject to `ref.test` validation, and in particular it is no longer valid to perform a `ref.is_func` on a value outside of the `func` type hierarchy.
*	Consolidate br_on* operations (#5399)	Thomas Lively	2023-01-06	1	-7/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The `br_on{_non}_{data,i31,func}` operations are deprecated and directly representable in terms of the new `br_on_cast` and `br_on_cast_fail` instructions, so remove their dedicated IR opcodes in favor of representing them as casts. `br_on_null` and `br_on_non_null` cannot be consolidated the same way because their behavior is not directly representable in terms of `br_on_cast` and `br_on_cast_fail`; when the cast to null bottom type succeeds, the null check instructions implicitly drop the null value whereas the cast instructions would propagate it. Add special logic to the binary writer and printer to continue emitting the deprecated instructions for now. This will allow us to update the test suite in a separate future PR with no additional functional changes. Some tests are updated because the validator no longer allows passing non-func data to `br_on_func`. Doing so has not made sense since we separated the three reference type hierarchies.
*	Support br_on_cast null (#5397)	Thomas Lively	2023-01-05	1	-3/+2
\| \| \| \| \| \| \| \| \|	As well as br_on_cast_fail null. Unlike the existing br_on_cast* instructions, these new instructions treat the cast as succeeding when the input is a null. Update the internal representation of the cast type in `BrOn` expressions to be a `Type` rather than a `HeapType` so it will include nullability information. Also update and improve `RemoveUnusedBrs` to handle the new instructions correctly and optimize in more cases.
*	Support `ref.test null` (#5368)	Thomas Lively	2022-12-21	1	-3/+2
\| \| \|	This new variant of ref.test returns 1 if the input is null.
*	Update RefCast representation to drop extra HeapType (#5350)	Thomas Lively	2022-12-20	1	-2/+8
\| \| \| \| \| \| \| \| \|	The latest upstream version of ref.cast is parameterized with a target reference type, not just a heap type, because the nullability of the result is parameterizable. As a first step toward implementing these new, more flexible ref.cast instructions, change the internal representation of ref.cast to use the expression type as the cast target rather than storing a separate heap type field. For now require that the encoded semantics match the previously allowed semantics, though, so that none of the optimization passes need to be updated.
*	Rename UserSection -> CustomSection. NFC (#5288)	Sam Clegg	2022-11-22	1	-2/+2
\| \| \|	This reflects the naming used in the spec.
*	Switch from `typedef` to `using` in C++ code. NFC (#5258)	Sam Clegg	2022-11-15	1	-4/+4
\| \| \| \|	This is more modern and (IMHO) easier to read than the old C typedef syntax.
*	Implement `array.new_data` and `array.new_elem` (#5214)	Thomas Lively	2022-11-07	1	-0/+18
\| \| \| \| \| \| \| \| \|	In order to test them, fix the binary and text parsers to accept passive data segments even if a module has no memory. In addition to parsing and emitting the new instructions, also implement their validation and interpretation. Test the interpretation directly with wasm-shell tests adapted from the upstream spec tests. Running the upstream spec tests directly would require fixing too many bugs in the legacy text parser, so it will have to wait for the new text parser to be ready.
*	Multi-Memories Lowering Pass (#5107)	Ashley Nelson	2022-11-01	1	-0/+2
\|	Adds a multi-memories lowering pass that will create a single combined memory from the memories added to the module. This pass assumes that each memory is configured the same (type, shared). This pass also:
\|	- replaces existing memory.size instructions with a custom function that returns the size of each memory as if they existed independently
\|	- replaces existing memory.grow instructions with a custom function, using global offsets to track the page size of each memory so data doesn't overlap in the single combined memory
\|	- adjusts the offsets of active data segments
\|	- adjusts the offsets of Loads/Stores
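The offset bookkeeping behind this lowering can be sketched roughly as follows. `CombinedMemorySketch` and its method names are hypothetical, and the sketch only models the address arithmetic; moving existing data when an earlier memory grows is deliberately left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kPageSize = 65536; // WebAssembly page size in bytes

// Hypothetical model of several memories laid out back to back in one
// combined memory. Only the per-memory page counts are tracked.
struct CombinedMemorySketch {
  std::vector<std::uint64_t> pages; // size of each original memory, in pages

  // Byte offset at which `memory` starts inside the combined memory.
  std::uint64_t baseOffset(std::size_t memory) const {
    std::uint64_t offset = 0;
    for (std::size_t i = 0; i < memory; ++i) {
      offset += pages[i] * kPageSize;
    }
    return offset;
  }

  // Address adjustment applied to loads, stores, and active data segments.
  std::uint64_t lowerAddress(std::size_t memory, std::uint64_t address) const {
    return baseOffset(memory) + address;
  }

  // Replacement for memory.size: the size as if the memory stood alone.
  std::uint64_t size(std::size_t memory) const { return pages[memory]; }

  // Replacement for memory.grow: returns the old size in pages. Later
  // memories implicitly shift because their base offsets are recomputed.
  std::uint64_t grow(std::size_t memory, std::uint64_t delta) {
    std::uint64_t oldSize = pages[memory];
    pages[memory] += delta;
    return oldSize;
  }
};
```

Growing memory 0 by one page moves the base of every later memory up by 65536 bytes, which is why the real pass has to track offsets in globals rather than bake them in as constants.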
*	[NFC] Add nullptr init for ElementSegment offset (#5168)	Alon Zakai	2022-10-20	1	-1/+1
\| \| \| \| \|	I believe all locations that create one already set it (or else we'd see errors), but it's not easy to see that when reading the code. And other similar locations (like DataSegment) do initialize to null, so do so for consistency.
*	Make `Name` a pointer, length pair (#5122)	Thomas Lively	2022-10-11	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With the goal of supporting null characters (i.e. zero bytes) in strings. Rewrite the underlying interned `IString` to store a `std::string_view` rather than a `const char*`, reduce the number of map lookups necessary to intern a string, and present a more immutable interface. Most importantly, replace the `c_str()` method that returned a `const char*` with a `toString()` method that returns a `std::string`. This new method can correctly handle strings containing null characters. A `const char*` can still be had by calling `data()` on the `std::string_view`, although this usage should be discouraged. This change is NFC in spirit, although not in practice. It does not intend to support any particular new functionality, but it is probably now possible to use strings containing null characters in at least some cases. At least one parser bug is also incidentally fixed. Follow-on PRs will explicitly support and test strings containing nulls for particular use cases. The C API still uses `const char*` to represent strings. As strings containing nulls become better supported by the rest of Binaryen, this will no longer be sufficient. Updating the C and JS APIs to use pointer, length pairs is left as future work.
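A short sketch shows why a pointer, length pair matters for strings with embedded null characters. `NameSketch` is a hypothetical stand-in, not Binaryen's `Name` or `IString`, and it skips interning entirely.

```cpp
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical stand-in for a name stored as a pointer, length pair.
struct NameSketch {
  std::string_view str;

  // Copies all bytes covered by the view, including any embedded zero bytes.
  std::string toString() const { return std::string(str); }
};
```

For the three bytes `a`, `\0`, `b`, `toString()` returns a string of size 3, whereas `std::strlen(name.str.data())` stops at the first zero byte and reports 1. That truncation is the reason the commit message discourages going through `data()`.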
*	Fix bugs with copying expressions (#5099)	Thomas Lively	2022-09-30	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It does not make sense to construct an `Expression` directly because all expressions must be specific expressions. However, we previously allowed constructing Expressions, and in particular we allowed them to be copy constructed. Unrelatedly, `Fatal::operator<<` took its argument by value. Together, these two facts produced UB when printing Expressions in fatal error messages because a new Expression would be copy constructed with the original expression ID but without any of the actual data from the original specific expression. For example, when trying to print a Block, the printing code would try to look at the expression list, but the expression list would be junk stack data because the copied Expression does not contain an expression list. Fix the problem by making Expression's constructors visible only to its subclasses and making `Fatal::operator<<` take its argument by forwarding reference instead of by value.
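The two fixes can be sketched in isolation. `Expr`, `BlockExpr`, and `FatalSketch` below are hypothetical simplifications written for illustration, not Binaryen's real `Expression`, `Block`, or `Fatal`.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class: constructors are protected, so only subclasses can
// create or copy it, and `Expr copy = someBlock;` no longer compiles.
class Expr {
public:
  enum Id { BlockId };
  const Id id;

protected:
  explicit Expr(Id id) : id(id) {}
  Expr(const Expr&) = default;
};

// Hypothetical specific expression that carries data beyond the base class.
class BlockExpr : public Expr {
public:
  std::vector<int> list;
  BlockExpr() : Expr(BlockId) {}
};

// Printing inspects the id and then reads subclass data, which is only valid
// when the object really is the subclass and not a sliced copy of the base.
inline std::ostream& operator<<(std::ostream& os, const Expr& expr) {
  if (expr.id == Expr::BlockId) {
    os << "block(" << static_cast<const BlockExpr&>(expr).list.size() << ")";
  }
  return os;
}

// Hypothetical error stream: a forwarding reference passes the argument
// through without copying it, so nothing can be sliced on the way in.
struct FatalSketch {
  std::ostringstream buffer;

  template<typename T> FatalSketch& operator<<(T&& arg) {
    buffer << std::forward<T>(arg);
    return *this;
  }
};
```

Had `operator<<` taken `T arg` by value and been handed an `Expr&`, `T` would deduce to `Expr` and the argument would be a sliced copy; with the protected copy constructor that mistake becomes a compile error instead of undefined behavior.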
*	Add JavaScript promise integration (JSPI) pass. (#4961)	Brendan Dahl	2022-09-02	1	-0/+1
\| \| \| \| \| \| \|	Add a pass that wraps all imports and exports with functions that handle storing and passing along the suspender externref needed for JSPI. https://github.com/WebAssembly/js-promise-integration/blob/main/proposals/js-promise-integration/Overview.md
*	Implement `extern.externalize` and `extern.internalize` (#4975)	Thomas Lively	2022-08-29	1	-0/+2
\| \| \| \|	These new GC instructions infallibly convert between `extern` and `any` references now that those types are not in the same hierarchy.
*	Multi-Memories Support in IR (#4811)	Ashley Nelson	2022-08-17	1	-4/+23
\| \| \| \| \| \| \|	This PR removes the single memory restriction in IR, adding support for a single module to reference multiple memories. To support this change, a new memory name field was added to 13 memory instructions in order to identify the memory for the instruction. It is a goal of this PR to maintain backwards compatibility with existing text and binary wasm modules, so memory indexes remain optional for memory instructions. Similarly, the JS API makes assumptions about which memory is intended when only one memory is present in the module. Another goal of this PR is that existing test behavior be unaffected. That said, tests must now explicitly define a memory before invoking memory instructions or exporting a memory, and memory names are now printed for each memory instruction in the text format. There remain quite a few places where a hardcoded reference to the first memory persists (memory flattening, for example, will return early if more than one memory is present in the module). Many of these call-sites, particularly within passes, will require us to rethink how the optimization works in a multi-memories world. Other call-sites may necessitate more invasive code restructuring to fully convert away from relying on a globally available, single memory pointer.
*	[Strings] string.new.array methods have start:end arguments (#4888)	Alon Zakai	2022-08-09	1	-0/+4
\|
*	Remove RTTs (#4848)	Thomas Lively	2022-08-05	1	-59/+0
\| \| \| \| \| \| \|	RTTs were removed from the GC spec and if they are added back in in the future, they will be heap types rather than value types as in our implementation. Updating our implementation to have RTTs be heap types would have been more work than deleting them for questionable benefit since we don't know how long it will be before they are specced again.
*	[Strings] GC variants for string.encode (#4817)	Alon Zakai	2022-07-21	1	-0/+10
\|
*	Remove basic reference types (#4802)	Thomas Lively	2022-07-20	1	-3/+5
\| \| \| \| \| \| \| \| \|	Basic reference types like `Type::funcref`, `Type::anyref`, etc. made it easy to accidentally forget to handle reference types with the same basic HeapTypes but the opposite nullability. In principle there is nothing special about the types with shorthands except in the binary and text formats. Removing these shorthands from the internal type representation by removing all basic reference types makes some code more complicated locally, but simplifies code globally and encourages properly handling both nullable and non-nullable reference types.
*	[Strings] Add string.new GC variants (#4813)	Alon Zakai	2022-07-19	1	-2/+12
\|
*	[Strings] stringview_wtf16.length (#4809)	Alon Zakai	2022-07-18	1	-0/+1
\| \| \| \|	This measures the length of a view, so it seems simplest to make it a sub-operation of the existing measure instruction.
*	[Strings] stringview_*.slice (#4805)	Alon Zakai	2022-07-15	1	-0/+31
\| \| \| \| \| \| \|	Unfortunately one slice is the same as python [start:end], using 2 params, and the other slice is one param, [CURR:CURR+num] (where CURR is implied by the current state in the iter). So we can't use a single class here. Perhaps a different name would be good, like slice vs substring (like JS does), but I picked names to match the current spec.
*	[Strings] stringview access operations (#4798)	Alon Zakai	2022-07-13	1	-0/+55
\|
*	[Strings] string.as (#4797)	Alon Zakai	2022-07-12	1	-0/+18
\|
*	[Strings] string.is_usv_sequence (#4783)	Alon Zakai	2022-07-08	1	-0/+1
\| \| \| \| \| \| \|	This implements it as a StringMeasure opcode. They do have the same number of operands, same trapping behavior, and same return type. They both get a string and do some inspection of it to return an i32. Perhaps the name could be StringInspect or something like that, rather than StringMeasure..? But I think for now this might be good enough, and the spec may change anyhow later.
*	[Strings] string.eq (#4781)	Alon Zakai	2022-07-08	1	-0/+11
\|
*	[Strings] string.concat (#4777)	Alon Zakai	2022-07-08	1	-0/+11
\|
*	[Strings] string.encode (#4776)	Alon Zakai	2022-07-07	1	-0/+19
\|
*	[Strings] string.measure (#4775)	Alon Zakai	2022-07-07	1	-0/+18
\|
*	[Strings] Add string.const (#4768)	Alon Zakai	2022-07-06	1	-0/+13
\| \| \| \| \|	This is more work than a typical instruction because it also adds a new section: all the (string.const "foo") strings are put in a new "strings" section in the binary, and the instructions refer to them by index.
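The idea of a strings section referenced by index can be sketched as a small interning table. `StringSectionSketch` is hypothetical, and deduplicating repeated strings is an assumption of this sketch rather than something the commit message states.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical model of a "strings" section: each distinct string is stored
// once, and a string.const instruction would carry only the returned index.
struct StringSectionSketch {
  std::vector<std::string> strings;                  // section contents, in order
  std::unordered_map<std::string, std::uint32_t> indices; // string -> index

  // Returns the index for `s`, appending it to the section on first use.
  std::uint32_t intern(const std::string& s) {
    auto [it, inserted] =
      indices.emplace(s, static_cast<std::uint32_t>(strings.size()));
    if (inserted) {
      strings.push_back(s);
    }
    return it->second;
  }
};
```

Interning `"foo"`, then `"bar"`, then `"foo"` again yields the indices 0, 1, and 0, and the section ends up holding two strings.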
*	[Strings] Add string.new* instructions (#4761)	Alon Zakai	2022-06-29	1	-0/+20
\| \| \| \| \| \|	This is the first instruction from the Strings proposal. This includes everything but interpreter support.
*	First class Data Segments (#4733)	Ashley Nelson	2022-06-21	1	-29/+14
\|	* Updating wasm.h/cpp for DataSegments
\|	* Updating wasm-binary.h/cpp for DataSegments
\|	* Removed link from Memory to DataSegments and updated module-utils, Metrics and wasm-traversal
\|	* checking isPassive when copying data segments to know whether to construct the data segment with an offset or not
\|	* Removing memory member var from DataSegment class as there is only one memory rn. Updated wasm-validator.cpp
\|	* Updated wasm-interpreter
\|	* First look at updating Passes
\|	* Updated wasm-s-parser
\|	* Updated files in src/ir
\|	* Updating tools files
\|	* Last pass on src files before building
\|	* added visitDataSegment
\|	* Fixing build errors
\|	* Data segments need a name
\|	* fixing var name
\|	* ran clang-format
\|	* Ensuring a name on DataSegment
\|	* Ensuring more datasegments have names
\|	* Adding explicit name support
\|	* Fix fuzzing name
\|	* Outputting data name in wasm binary only if explicit
\|	* Checking temp dataSegments vector to validateBinary because it's the one with the segments before we processNames
\|	* Pass on when data segment names are explicitly set
\|	* Ran auto_update_tests.py and check.py, success all around
\|	* Removed an errant semi-colon and corrected a counter. Everything still passes
\|	* Linting
\|	* Fixing processing memory names after parsed from binary
\|	* Updating the test from the last fix
\|	* Correcting error comment
\|	* Impl kripken@ comments
\|	* Impl tlively@ comments
\|	* Updated tests that remove data print when == 0
\|	* Ran clang format
\|	* Impl tlively@ comments
\|	* Ran clang-format
*	Update relaxed SIMD instructions	Thomas Lively	2022-06-07	1	-2/+0
\| \| \| \| \|	Update the opcodes for all relaxed SIMD instructions and remove the unsigned dot product instructions that are no longer in the proposal.
*	Make RefCast safe by default (#4663)	Thomas Lively	2022-05-12	1	-1/+1
\| \| \| \|	This prevents new `RefCast` expressions that don't explicitly have their safety set from getting an uninitialized safety value.
*	Add ref.cast_nop_static (#4656)	Thomas Lively	2022-05-11	1	-0/+5
\| \| \| \| \| \|	This unsafe experimental instruction is semantically equivalent to ref.cast_static, but V8 will unsafely turn it into a nop. This is meant to help us measure cast overhead more precisely than we can by globally turning all casts into nops.
*	Implement relaxed SIMD dot product instructions (#4586)	Thomas Lively	2022-04-11	1	-0/+4
\| \| \|	As proposed in https://github.com/WebAssembly/relaxed-simd/issues/52.