This will be useful in further cone type optimizations.
When the heap types are not subtypes of each other, but a null is possible, the
intersection exists and is a null. That null must be the shared bottom type.
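For illustration, here is a minimal sketch with toy types (the names are
illustrative, not Binaryen's API):

```cpp
#include <optional>

// Toy model: two struct heap types, neither a subtype of the other, plus
// their hierarchy's shared bottom type (`none` in wasm).
enum class Heap { StructA, StructB, Bottom };

struct RefType {
  Heap heap;
  bool nullable;
};

// Intersecting two unrelated reference types: only a null can inhabit both
// sides, and it must be typed with the shared bottom heap type so that it is
// a subtype of each side.
std::optional<RefType> intersectUnrelated(RefType a, RefType b) {
  if (a.nullable && b.nullable) {
    return RefType{Heap::Bottom, true};
  }
  return std::nullopt; // no null possible, so the intersection is empty
}
```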
A cone type is a PossibleContents that has a base type and a depth, and it
contains all subtypes up to that depth. Depth 0 is therefore what we previously
called an exact type.
This only adds cone type computations when combining types; that is, when we
combine two exact types we might get a cone, etc. It does not yet use the cone
info in all places (like struct gets and sets), and it does not yet define roots
of cone types; all of that is left for later. In other words, this is the MVP of
cone types: just enough to add them, pass tests, and test the new functionality.
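A rough sketch of the representation and of combining, using a hypothetical
numeric subtype tree (all names here are illustrative):

```cpp
#include <algorithm>
#include <cstdint>

// Toy subtype tree: type 0 is the root and parentOf[i] is i's immediate
// supertype.
static const uint32_t parentOf[] = {0, 0, 0, 1, 1, 2};

uint32_t depthOf(uint32_t t) {
  uint32_t d = 0;
  while (t != 0) {
    t = parentOf[t];
    ++d;
  }
  return d;
}

struct Cone {
  uint32_t base;  // the base heap type
  uint32_t depth; // how many levels of subtypes are included (0 = exact)
};

// Combine two cones into one that contains both: walk the bases up to their
// least common ancestor, remembering how far each base sank below it.
Cone combine(Cone a, Cone b) {
  uint32_t x = a.base, y = b.base;
  uint32_t dx = depthOf(x), dy = depthOf(y);
  uint32_t liftX = 0, liftY = 0;
  while (dx > dy) { x = parentOf[x]; --dx; ++liftX; }
  while (dy > dx) { y = parentOf[y]; --dy; ++liftY; }
  while (x != y) { x = parentOf[x]; y = parentOf[y]; ++liftX; ++liftY; }
  // The result must reach deep enough to cover both original cones.
  return Cone{x, std::max(liftX + a.depth, liftY + b.depth)};
}

// For example, combining the exact types 3 and 4 (siblings under type 1)
// yields the cone {base = 1, depth = 1}.
```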
These types, `none`, `nofunc`, and `noextern` are uninhabited, so references to
them can only possibly be null. To simplify the IR and increase type precision,
introduce new invariants that all `ref.null` instructions must be typed with one
of these new bottom types and that `Literals` have a bottom type iff they
represent null values. These new invariants require several additional changes.
First, it is now possible that the `ref` or `target` child of a `StructGet`,
`StructSet`, `ArrayGet`, `ArraySet`, or `CallRef` instruction has a bottom
reference type, so it is not possible to determine what heap type annotation to
emit in the binary or text formats. (The bottom types are not valid type
annotations since they do not have indices in the type section.)
To fix that problem, update the printer and binary emitter to emit unreachables
instead of the instruction with undetermined type annotation. This is a valid
transformation because the only possible value that could flow into those
instructions in that case is null, and all of those instructions trap on nulls.
That fix uncovered a latent bug in the binary parser in which new unreachables
within unreachable code were handled incorrectly. This bug was not previously
found by the fuzzer because we generally stop emitting code once we encounter an
instruction with type `unreachable`. Now, however, it is possible to emit an
`unreachable` for instructions that do not have type `unreachable` (but are
known to trap at runtime), so we will continue emitting code. See the new
test/lit/parse-double-unreachable.wast for details.
Update other miscellaneous code that creates `RefNull` expressions and null
`Literals` to maintain the new invariants as well.
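To make the emitter change concrete, here is a hedged sketch with minimal
stand-ins (not Binaryen's real classes or opcodes, except `unreachable`):

```cpp
#include <cstdint>
#include <vector>

enum class Heap : uint8_t { Struct, None /* a bottom type */ };

struct ByteStream {
  std::vector<uint8_t> bytes;
  void writeByte(uint8_t b) { bytes.push_back(b); }
};

constexpr uint8_t UnreachableOp = 0x00; // the real wasm opcode
constexpr uint8_t StructGetOp = 0x01;   // placeholder for the sketch

// If the reference child has a bottom heap type, there is no heap type index
// to emit as the annotation, but the only value that can flow in is a null,
// which traps, so emitting `unreachable` preserves behavior.
void emitStructGet(ByteStream& out, Heap refHeap, uint8_t typeIndex) {
  if (refHeap == Heap::None) {
    out.writeByte(UnreachableOp);
    return;
  }
  out.writeByte(StructGetOp);
  out.writeByte(typeIndex); // an LEB128 type index in the real format
}
```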
These comparisons were showing up as warnings with the latest emcc/clang. I
only noticed because I ran `ninja` to build everything, whereas normally we
only test the `binaryen` target under emscripten.
Just an automatic test that A has an intersection with the combination A + B,
for the contents in our unit test.
Avoid manually doing bitshifts etc. - leave combining to the core hash
logic, which can do a better job.
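Something along these lines (the helper names are assumptions in the style of
Boost's hash_combine, not necessarily the exact helpers in support/hash.h):

```cpp
#include <cstddef>
#include <functional>

// One central mixing step: callers hash a field and combine it here rather
// than shifting and xoring by hand at each call site.
inline void hashCombine(std::size_t& digest, std::size_t other) {
  digest ^= other + 0x9e3779b97f4a7c15ULL + (digest << 6) + (digest >> 2);
}

template<typename T>
inline void rehash(std::size_t& digest, const T& value) {
  hashCombine(digest, std::hash<T>{}(value));
}
```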
We compared types and not heap types, so a difference in nullability
confused us. But at that point in the code, we've ruled out nulls, so we
should focus on heap types only.
If the PossibleContents for the two sides have no possible intersection then the
result must be 0.
Fixes #5041
Just like `extern` is no longer a subtype of `any` in the new GC type system,
`func` is no longer a subtype of `any`, either. Make that change in our type
system implementation and update tests and fuzzers accordingly.
RTTs were removed from the GC spec, and if they are added back in the future,
they will be heap types rather than value types as in our implementation.
Updating our implementation to make RTTs heap types would have been more work
than deleting them, for questionable benefit, since we don't know how long it
will be before they are specced again.
We already require non-null literals to have non-null types, but with this
change we can enforce that constraint by construction. Also remove the default
behavior of creating a function reference literal with heap type `func`, since
there is always a more specific function type to use.
This tracks the possible contents in the entire program all at once using a
single IR. That is in contrast to, say, DeadArgumentElimination or
LocalRefining etc., all of which look at one particular aspect of the program
(function params and returns in DAE, locals in LocalRefining). One cost is
building up an entire new IR, which takes a lot of new code (mostly in the
already-landed PossibleContents). Another cost is that this new IR is very big
and requires a lot of time and memory to process.
The benefit is that this can find opportunities that are only obvious when
looking at the entire program, and it can also track information that is more
specialized than the normal type system in the IR. In particular, it can track
an ExactType, the case where we know the value is of a particular type exactly
and not a subtype.
Basic reference types like `Type::funcref`, `Type::anyref`, etc. made it easy to
accidentally forget to handle reference types with the same basic HeapTypes but
the opposite nullability. In principle there is nothing special about the types
with shorthands except in the binary and text formats. Removing these shorthands
from the internal type representation by removing all basic reference types
makes some code more complicated locally, but simplifies code globally and
encourages properly handling both nullable and non-nullable reference types.
This pulls out the core PossibleContents and ContentOracle classes from the
very large #4598, making a smaller PR that can be reviewed first. This includes
unit tests for the code, but comprehensive testing will only appear in the later
PR, when a new pass is added that uses all this.
PossibleContents tracks the possible contents at particular locations in the
program. It can track constant values as well as "this must contain this
exact type", which is more than wasm itself can indicate.
*Location structs are provided to declare locations in the wasm, like the
location of a local or of a function argument.
ContentOracle analyzes the entire program, and can then map a Location
to the PossibleContents there, which a later pass will use to optimize.
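As a usage sketch (the signatures and the exact location struct here are
assumptions based on this description, not the actual headers):

```cpp
// Given a Module& wasm and a Function* func: analyze the entire program once.
ContentOracle oracle(wasm);

// Ask what can possibly be in the first parameter of some function.
PossibleContents contents = oracle.getContents(ParamLocation{func, 0});
if (contents.isConstant()) {
  // A later pass could replace reads of that parameter with the constant.
}
```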
Apply cleanups suggested by aheejin in post-merge code review of previous
parser PRs.
Implement the basic infrastructure for the full WAT parser with just enough
detail to parse basic modules that contain only imported globals. Parsing
functions correspond to elements of the grammar in the text specification and
are templatized over context types that correspond to each phase of parsing.
Errors are explicitly propagated via `Result<T>` and `MaybeResult<T>` types.
Follow-on PRs will implement additional phases of parsing and parsing for new
elements in the grammar.
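A simplified sketch of the propagation style (the real parser's types are more
elaborate, and `MaybeResult<T>` additionally allows an "absent" state for
optional grammar elements):

```cpp
#include <string>
#include <string_view>
#include <utility>
#include <variant>

struct Err {
  std::string msg;
};

// Either a parsed value or an error.
template<typename T>
using Result = std::variant<T, Err>;

Result<int> parseInt(std::string_view& in); // assumed primitive parser

Result<std::pair<int, int>> parseIntPair(std::string_view& in) {
  auto a = parseInt(in);
  if (auto* err = std::get_if<Err>(&a)) {
    return *err; // propagate the error explicitly, with no exceptions
  }
  auto b = parseInt(in);
  if (auto* err = std::get_if<Err>(&b)) {
    return *err;
  }
  return std::pair{std::get<int>(a), std::get<int>(b)};
}
```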
Add methods to `Token` for determining whether the token can be interpreted as a
particular token type, returning the interpreted value as appropriate. These
methods perform additional bounds checks for integers and NaN payloads that
could not be done during the initial lexing because the lexer did not know what
the intended token type was. The float methods also reinterpret integer tokens
as floating point tokens, since the float grammar is a superset of the integer
grammar, and they inject NaN payloads into parsed NaN values.
Move all bounds checking to these new classifier functions to have it in one
place.
Previously we were tracking whether integer tokens were signed but we did not
differentiate between positive and negative signs. Unfortunately, without
differentiating them, there's no way to tell the difference between an in-bounds
negative integer and a wildly out-of-bounds positive integer when trying to
perform bounds checks for s32 tokens. Fix the problem by tracking not only
whether there is a sign on an integer token, but also what the sign is.
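Concretely, a sketch of the sign-aware s32 bounds check (names assumed):

```cpp
#include <cstdint>
#include <optional>

enum class Sign { None, Pos, Neg };

// The lexed integer is a magnitude plus the sign that appeared in the
// source. Without the sign, the magnitude 2147483648 could be either the
// in-bounds "-2147483648" or the out-of-bounds "+2147483648".
std::optional<int32_t> asS32(uint64_t magnitude, Sign sign) {
  if (sign == Sign::Neg) {
    if (magnitude > 0x80000000ull) {
      return std::nullopt; // below INT32_MIN
    }
    return static_cast<int32_t>(-static_cast<int64_t>(magnitude));
  }
  if (magnitude > 0x7fffffffull) {
    return std::nullopt; // above INT32_MAX
  }
  return static_cast<int32_t>(magnitude);
}
```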
wat-parser-internal.h was already quite large after implementing just the lexer,
so it made sense to rename it to be lexer-specific and start a new file for the
higher-level parser. Also make it a proper .cpp file and split the testable
interface out into wat-lexer.h.
Rather than trying to actually implement the parsing of float values, which
cannot be done naively due to precision concerns, just parse the float grammar
then postprocess the parsed text into a form we can pass to `strtod` to do the
actual parsing of the value.
Since the float grammar reuses `num` and `hexnum` from the integer grammar but
does not care about overflow, add a mode to `LexIntCtx`, `num`, and `hexnum` to
allow parsing overflowing numbers.
For NaNs, store the payload as a separate value rather than as part of the
parsed double. The payload will be injected into the NaN at a higher level of
the parser once we know whether we are parsing an f64 or an f32 and therefore
know what the allowable payload values are.
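The postprocessing can be as simple as the following sketch (here only the '_'
digit separators need stripping; the real code handles more cases):

```cpp
#include <cstdlib>
#include <string>

double parseFloatToken(const std::string& text) {
  // Remove '_' separators, which the float grammar allows but strtod does
  // not understand.
  std::string cleaned;
  cleaned.reserve(text.size());
  for (char c : text) {
    if (c != '_') {
      cleaned += c;
    }
  }
  // strtod natively handles decimal and hex float syntax like "0x1.8p3".
  return std::strtod(cleaned.c_str(), nullptr);
}
```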
Also include reserved words that look like keywords to avoid having to find and
enumerate all the valid keywords. Invalid keywords will be rejected at a higher
level in the parser instead.
Begin implementing a new text format parser that will accept the
standard text format. Start with a lexer that can iterate over
tokens in an underlying text buffer. The initial supported tokens
are integers, parentheses, and whitespace including comments.
The implementation is in a new private internal header so it can
be included into a gtest source file even though it is not meant
to be a public API. Once the parser is more complete, there will
be an additional public header exposing a more concise public API
and the private header will be included into a source file that
implements that public API.
The new parser will improve on the existing text format parser
not only because it will accept the full standard text format,
but also because its code will be simpler and easier to maintain
and because it will hopefully be faster as well. The new parser
will be built out of small functions that closely mirror the
grammar productions given in the spec and will heavily use C++17
features like string_view, optional, and variant to provide more
self-documenting and efficient code.
Future PRs will add support for lexing other kinds of tokens
followed by support for parsing more complex constructs.
There's no reason not to allow growing by zero slots, but previously doing so
would trigger an assertion. This caused a crash when roundtripping a trivial
module.
Fixes #4667.
This also includes the type itself in the returned vector. This will be
useful in a future PR.
Remove `Type::externref` and `HeapType::ext` and replace them with uses of
anyref and any, respectively, now that we have unified these types in the GC
proposal. For backwards compatibility, continue to parse `extern` and
`externref` and maintain their relevant C API functions.
This reverts commit 40a998c00eb42b65ddc1d42c1c009690bbd05cca.
I don't know what exactly was causing this test to flake, but since it was
disabled we added the type fuzzer and fixed a lot of bugs, so I hope it is no
longer flaky. If that turns out to be wrong, I can dig deeper.
Isorecursive typing is the future standard type system, so change generic type
tests to use it.
Allow the fixture to be shared with other test .cpp files we might add in the
future.
Allow IndexedTypeNameGenerator to be configured with a custom prefix and also
allow it to be parameterized with an explicit fallback generator. This allows
multiple IndexedTypeNameGenerators to be composed together, for example.
We were already eagerly canonicalizing basic HeapTypes when building types so
the more complicated canonicalization algorithms would not have to handle
noncanonical heap types, but we were not doing the same for Types. Equirecursive
canonicalization was properly handling noncanonical Types everywhere, but
isorecursive canonicalization was not. Rather than update isorecursive
canonicalization in multiple places, remove the special handling from
equirecursive canonicalization and canonicalize types in a single location.
Validating nominally declared supertypes depends on being able to test type
equality, which in turn depends on the compared types having already been
canonicalized. Previously we validated supertypes before canonicalization, so
validation would fail in cases where it should have succeeded. Fix the bug by
canonicalizing first. This means that the global type store can now end up
holding invalid types, but those types will never be exposed to the user, so
that's not a huge problem.
Also fix an unrelated bug that was preventing the test from passing, which is
that supertypes were being hashed and compared without the special logic for
detecting self-references. This bug prevented the equivalent recursion groups
in the test from being deduplicated.
The IsorecursiveTest.CanonicalizeSelfReferences has been frequently failing on
Windows and MacOS CI. Disable it for now until I can investigate thoroughly.
Update the HeapType constructors that take Signature, Structs, and Arrays to
work properly with isorecursive typing. This is particularly important for the
Signature constructor, which is used frequently throughout the code base.
Isorecursive canonicalization is similar to equirecursive canonicalization in
that it deduplicates types based on their structure, but unlike equirecursive
canonicalization, isorecursive canonicalization does not need to minimize the
type definition first. Another difference is that structures are deduplicated at
the level of recursion groups rather than individual types, so we cannot reuse
the existing `FiniteShapeHasher` and `FiniteShapeEquator` utilities. Instead,
introduce a new `RecGroupStructure` wrapper that provides equality comparison
and hashing very similar to the former utilities.
Another feature of the isorecursive type system is that its constraints on the
order of recursion groups mean that they are already topologically sorted and
can be incrementally canonicalized in a bottom-up manner. This incremental
canonicalization means that the `RecGroupStructure` utility can assume that all
child `HeapTypes` have already been canonicalized and can avoid traversing
anything but the top-level type definitions in the wrapped recursion group. The
only exception is self-references into the wrapped recursion group itself, which
may not be canonical. That special case is detected and handled without any
nontrivial extra work.
Overall, the canonicalization algorithm traverses each type definition once to
canonicalize its `HeapType` use sites, then once to hash its recursion group
structure, then finally once to canonicalize its `Type` use sites in
`globallyCanonicalize`, giving only linear time complexity.
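A simplified sketch of the group-level hashing with the self-reference special
case (the data layout here is assumed, not the real representation):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

inline void hashCombine(std::size_t& d, std::size_t v) {
  d ^= v + 0x9e3779b97f4a7c15ULL + (d << 6) + (d >> 2);
}

struct RecGroup {
  std::vector<uintptr_t> memberIds;             // identity of each member
  std::vector<std::vector<uintptr_t>> children; // child type ids per member

  // Returns true and sets `index` if `child` points into this group.
  bool localIndexOf(uintptr_t child, std::size_t& index) const {
    for (std::size_t i = 0; i < memberIds.size(); ++i) {
      if (memberIds[i] == child) {
        index = i;
        return true;
      }
    }
    return false;
  }
};

std::size_t hashRecGroup(const RecGroup& group) {
  std::size_t digest = group.memberIds.size();
  for (const auto& childList : group.children) {
    for (uintptr_t child : childList) {
      std::size_t local;
      bool isSelfRef = group.localIndexOf(child, local);
      // Children outside the group are canonical already: hash their
      // identity without traversing them. Self-references hash by local
      // index, so isomorphic groups get equal hashes.
      hashCombine(digest, isSelfRef);
      hashCombine(digest, isSelfRef ? local : std::size_t(child));
    }
  }
  return digest;
}
```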
Storing the rec group index on the HeapTypeInfo avoids having to do a linear
scan through the rec group to find the index for a particular type. This will
be important for isorecursive canonicalization, which uses rec group indices.
When building isorecursive types, validate their relationships according to the
rules described in https://github.com/WebAssembly/gc/pull/243. Specifically,
supertypes must be declared before their subtypes to statically prevent cycles
and child types must be declared either before or in the same recursion group as
their parents.
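Sketched as a check over declaration indices and group numbers (the data
layout here is assumed):

```cpp
#include <cstddef>
#include <optional>
#include <vector>

struct TypeDecl {
  std::optional<std::size_t> super;  // declared supertype, if any
  std::vector<std::size_t> children; // referenced child types
};

// groupOf[i] is the recursion group containing type i.
bool validate(const std::vector<TypeDecl>& types,
              const std::vector<std::size_t>& groupOf) {
  for (std::size_t i = 0; i < types.size(); ++i) {
    // Supertypes must be declared strictly before their subtypes, which
    // statically rules out supertype cycles.
    if (types[i].super && *types[i].super >= i) {
      return false;
    }
    // Children must be declared before or in the same recursion group.
    for (std::size_t child : types[i].children) {
      if (groupOf[child] > groupOf[i]) {
        return false;
      }
    }
  }
  return true;
}
```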
It is possible for type building to fail, for example if the declared nominal
supertypes form a cycle or are structurally invalid. Previously we would report
a fatal error and kill the program from inside `TypeBuilder::build()` in these
situations, but this handles errors at the wrong layer of the code base and is
inconvenient for testing the error cases.
In preparation for testing the new error cases introduced by isorecursive
typing, make type building fallible and add new tests for existing error cases.
Also fix supertype cycle detection, which it turns out did not work correctly.
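The shape of a fallible interface might look like this sketch (names and error
kinds assumed, not the actual API):

```cpp
#include <cstddef>
#include <variant>
#include <vector>

enum class TypeError { SelfSupertype, InvalidSupertype, ForwardReference };

struct BuildError {
  std::size_t index; // which entry in the builder failed
  TypeError reason;
};

// Either the built types or a description of what went wrong.
template<typename HeapType>
using BuildResult = std::variant<std::vector<HeapType>, BuildError>;
```

A test can then `std::get_if<BuildError>` on the result and assert on the
failing index and reason instead of watching the process die.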
Add a utility class for defining all the common operations like pre- and post-
increment and decrement, addition and subtraction, and assigning addition and
subtraction for iterators that are composed of a parent object and an index
into that parent object. Use the new utility to reduce the boilerplate in
wasm-type.h. Add a new test of the iterator behavior.
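A sketch of the CRTP shape (details assumed): the base supplies the index
arithmetic and comparisons, and the derived iterator only provides the parent,
the index, and dereferencing.

```cpp
#include <cstddef>
#include <vector>

template<typename Derived>
struct ParentIndexIterator {
  std::size_t index = 0;

  Derived& self() { return static_cast<Derived&>(*this); }
  const Derived& self() const { return static_cast<const Derived&>(*this); }

  Derived& operator++() { ++index; return self(); }
  Derived& operator--() { --index; return self(); }
  Derived& operator+=(std::ptrdiff_t n) { index += n; return self(); }
  Derived operator+(std::ptrdiff_t n) const {
    Derived d = self();
    d.index += n;
    return d;
  }
  bool operator==(const Derived& other) const {
    return self().parent == other.parent && index == other.index;
  }
  bool operator!=(const Derived& other) const { return !(*this == other); }
};

// Example: iterating the fields of a struct-like parent.
struct Fields {
  std::vector<int> fields;

  struct Iterator : ParentIndexIterator<Iterator> {
    const Fields* parent = nullptr;
    int operator*() const { return parent->fields[index]; }
  };

  Iterator begin() const { return Iterator{{0}, this}; }
  Iterator end() const { return Iterator{{fields.size()}, this}; }
};
```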
Add a `destroyAllTypes` function to clear the global state of the type system
and use it in a custom gtest test fixture to ensure that each test starts and
ends with a fresh state.
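For example (a minimal sketch, not the actual fixture; the header and
namespace for `destroyAllTypes` are assumptions):

```cpp
#include "wasm-type.h" // assumed home of destroyAllTypes()

#include "gtest/gtest.h"

class TypeTest : public ::testing::Test {
protected:
  // Every test begins and ends with a fresh global type system state.
  void SetUp() override { wasm::destroyAllTypes(); }
  void TearDown() override { wasm::destroyAllTypes(); }
};

TEST_F(TypeTest, StartsFresh) {
  // Test body runs with no leftover types from other tests.
}
```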
Add gtest as a git submodule in third_party and integrate it into the build the
same way WABT does. Adds a new executable, `binaryen-unittests`, to execute
`gtest_main`. As a nontrivial example test, port one of the `TypeBuilder` tests
from example/ to gtest/.
Using gtest has a number of advantages over the current example tests:
- Tests are compiled and linked at build time rather than runtime, surfacing
errors earlier and speeding up test execution.
- Tests are all built into a single binary, reducing overall link time and
further reducing test overhead.
- Tests are built from the same CMake project as the rest of Binaryen, so
compiler settings (e.g. sanitizers) are applied uniformly rather than having
to be separately set via the COMPILER_FLAGS environment variable.
- Using the industry-standard gtest rather than our own script reduces our
maintenance burden.
Using gtest will lower the barrier to writing C++ tests and will hopefully lead
to us having more proper unit tests.