forks/binaryen.git -

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[Outlining] Filter branches & returns (#6024)	Ashley Nelson	2023-10-18	1	-56/+95
\| \| \|	Adds a function to filter branch and return instructions from being included in potential outlining candidates.
*	[Outlining] Filter Local Set (#6018)	Ashley Nelson	2023-10-18	1	-1/+42
\| \| \| \| \|	Adds a general purpose walker named FilterStringifyWalker, intended to walk control flow and take note of whether any of the expressions satisfy the condition. Also includes an << overload for SuffixTree::RepeatedSubstring to make debugging easier.
*	Added type to fix template deduction failure (#6014)	Ashley Nelson	2023-10-17	1	-2/+1
\| \| \|	Fixes #6003
*	Add getGeneralSuperType() that includes basic supers, and use in fuzzer (#6005)	Alon Zakai	2023-10-17	2	-3/+77
\| \| \| \|	With this, the fuzzer can replace e.g. an eq expression with a specific struct type, because now it is aware that struct types have eq as their ancestor.
*	[NFC] Rename getSuperType to getDeclaredSuperType (#6015)	Alon Zakai	2023-10-17	1	-3/+3
\| \| \| \|	A later PR will add getSuperType which will mean "get the general super type - either declared, or not".
*	[Outlining] Adds separator context (#5977)	Ashley Nelson	2023-10-04	1	-39/+67
\| \| \|	Adds a std::variant to represent the context of why a unique symbol was inserted in the stringified module. This allows us to pass necessary contextual data to subclasses of StringifyWalker in a structured manner.
*	[Outlining] Adds SuffixTree::RepeatedSubstring dedupe test (#5972)	Ashley Nelson	2023-10-04	1	-0/+17
\| \| \|	This PR adds a StringProcessor struct intended to hold functions that filter vectors of SuffixTree::RepeatedSubstring, and a test of its first functionality, removing overlapping repeated substrings.
*	Outlining Test Improvements (#5971)	Ashley Nelson	2023-09-25	1	-13/+21
\| \| \|	Abstracts repeated code.
*	[NFC] Split the new wat parser into multiple files (#5960)	Thomas Lively	2023-09-19	1	-1/+1
\| \| \| \| \| \|	And put the new files in a new source directory, "parser". This is a rough split and is not yet expected to dramatically improve compile times. The exact organization of the new files is subject to change, but this splitting should be enough to make further parser development more pleasant.
*	Remove legacy type definition text syntax (#5948)	Thomas Lively	2023-09-18	1	-7/+7
\| \| \| \| \| \| \|	Remove support for the "struct_subtype", "array_subtype", "func_subtype", and "extends" notations we used at various times to declare WasmGC types, leaving only support for the standard text format for declaring types. Update all the tests using the old formats and delete tests that existed solely to test the old formats.
*	Make final types the default (#5918)	Thomas Lively	2023-09-09	2	-68/+53
\| \| \| \| \| \| \| \| \| \| \| \| \|	Match the spec and parse the shorthand binary and text formats as final and emit final types without supertypes using the shorthands as well. This is a potentially-breaking change, since the text and binary shorthands can no longer be used to define types that have subtypes. Also make TypeBuilder entries final by default to better match the spec and update the internal APIs to use the "open" terminology rather than "final" terminology. Future changes will update the text format to use the standard "sub open" rather than the current "sub final" keywords. The exception is the new wat parser, which supports "sub open" as of this change, since it didn't support final types at all previously.
*	Lattice to model Stack (#5849)	Bruce He	2023-08-03	1	-0/+98
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change introduces StackLattice, a lattice to model stack-related behavior. It is templated on a separate lattice whose elements model some property of individual values on the stack. The StackLattice allows users to access the top of the stack, push abstract values, and pop them. Comparisons and least upper bound operations are done on a value by value basis starting from the top of the stack and moving toward the bottom. This is because it allows stacks from different scopes to be joined easily. An application of StackLattice is to model the wasm value stack. The goal is to organize lattice elements representing individual stack values in a natural way which mirrors the wasm value stack. Transfer functions operate on each stack value individually. The stack lattice is an intermediate structure which is not intended to be directly operated on. Rather, it simulates the push and pop behavior of instructions.
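The top-down join described above can be illustrated with a minimal sketch. This is not the StackLattice API from the commit: `MaxLattice` and `ToyStackLattice` are hypothetical names, the element lattice (integers joined by maximum) is an assumption chosen for brevity, and keeping the unmatched bottom values of the deeper stack unchanged is one plausible reading of "stacks from different scopes".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy element lattice (an assumption for illustration): integers ordered
// by <=, so the least upper bound of two values is their maximum.
struct MaxLattice {
  using Element = int;
  Element join(Element a, Element b) const { return std::max(a, b); }
};

// A stack of abstract values. The back of the vector is the top of the stack.
template<typename L> struct ToyStackLattice {
  using Element = std::vector<typename L::Element>;
  L lattice;

  void push(Element& stack, typename L::Element value) const {
    stack.push_back(value);
  }

  typename L::Element pop(Element& stack) const {
    assert(!stack.empty() && "cannot pop an empty stack");
    auto value = stack.back();
    stack.pop_back();
    return value;
  }

  // Join value by value, starting at the top and moving toward the bottom.
  // Values below the bottom of the shorter stack are kept unchanged.
  Element join(const Element& a, const Element& b) const {
    const Element& longer = a.size() >= b.size() ? a : b;
    const Element& shorter = a.size() >= b.size() ? b : a;
    Element result = longer;
    std::size_t offset = longer.size() - shorter.size();
    for (std::size_t i = 0; i < shorter.size(); ++i) {
      result[offset + i] = lattice.join(longer[offset + i], shorter[i]);
    }
    return result;
  }
};
```

Joining a three-value stack with a two-value stack in this sketch aligns the two tops, joins the top two positions pairwise, and carries the deepest value of the longer stack through untouched.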
*	[Outlining] Integration test for suffix_tree and stringify (#5839)	Ashley Nelson	2023-08-03	1	-3/+40
\| \| \|	Adds an integration test that identifies the substrings of a stringified wasm module using the suffix_tree.
*	[NFC] Move ModuleUtils copying and renaming logic from header to cpp (#5855)	Alon Zakai	2023-08-02	1	-0/+1
\| \| \| \| \| \| \| \|	None of that code is speed-sensitive, or at least doesn't need to be inlined to be fast. Move it to cpp for faster compile times. This caused a cascade of necessary header fixes (i.e. after removing unneeded header inclusions in module-utils.h, files that improperly depended on that stopped working and needed an added include).
*	GUFA: Infer using TrapsNeverHappen (#5850)	Alon Zakai	2023-08-02	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a TrapsNeverHappen oracle that is used inside the main PossibleContents oracle of GUFA. The idea is that when traps never happen we can reason "backwards" from information to things that must be true before it: `temp = x.field; x.cast_to<Y>(); // Y is a subtype of x's type X`. Here we cast x to a subtype. If we assume traps never happen then the cast must succeed, and that means we can assume we had a Y on the previous line, where perhaps that information lets us infer the value of x.field. This PR focuses on calls, which are the more interesting situation to optimize because other passes do some work already inside functions. Specifically, we look for things that will trap in the called function or the caller. For example, if the called function always casts a param to some type, we can assume the caller passes such a type in. And if we have a call_ref then any target that would trap cannot be called (at least in a closed world). This has some benefits, in particular when combined with --gufa-cast-all since that casts more things, which lets us apply the inferences made here. I see 3.3% fewer call_ref instructions on a Kotlin testcase, for example. This helps more on -Os when we inline less.
*	PossibleContents: Support more intersection types (#5847)	Alon Zakai	2023-07-31	1	-7/+21
\|
*	Convert lattice compare function to non-static (#5848)	Bruce He	2023-07-28	1	-3/+2
\| \| \| \| \|	Changes the static asserts checking a lattice type to require a non-static compare function instead of a static one. Also changes the compare functions of existing lattices to be non-static.
*	End the current basic block on a Call (#5823)	Alon Zakai	2023-07-26	1	-0/+45
\| \| \| \| \| \| \| \| \| \| \| \| \|	Before this PR, if a call had no paths to a catch in the same function then we skipped creating a new basic block right after it. As a result, we could have a call in the middle of a basic block. If EH is enabled that means we might transfer control flow out of the function from the middle of a block. But it is better to have the property that any transfer of control flow - to another basic block, or outside of the function - can only happen at the end of a basic block. This causes some overhead, but a subsequent PR (#5838) will remove that as a followup, and this PR adds a little code to pass the module and check if EH is enabled, and avoid the overhead if not, which at least avoids regressing the non-EH case until that followup lands.
*	[Outlining] Changing stringify values to 32-bit (#5832)	Ashley Nelson	2023-07-21	1	-45/+45
\| \| \| \| \|	The LLVM suffix tree expects to be provided with a vector of 32-bit unsigned integers. This PR makes it easier to integrate our wasm program string with the suffix tree. Because the range of possible values is reduced from 2^64 to 2^32, a signed integer was added to manage the next separator value and ensure we're using every possible negative number.
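One way to picture the signed separator counter is the sketch below. It is an illustration only: `SymbolAllocator` and its members are hypothetical names, and the split of the 32-bit range into an "ordinary symbol" half and a "separator" half is an assumption, not something the commit message states.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical allocator for 32-bit symbols. Ordinary symbols count up from
// 0, while separators come from a signed counter that counts down from -1
// and is reinterpreted as an unsigned 32-bit value.
struct SymbolAllocator {
  uint32_t nextSymbol = 0;
  int32_t nextSeparator = -1;
  bool separatorsExhausted = false;

  uint32_t makeSymbol() {
    // Assumption: ordinary symbols stay in the non-negative half.
    assert(nextSymbol <=
           static_cast<uint32_t>(std::numeric_limits<int32_t>::max()));
    return nextSymbol++;
  }

  uint32_t makeSeparator() {
    assert(!separatorsExhausted && "ran out of separator values");
    uint32_t separator = static_cast<uint32_t>(nextSeparator);
    // Stop at INT32_MIN rather than decrementing past it, which would be
    // signed overflow.
    if (nextSeparator == std::numeric_limits<int32_t>::min()) {
      separatorsExhausted = true;
    } else {
      --nextSeparator;
    }
    return separator;
  }
};
```

Under these assumptions every negative `int32_t` value is handed out exactly once as a separator, and no separator can collide with an ordinary symbol.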
*	Add support for debug printing of functions (#5828)	Alon Zakai	2023-07-20	2	-0/+45
\|
*	Static Analysis: Add an API to get the block index of an expression (#5822)	Alon Zakai	2023-07-20	1	-0/+40
\|
*	Reaching Definitions Analysis for LocalGraph (#5817)	Bruce He	2023-07-19	1	-0/+205
\| \| \| \| \| \| \| \| \| \| \|	This change implements a reaching definitions analysis which is intended to be equivalent to the information provided by LocalGraph, specifically the Flower class of LocalGraph. It also introduces a CRTP utility in visitor-transfer-function.h which implements most commonly found visitor-type transfer function functionalities. The MonotoneCFGAnalyzer is also modified to add a phase to collect results after the analysis is solved from the final CFG states.
*	[Outlining] LLVM Suffix Tree (#5821)	Ashley Nelson	2023-07-18	2	-0/+156
\| \| \| \| \|	This PR adds LLVM's suffix tree data structure to Binaryen. This suffix tree is implemented using Ukkonen's algorithm for linear-time suffix tree construction, and is intended for fast substring queries. Note: All of the .h and .cpp files included are from LLVM. These files were copied directly instead of imported into our existing LLVM integration (in third_party/) to avoid bumping the commit hash and avoid the potential for complications with upstream changes.
*	[Outlining] HashStringifyWalker (#5810)	Ashley Nelson	2023-07-13	1	-2/+93
\| \| \| \| \|	This PR is part of the effort to bring Outlining to Binaryen. HashStringifyWalker is a subclass of StringifyWalker #5772, and used to encode a wasm module as a "string". Our "string" is a vector and each symbol is a uint64_t, providing us with a capacity of 2^64 symbols.
*	Generalize Finite Powerset Lattice to Any Given Type (#5800)	Bruce He	2023-07-06	1	-6/+47
\| \| \| \| \| \| \| \| \| \|	This change creates a lattice, FinitePowersetLattice, to represent finite powerset lattices constructed from sets containing members of arbitrary type. The bitvector finite powerset lattice class is renamed FiniteIntPowersetLattice. The templated FinitePowersetLattice class contains a FiniteIntPowersetLattice object, and over that provides functionality to map lattice element members to bitvector indices. Methods are also provided by the lattice to add or remove members of the given type from lattice members as an abstraction over flipping bits in the bitvector.
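The member-to-index mapping can be sketched roughly as follows. `ToyPowersetLattice` and its method names are hypothetical and do not reproduce the interface of the classes named in the commit; the sketch only shows the idea of hiding bit flips behind operations on members.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Toy powerset lattice over members of type T. Each member is assigned a
// fixed index into a bitvector when the lattice is constructed.
template<typename T> class ToyPowersetLattice {
  std::unordered_map<T, std::size_t> indices;

public:
  using Element = std::vector<bool>;

  explicit ToyPowersetLattice(const std::vector<T>& members) {
    for (const auto& member : members) {
      // Duplicate members keep their first index.
      indices.emplace(member, indices.size());
    }
  }

  // The bottom element is the empty set.
  Element getBottom() const { return Element(indices.size(), false); }

  // Adding or removing a member is a bit flip at the member's index.
  void add(Element& element, const T& member) const {
    element[indices.at(member)] = true;
  }
  void remove(Element& element, const T& member) const {
    element[indices.at(member)] = false;
  }
  bool contains(const Element& element, const T& member) const {
    return element[indices.at(member)];
  }

  // The least upper bound of two sets is their union.
  Element join(const Element& a, const Element& b) const {
    assert(a.size() == b.size());
    Element result(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
      result[i] = a[i] || b[i];
    }
    return result;
  }
};
```

A caller works purely in terms of members (for instance local names) and never touches the indices directly.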
*	Make the Transfer Function a Generic, Interchangeable Type for the Monotone Analyzer (#5794)	Bruce He	2023-07-06	1	-19/+19
\| \| \| \| \| \| \|	This change makes the transfer function an object separate from the monotone analyzer. The transfer function class type is a generic template of the monotone analyzer, and an instance of the transfer function is instantiated and then passed into the monotone analyzer via its constructor. The API of the transfer function is enforced by a static assertion. This change also introduces LivenessTransferFunction as a transfer function for liveness analysis as an example.
*	Initial support for `final` types (#5803)	Thomas Lively	2023-07-06	1	-0/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement support in the type system for final types, which are not allowed to have any subtypes. Final types are syntactically different from similar non-final types, so type canonicalization is made aware of finality. Similarly, TypeMerging and TypeSSA are updated to work correctly in the presence of final types as well. Implement binary and text parsing and emitting of final types. Use the standard text format to represent final types and interpret the non-standard "struct_subtype" and friends as non-final. This allows a graceful upgrade path for users currently using the non-standard text format, where they can update their code to use final types correctly at the point when they update to use the standard format. Once users have migrated to using the fully expanded standard text format, we can update Binaryen's parsers to interpret the MVP shorthands as final types to match the spec without breaking those users. To make it safe for V8 to independently start interpreting types declared without `sub` as final, also reserve that shorthand encoding only for types that have no strict subtypes.
*	[NFC] Fix the use of "strict" in subtypes.h (#5804)	Thomas Lively	2023-07-06	1	-3/+3
\| \| \| \| \| \| \| \|	Previously we incorrectly used "strict" to mean the immediate subtypes of a type, when in fact a strict subtype of a type is any subtype excluding the type itself. Rename the incorrect `getStrictSubTypes` to `getImmediateSubTypes`, rename the redundant `getAllStrictSubTypes` to `getStrictSubTypes`, and rename the redundant `getAllSubTypes` to `getSubTypes`. Fixing the capitalization of "SubType" to "Subtype" is left as future work.
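The distinction between immediate, strict, and all subtypes can be shown on a toy hierarchy. The free functions and the `SuperMap` alias below are hypothetical and only mirror the terminology of the rename; they are not the subtypes.h API.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy hierarchy: maps each type name to its declared supertype.
using SuperMap = std::map<std::string, std::string>;

// Immediate subtypes: types whose declared supertype is exactly `type`.
std::set<std::string> immediateSubtypes(const SuperMap& supers,
                                        const std::string& type) {
  std::set<std::string> result;
  for (const auto& entry : supers) {
    if (entry.second == type) {
      result.insert(entry.first);
    }
  }
  return result;
}

// Strict subtypes: every subtype at any depth, excluding the type itself.
std::set<std::string> strictSubtypes(const SuperMap& supers,
                                     const std::string& type) {
  std::set<std::string> result;
  std::vector<std::string> work{type};
  while (!work.empty()) {
    std::string curr = work.back();
    work.pop_back();
    for (const auto& sub : immediateSubtypes(supers, curr)) {
      if (result.insert(sub).second) {
        work.push_back(sub);
      }
    }
  }
  return result;
}

// Subtypes: the strict subtypes plus the type itself.
std::set<std::string> subtypes(const SuperMap& supers,
                               const std::string& type) {
  std::set<std::string> result = strictSubtypes(supers, type);
  result.insert(type);
  return result;
}
```

With `B <: A`, `C <: B`, and `D <: A`, the immediate subtypes of `A` are `B` and `D`, its strict subtypes also include `C`, and its subtypes additionally include `A` itself.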
*	[Outlining] StringifyWalker (#5772)	Ashley Nelson	2023-06-30	2	-0/+128
\| \| \|	StringifyWalker is a new Walker with UnifiedExpressionVisitor. This walker performs a shallow visit of control-flow (try, if, block, loop) expressions and their simple expression siblings before then visiting the children of each control-flow expression in postorder. As a result, this walker un-nests nested control flow structures, so the expression visit order does not correspond to a normal postorder traversal of the function.
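The un-nesting visit order can be approximated with a queue over a toy tree. This is a simplified model, not the StringifyWalker implementation: `Node` and `stringifyOrder` are hypothetical, and the postorder handling of operands inside each simple expression is left out.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Toy expression: control-flow nodes (block, loop, if, try) own a list of
// child expressions; simple expressions have no children here.
struct Node {
  std::string name;
  bool isControlFlow = false;
  std::vector<Node> children;
};

// Visit the direct children of one control-flow scope shallowly, deferring
// the contents of any nested control flow until the current scope is done.
std::vector<std::string> stringifyOrder(const Node& body) {
  std::vector<std::string> order;
  std::queue<const Node*> pending;
  pending.push(&body);
  while (!pending.empty()) {
    const Node* scope = pending.front();
    pending.pop();
    for (const Node& child : scope->children) {
      order.push_back(child.name);
      if (child.isControlFlow) {
        pending.push(&child);
      }
    }
  }
  return order;
}
```

For a body containing `a`, a `block` holding `b` and a `loop` holding `c`, and then `d`, this sketch yields `a, block, d, b, loop, c`, so siblings stay adjacent and nested bodies follow afterward.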
*	Dynamic Sized Bitvector Powerset Lattice for Static Analysis (#5784)	Bruce He	2023-06-29	1	-23/+35
\| \| \|	This change introduces a finite-height powerset lattice FinitePowersetLattice where each element's height is determined when constructed at runtime. This is implemented using a vector of `bool`s. This change also modifies BlockState and MonotoneCFGAnalyzer to be templated on a generic lattice type instead of using a hardcoded lattice. It additionally fixes an iterator bug in MonotoneCFGAnalyzer::fromCFG which assigned a temporary iterator object as predecessor and successor pointers to BlockStates instead of pointers to actual BlockState objects.
*	Liveness Analysis Proof of Concept (#5771)	Bruce He	2023-06-23	1	-0/+158
\| \| \|	This introduces a limited monotone flow-sensitive liveness analysis on local indices as an initial proof of concept for the creation of a monotone flow-sensitive static analysis framework. Tests are included in test/gtest/cfg.cpp.
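A monotone liveness analysis over local indices can be sketched as a backward fixed-point iteration. The `Block` struct and `liveIn` function are hypothetical and much simpler than the framework in the commit: each block is summarized by precomputed use and def lists rather than analyzed instruction by instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Summary of one basic block (an assumption for brevity).
struct Block {
  std::vector<std::size_t> uses; // locals read before any write in the block
  std::vector<std::size_t> defs; // locals written in the block
  std::vector<std::size_t> successors;
};

// Returns, for each block, the set of locals live on entry to that block.
std::vector<std::vector<bool>> liveIn(const std::vector<Block>& cfg,
                                      std::size_t numLocals) {
  std::vector<std::vector<bool>> in(cfg.size(),
                                    std::vector<bool>(numLocals, false));
  bool changed = true;
  while (changed) {
    changed = false;
    // Iterate in reverse because liveness flows backward.
    for (std::size_t i = cfg.size(); i-- > 0;) {
      // Live-out is the union (join) of the successors' live-in sets.
      std::vector<bool> state(numLocals, false);
      for (std::size_t succ : cfg[i].successors) {
        assert(succ < cfg.size());
        for (std::size_t l = 0; l < numLocals; ++l) {
          state[l] = state[l] || in[succ][l];
        }
      }
      // Transfer function: kill definitions, then add upward-exposed uses.
      for (std::size_t def : cfg[i].defs) {
        state[def] = false;
      }
      for (std::size_t use : cfg[i].uses) {
        state[use] = true;
      }
      if (state != in[i]) {
        in[i] = state;
        changed = true;
      }
    }
  }
  return in;
}
```

Because the sets only ever grow and the lattice is finite, the loop terminates. In a two-block CFG where block 0 writes local 0 and falls through to a self-looping block 1 that reads locals 0 and 1, only local 1 is live on entry to block 0.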
*	Add a gtest fixture for tests checking printed output (#5781)	Thomas Lively	2023-06-23	2	-8/+33
\| \| \| \| \|	Such tests all have to handle disabling and re-enabling color output for printed wast and have to handle parsing input wast, so add a test fixture that makes these tasks easier.
*	[analysis] Add a new iterable CFG utility (#5712)	Thomas Lively	2023-05-12	2	-0/+85
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a new "analysis" source directory that will contain the source for a new static program analysis framework. To start the framework, add a CFG utility that provides convenient iterators for iterating through the basic blocks of the CFG as well as the predecessors, successors, and contents of each block. The new CFGs are constructed using the existing CFGWalker, but they are different in that the new utility is meant to provide a usable representation of a CFG whereas CFGWalker is meant to allow collecting arbitrary information about each basic block in a CFG. For testing and debugging purposes, add `print` methods to CFGs and basic blocks. This requires exposing the ability to print expression contents excluding children, which was something we previously did only for StackIR. Also add a new gtest file with a test for constructing and printing a CFG. The test reveals some strange properties of the current CFG construction, including empty blocks and strange placement of `loop` instructions, but fixing these problems is left as future work.
*	Remove the ability to construct basic types in a TypeBuilder (#5678)	Thomas Lively	2023-04-19	1	-22/+0
\| \| \| \| \| \| \| \| \| \| \|	This capability was originally introduced to support calculating LUBs in the equirecursive type system, but has not been needed for anything except tests since the equirecursive type system was removed. Since building basic heap types is no longer useful and was a source of significant complexity, remove the APIs that allowed it and the tests that used those APIs. Also remove test/example/type-builder.cpp, since a significant portion of it tested the removed APIs and the rest is already better tested in test/gtest/type-builder.cpp.
*	Remove the nominal type system (#5672)	Thomas Lively	2023-04-17	3	-97/+26
\| \| \| \| \|	And since the only type system left is the standard isorecursive type system, remove `TypeSystem` and its associated APIs entirely. Delete a few tests that only made sense under the isorecursive type system.
*	[GUFA] Refine global types during flow (#5639)	Alon Zakai	2023-04-07	1	-4/+14
\| \| \| \| \| \| \| \| \| \| \| \|	Previously (ref.as_non_null (global.get ..)) would return the global with no changes, and if the global was nullable then the type didn't match the output, which hit an assertion (where GUFA checks that the contents match the declared type in the wasm). To fix this, refine global types, that is, the type we track on GlobalInfo may be more refined than the global itself. In the above example, if the global is nullable then the GlobalInfo would point to that global but have a non-nullable type. In fact the code was already prepared for this, and few changes were needed.
*	Add a TypeNameGenerator that uses names from a Module (#5437)	Thomas Lively	2023-01-18	1	-4/+42
\| \| \| \| \| \| \| \| \|	If the module does not have a name for a particular type, the new utility falls back to use a different user-configurable type name generator, just like the existing IndexedTypeNameGenerator does. Also change how heap types are printed by this printing machinery (which is currently only used for debugging) so that their names are printed in addition to their contents. This makes the printer much more useful for debugging.
*	Add Valmari-Lehtinen DFA minimization as a standalone utility (#5430)	Thomas Lively	2023-01-17	2	-0/+136
\| \| \| \| \| \| \| \|	We used to have this algorithm in wasm-type.cpp, where we used it to implement equirecursive type canonicalization, but we removed it when we removed equirecursive typing. Bring the algorithm back as a standalone utility for future use in optimization passes. In particular, it will be useful in TypeMerging for identifying the greatest fixed point of mergeable types rather than the smallest fixed point.
*	[Wasm GC] Replace `HeapType::data` with `HeapType::struct_` (#5416)	Thomas Lively	2023-01-10	2	-53/+54
\| \| \| \| \| \|	`struct` has replaced `data` in the upstream spec, so update Binaryen's types to match. We had already supported `struct` as an alias for data, but now remove support for `data` entirely. Also remove instructions like `ref.is_data` that are deprecated and do not make sense without a `data` type.
*	Replace more uses of `NAN` (#5354)	Thomas Lively	2022-12-15	1	-2/+2
\| \| \| \|	MSVC is making `NAN` negative, so use an explicitly constructed positive NaN instead.
*	Remove more uses of NAN (#5310)	Thomas Lively	2022-12-02	1	-8/+10
\| \| \| \| \|	In favor of the more portable code snippet using `std::copysign`. Also reintroduce assertions that the NaNs have the expected signs. This continues work started in #5302.
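A minimal sketch of the `std::copysign` approach follows. The helper names are hypothetical and the exact snippet used in the commit may differ; the point is only that the sign is set explicitly and then asserted, rather than inherited from the `NAN` macro.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Build a NaN whose sign bit is explicitly cleared.
inline double positiveNaN() {
  double nan = std::copysign(std::numeric_limits<double>::quiet_NaN(), 1.0);
  assert(std::isnan(nan) && !std::signbit(nan));
  return nan;
}

// Build a NaN whose sign bit is explicitly set.
inline double negativeNaN() {
  double nan = std::copysign(std::numeric_limits<double>::quiet_NaN(), -1.0);
  assert(std::isnan(nan) && std::signbit(nan));
  return nan;
}
```

Tests that depend on the sign of a NaN can then call the helper they need instead of using `NAN` or `-NAN` directly.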
*	Remove equirecursive typing (#5240)	Thomas Lively	2022-11-23	2	-14/+8
\| \| \| \|	Equirecursive is no longer standards track and its implementation is extremely complex. Remove it.
*	Change the default type system to isorecursive (#5239)	Thomas Lively	2022-11-23	1	-1/+122
\| \| \| \| \| \| \| \| \| \|	This makes Binaryen's default type system match the WasmGC spec. Update the way type definitions without supertypes are printed to reduce the output diff for MVP tests that do not involve WasmGC. Also port some type-builder.cpp tests from test/example to test/gtest since they needed to be rewritten to work with isorecursive types anyway. A follow-on PR will remove equirecursive types completely.
*	Fix isorecursive canonicalization (#5269)	Thomas Lively	2022-11-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes a longstanding problem with isorecursive canonicalization that only showed up in MacOS and occasionally Windows builds. The problem was that `RecGroupEquator` was not quite correct in the presence of self-references in rec groups. Specifically, `RecGroupEquator` did not differentiate between instances of the same type appearing across two rec groups where the type was a self-reference in one group but not in the other. The reason this only showed up occasionally on some platforms was that this bug could only cause incorrect behavior if two groups that would incorrectly be compared as equal were hashed into the same bucket of a hash map. Apparently the hash map used on Linux never hashes the two problematic groups into the same bucket.
*	[NFC] Rewrite PossibleContents::combine to be static (#5192)	Alon Zakai	2022-10-28	1	-4/+11
\| \| \| \| \|	This makes the logic symmetric and easier to read. Measuring speed, this seems identical to before, so that concern seems fine.
*	[Wasm GC] Fix the depth of the new array heap type (#5186)	Alon Zakai	2022-10-28	2	-1/+62
\|
*	[Wasm GC] Filter GUFA expression locations by their type (#5149)	Alon Zakai	2022-10-18	1	-2/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we have a cone type, we are able to represent in PossibleContents the natural content of a wasm location: a type or any of its subtypes. This allows us to enforce the wasm typing rules, that is, to filter the data arriving at a location by the wasm type of the location. Technically this could be unnecessary if we had full implementations of flowFoo and so forth, that is, tailored code for each wasm expression that makes sure we only contain and flow content that fits in the wasm type. Atm we don't have that, and until the wasm spec stabilizes it's probably not worth the effort. Instead, simply filter based on the type, which gives the same result (though it does take a little more work; I measured it at 3% or so of runtime). While doing so normalize cones to their actual maximum depth, which simplifies things and will help more later as well.
*	Implement `array` basic heap type (#5148)	Thomas Lively	2022-10-18	1	-2/+22
\| \| \| \| \| \| \| \| \|	`array` is the supertype of all defined array types and for now is a subtype of `data`. (Once `data` becomes `struct` this will no longer be true.) Update the binary and text parsing of `array.len` to ignore the obsolete type annotation and update the binary emitting to emit a zero in place of the old type annotation and the text printing to print an arbitrary heap type for the annotation. A follow-on PR will add support for the newer unannotated version of `array.len`.
*	Exhaustively test basic heap type relationships (#5147)	Thomas Lively	2022-10-17	1	-0/+199
\| \| \| \| \| \|	As the number of basic heap types has grown, the complexity of the subtype and LUB calculations has grown as well. To ensure that they are correct, test the complete matrix of basic types and trivial user-defined types. Fix the subtype calculation to make string types subtypes of `any` to make the test pass.
*	[Wasm GC] Add a getMaxDepths() helper for heap types (#5134)	Alon Zakai	2022-10-13	1	-0/+29
\| \| \| \| \| \|	This computes how deep the children of a heap type are. This will be useful in cone type optimizations, since we want to "normalize" cones: a cone of depth infinity can just be a cone of the actual maximum depth of existing children, etc., and it's simpler to have a single canonical representation to avoid extra work.
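The depth computation can be sketched on a toy hierarchy as follows. `DeclaredSupers` and `maxDepths` are hypothetical names, and types are plain strings rather than heap types; the sketch only shows the idea of recording, for each type, the distance to its deepest descendant.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>

// Maps each type to its declared supertype. Root types do not appear as keys.
using DeclaredSupers = std::map<std::string, std::string>;

// For each type, compute the depth of its deepest descendant (0 for a type
// with no subtypes).
std::map<std::string, int> maxDepths(const DeclaredSupers& supers) {
  std::map<std::string, int> depths;
  // Make sure every mentioned type has an entry, starting at depth 0.
  for (const auto& entry : supers) {
    depths[entry.first];
    depths[entry.second];
  }
  // Walk up from each type, telling every ancestor how far below it we are.
  for (const auto& entry : supers) {
    int distance = 0;
    auto it = supers.find(entry.first);
    while (it != supers.end()) {
      ++distance;
      assert(distance <= static_cast<int>(supers.size()) &&
             "cycle in supertype chain");
      const std::string& ancestor = it->second;
      depths[ancestor] = std::max(depths[ancestor], distance);
      it = supers.find(ancestor);
    }
  }
  return depths;
}
```

With such a table, a cone of unbounded depth rooted at a type can be normalized to a cone whose depth equals that type's entry.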