path: root/test/lit/passes
Commit message | Author | Age | Files | Lines
* [GC] Refinalize after selectify in RemoveUnusedBrs (#7104) | Alon Zakai | 2024-11-25 | 1 | -0/+29
  Replacing an if with a select may have refined the type. Without this fix, the sharper stale type checks complain.
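  A minimal sketch of the situation (the type $T and the locals here are illustrative, not taken from the test):

    (if (result (ref null $T))
      (local.get $cond)
      (then (local.get $a)) ;; (ref $T)
      (else (local.get $b)) ;; (ref $T)
    )
    ;; becomes a select whose type is the LUB of the arms, (ref $T), which is
    ;; more refined than the if's declared type, so parents must be refinalized:
    (select (result (ref $T))
      (local.get $a)
      (local.get $b)
      (local.get $cond)
    )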
* Propagate public visibility through all types (#7105) | Thomas Lively | 2024-11-21 | 1 | -0/+55
  Previously the classification of public types propagated public visibility only through types that had previously been collected by `collectHeapTypes`. Since there are settings that cause `collectHeapTypes` to collect fewer types, it was possible for public types to be missed if they were only public because they were reached from an uncollected type. Ensure that all public heap types are properly classified by propagating public visibility even through types that are not part of the collected output. Fixes #7103.
* Make validation of stale types stricter (#7097) | Thomas Lively | 2024-11-21 | 6 | -33/+100
  We previously allowed valid expressions to have stale types as long as those stale types were supertypes of the most precise possible types for the expressions. Allowing stale types like this could mask bugs where we failed to propagate precise type information, though. Make validation stricter by requiring all expressions except for control flow structures to have the most precise possible types. Control flow structures are exempt because many passes that can refine types wrap the refined expressions in blocks with the old type to avoid the need for refinalization. This pattern would be broken and we would need to refinalize more frequently without this exception for control flow structures.
  Now that all non-control flow expressions must have precise types, remove functionality relating to building select instructions with non-precise types. Since finalization of selects now always calculates a LUB rather than using a provided type, remove the type parameter from BinaryenSelect in the C and JS APIs.
  Now that stale types are no longer valid, fix a bug in TypeSSA where it failed to refinalize module-level code. This bug previously would not have caused problems on its own, but the stale types could cause problems for later runs of Unsubtyping. Now the stale types would cause TypeSSA output to fail validation. Also fix a bug where Builder::replaceWithIdenticalType was in fact replacing with refined types.
  Fixes #7087.
* Improve fuzzing of both closed and open world styles of modules (#7090) | Alon Zakai | 2024-11-19 | 1 | -0/+237
  Before, we would simply not export a function that had an e.g. anyref param. As a result, the modules were effectively "closed", which was good for testing full closed-world mode, but not for testing degrees of open world. To improve that, this PR allows the fuzzer to export such functions, and adds an "enclose world" pass that "closes" the wasm (makes it more compatible with closed-world). That pass is run 50% of the time, giving us coverage of both styles.
* Add nontrapping-fptoint lowering pass (#7016) | Derek Schuff | 2024-11-19 | 1 | -0/+208
  This pass lowers nontrapping FP-to-int instructions to implement LLVM's conversion behavior. This means that they are not fully complete lowerings according to the wasm spec, but have the same undefined behavior that LLVM does. This keeps the pass simpler and preserves existing behavior when compiling without nontrapping-fptoint. This will be used in emscripten, so that we can build libraries with nontrapping-fptoint and lower them away after link if desired.
* Use hints when generating fresh labels in IRBuilder (#7086) | Thomas Lively | 2024-11-18 | 1 | -2/+2
  IRBuilder often has to generate new label names for blocks and other scopes. Previously it would generate each new name by starting with "block" or "label" and incrementing a suffix until finding a fresh name, but this made name generation quadratic in the number of names to generate. To spend less time generating names, track a hint index at which to start looking for a fresh name and increment it every time a name is generated. This speeds up a version of the binary parser that uses IRBuilder by about 15%.
* Rename memory-copy-fill-lowering pass (#7082) | Derek Schuff | 2024-11-16 | 1 | -1/+1
  Since the resulting code has the same undefined behavior as LLVM, make the pass name reflect that.
* Use empty blocks instead of nops for empty scopes in IRBuilder (#7080) | Thomas Lively | 2024-11-14 | 74 | -285/+44
  When IRBuilder builds an empty non-block scope such as a function body, an if arm, a try block, etc., it needs to produce some expression to represent the empty contents. Previously it produced a nop, but change it to produce an empty block instead. The binary writer and printer have special logic to elide empty blocks, so this produces smaller output. Update J2CLOpts to recognize functions containing empty blocks as trivial to avoid regressing one of its tests.
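  A tiny illustration of the change (the function here is hypothetical):

    (func $nothing)  ;; a function with an empty body, as built by IRBuilder
    ;; before: the empty body was represented as (nop)
    ;; after:  it is represented as (block), which the binary writer and
    ;;         printer elide, giving smaller output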
* Update lit test output (#7077) | Thomas Lively | 2024-11-14 | 1 | -0/+23
  heap-store-optimization.wast had a test without its accompanying generated output.
* [SignExt] OptimizeInstructions: Remove signexts of already-extended values (#7072) | Alon Zakai | 2024-11-13 | 1 | -0/+367
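  An example of the kind of pattern this removes (the local is illustrative):

    (i32.extend8_s
      (i32.extend8_s
        (local.get $x)))
    ;; the inner extend already sign-extended the low 8 bits, so the outer
    ;; extend is a no-op and can be dropped:
    (i32.extend8_s
      (local.get $x))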
* Introduce pass to lower memory.copy and memory.fill (#7021) | Derek Schuff | 2024-11-13 | 1 | -0/+168
  This pass lowers away memory.copy and memory.fill operations. It generates a function that implements each of the instructions and replaces the instructions with calls to those functions. It does not handle other bulk memory operations (e.g. passive segments and table operations) because they are not needed for emscripten's use case here: enabling targeting of old browsers that don't support bulk memory.
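  A rough sketch of the rewrite (the helper's name is an assumption for illustration; the pass generates its own implementation of the copy loop):

    (memory.copy (local.get $dst) (local.get $src) (local.get $len))
    ;; becomes a call to a generated byte-copying function:
    (call $__memory_copy (local.get $dst) (local.get $src) (local.get $len))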
* HeapStoreOptimization: Fix a bug with jumping from the later value (v2) (#7070) | Alon Zakai | 2024-11-12 | 1 | -16/+797
  This PR fixes this situation:

    (block $out
      (local.set $x (struct.new X Y Z))
      (struct.set $X 0 (local.get $x) (..br $out..)) ;; X' here has a br
    )
    (local.get $x)

    =>

    (block $out
      (local.set $x (struct.new (..br $out..) Y Z))
    )
    (local.get $x)

  We want to fold the struct.set into the struct.new, but the br is a problem: if it executes then we skip the struct.set, and the last local.get in fact reads the struct before the write. And, if we did this optimization, we'd end up with the br on the struct.new, so it would skip that instruction and even the local.set. To fix this, we use the new API from #7039, which lets us query, "is it ok to move the local.set to where the struct.set is?"
* Fix PickLoadSigns on SignExt feature instructions (#7069) | Alon Zakai | 2024-11-11 | 1 | -0/+80
  I believe the history here is that:
  1. We added a PickLoadSigns pass. It checks if a load from memory is stored in a local that is only ever used in a signed or an unsigned manner. If it is, we can adjust the sign of the load (load8_u/s) to do the sign/unsign during the load.
  2. The pass finds each LocalGet and looks either 2 or 3 parents above it. For a sign operation, we need to look up 3, since the operation is x << K >> K. For an unsigned, we need only 2, since we have x & M. We hardcoded those numbers 2 and 3.
  3. We added the SignExt feature, which adds i32.extend8_s. This does a sign-extend with a single instruction, not two nested ones, so now we can sign-extend at depth 2, unlike before. Properties::getSignExtValue was updated for this, but not the pass PickLoadSigns.
  The bug that is fixed here is that we looked at depth 3 for a sign-extend, and we blindly accepted it if we found one. So we ended up accepting (i32.extend8_s (ANYTHING (x))), which is a sign-extend of something, but not of x, which is bad. We were also missing an optimization opportunity, as we didn't look for depth 2 sign-extends.
  This bug is quite old, from when Properties got SignExt support, in #3910. But the blame isn't there - to notice this then, we'd have had to check each caller of getSignExtValue throughout the codebase, which isn't reasonable. The fault is mine, from the first write-up of PickLoadSigns in 2017: the code should have been fully general, handling 2/3 and checking the output when it does so (adding == curr, that the sign/zero-extended value is the one we expect). That is what this PR does.
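  A sketch of the depth-2 case that was previously missed (names are illustrative):

    (local.set $x (i32.load8_u (local.get $ptr)))
    ;; ... the only use of $x is a single-instruction sign-extend:
    (i32.extend8_s (local.get $x))
    ;; since every use is signed, the sign can be picked during the load:
    (local.set $x (i32.load8_s (local.get $ptr)))
    ;; ... after which the extend of $x is redundant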
* [wasm64] Fix Directize on indexes > 32 bits (#7063) | Alon Zakai | 2024-11-07 | 1 | -0/+14
* [wasm64] Handle 64-bit overflow in optimizeMemoryAccess (#7057) | Alon Zakai | 2024-11-06 | 1 | -0/+31
  When we combine a load/store offset with a const, we must not overflow, as the semantics of offsets do not wrap.
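  A hypothetical wasm64 case where the fold must be skipped:

    (i64.load offset=0x10
      (i64.const 0xfffffffffffffff8))
    ;; folding the constant pointer into the offset would need
    ;; 0xfffffffffffffff8 + 0x10, which overflows 64 bits; since offsets do
    ;; not wrap, the combination must not happen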
* [GC] Fix ConstantFieldPropagation on incompatible types (#7054) | Alon Zakai | 2024-11-05 | 1 | -0/+116
  CFP is less precise than GUFA; in particular, when it flows around types it does not consider what field it is flowing them to, and its core data structure is "if a struct.get is done on this type's field, what can be read?". To see the issue this PR fixes, assume we have

        A
       / \
      B   C

  Then if we see a struct.set $C, we know that can be read by a struct.get $A (we can store a reference to a C in such a local/param/etc.), so we propagate the value of that set to A. And, in general, anything in A can appear in B (say, if we see a copy, a struct.set of a struct.get that operates on type A, then one of the sides might be a B), so we propagate from A to B. But now we have propagated something from C to B, which might be of an incompatible type.
  This cannot cause runtime issues, as it just means we are propagating more than we should, and will end up with less-useful results. But it can break validation if no other value is possible but one with an incompatible type, as we'd replace a struct.get $B with a value that only makes sense for C. (The qualifier "no other value is possible" was added in the previous sentence because if another one is possible then we'd end up with too many values to infer anything, and not optimize at all, avoiding any error.)
* [GC] Fix GlobalTypeOptimization logic for public types handling (#7051) | Alon Zakai | 2024-11-04 | 1 | -0/+324
  This fixes a regression from #7019. That PR fixed an error on situations with mixed public and private types, but it made us stop optimizing in valid cases, including cases with entirely private types. The specific regression was that we checked if we had an entry in the map of "can become immutable", and we thought that was enough. But we may have a private child type with a public parent, and still be able to optimize in the child if the field is not present in the parent. We also did not have exhaustive checking of all the states canBecomeImmutable can be in, so add those + testing.
* [GC] Fix handling of public types in TypeRefining (#7037) | Alon Zakai | 2024-10-29 | 1 | -0/+137
* [GC] RemoveUnusedBrs: Ensure refining of BrOnCast's castType does not unrefine the output (#7036) | Alon Zakai | 2024-10-29 | 1 | -13/+71
  Paradoxically, when a BrOn's castType is refined, its own type (the type it flows out) can get un-refined: making the castType non-nullable means nulls no longer flow on the branch, so they may flow out directly, making the BrOn nullable.
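  A sketch of the paradox (the type $T and label are illustrative):

    (br_on_cast $l anyref (ref null $T)
      (local.get $x))
    ;; with castType (ref null $T), nulls take the branch, so the value that
    ;; flows out is non-nullable; refining castType to (ref $T) means nulls
    ;; now fail the cast and flow out, so the BrOn's own type becomes nullable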
* Fix TypeMerging bug with indirectly reachable public types (#7031) | Thomas Lively | 2024-10-24 | 1 | -0/+19
  TypeMerging works by representing the type definition graph as a partitioned DFA and then refining the partitions to find mergeable types. #7023 was due to a bug where the DFA included edges from public types to their children, but did not necessarily include corresponding states for those children.
  One way to fix the bug would have been to traverse the type graph, finding all reachable public types and creating DFA states for them, but that might be expensive in cases where there are large graphs of public types. Instead, fix the problem by removing the edges from public types to their children entirely. Types reachable from public types are also public and therefore are not eligible to be merged, so these edges were never necessary for correctness. Fixes #7023.
* [GC] Fix assertion in GlobalTypeOptimization about public super (#7026) | Alon Zakai | 2024-10-22 | 1 | -0/+77
  We only checked for the case of the immediate super being public while we are private, but it might be a grandsuper instead. That is, any ancestor that is public will prevent GTO from removing a field (since we can only add fields on top of our ancestors). Also, the ancestors might not all have the field, which would add more complexity to that particular assertion, so just remove it, and add comprehensive tests.
* Remove closed world validation checks (#7019) | Alon Zakai | 2024-10-18 | 2 | -0/+88
  These were added to avoid common problems with closed world mode, but in practice they are causing more harm than good, forcing users to work around them. In the meantime (until #6965), remove this validation to unblock current toolchain makers.
  Fix GlobalTypeOptimization and AbstractTypeRefining on issues that this uncovers: without this validation, it is possible to run them on more wasm files than before, hence these were not previously detected. They are bundled in this PR because their tests cannot validate before this PR.
* [GC] Ignore public types in SignaturePruning (#7018) | Alon Zakai | 2024-10-18 | 1 | -14/+72
  Similar to #7017. As with that PR, this reduces some optimizations that were valid, as we tried to do something complex here and refine types in a public rec group when it seemed safe to do so, but our analysis was incomplete. The testcase here shows how another operation can end up causing a dependency that breaks things, if another type that uses one that we modify is public. To be safe, ignore all public types. In the future perhaps we can find a good way to handle "almost-private" types in public rec groups, in closed world.
* [GC] Ignore public types in SignatureRefining (#7022) | Alon Zakai | 2024-10-18 | 1 | -0/+40
  Similar to #7017 and #7018.
* [EH] Add TryTable to StripEH (#7020) | Alon Zakai | 2024-10-18 | 1 | -0/+35
* [GC] Ignore public types in GlobalTypeOptimization (#7017) | Alon Zakai | 2024-10-17 | 1 | -0/+90
  TypeUpdater, which it uses internally, already does so, but we must also ignore such types earlier and make no other modifications to them. Helps #7015.
* [EH][GC] Send a non-nullable exnref from TryTable (#7013) | Alon Zakai | 2024-10-17 | 2 | -71/+71
  When EH+GC are enabled, wasm has non-nullable types, and the sent exnref should be non-nullable. In BinaryenIR we use the non-nullable type all the time, which we also do for function references and other things; if GC is not enabled we lower it to a nullable type for the binary format (see `WasmBinaryWriter::writeType`, to which comments were added in this PR). That is, this PR makes us handle exnref the same as those other types. A new test verifies that behavior.
  Various existing tests are updated because ReFinalize will now use the more refined type, so this is an optimization. It is also a bugfix as in #6987 we started to emit the refined form in the fuzzer, and this PR makes us handle it properly in validation and ReFinalization.
* [EH][GC] Add missing subtyping constraints from TryTable (#7012) | Alon Zakai | 2024-10-16 | 1 | -0/+36
  Similar to Break, BrOn, etc., we must apply subtyping constraints of the types we send to blocks, so that Unsubtyping will not remove subtypings that are actually needed.
* GlobalRefining: Do not refine mutable exported globals (#7007) | Alon Zakai | 2024-10-15 | 1 | -25/+31
  A mutable exported global might be shared with another module, which could write to it using the current type. Refining would then be unsafe and is not allowed by the type system, so do not refine there.
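  A small example of why this is unsafe (names are illustrative):

    (global $g (export "g") (mut funcref) (ref.func $f))
    ;; another module importing "g" may write a null funcref through the
    ;; original (mut funcref) type, so we must not refine $g to (ref func)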
* [Strings] StringGathering: Handle uses of strings before their definitions (#7008) | Alon Zakai | 2024-10-15 | 1 | -11/+76
  When we gather strings, we create a new global for each one, which is then the canonical defining global for it and is used everywhere else. We create such a global if we lack one, but if we happen to have such a global - a global that simply defines a string - then we reuse it. But we didn't handle the case where there was a use before the definition, and failed to sort the definition before the use.
* [WasmGC] OptimizeInstructions: Cancel out internalize+externalize pairs (#7005) | Alon Zakai | 2024-10-14 | 1 | -0/+26
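  The pattern being cancelled, in brief (the local is illustrative):

    (any.convert_extern
      (extern.convert_any
        (local.get $x)))
    ;; externalizing and then internalizing yields the original value, so
    ;; this can be replaced by just
    (local.get $x)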
* [Wasm EH] Optimize away _ref from try_table catches when unused (#6996) | Alon Zakai | 2024-10-14 | 1 | -0/+215
  If we have

    (drop
      (block $b (result exnref)
        (try_table (catch_all_ref $b)

  then we don't really need to send the ref: it is dropped, so we can just replace catch_all_ref with catch_all and then remove the drop and the block value. MergeBlocks already had logic to remove block values, so it is the natural place to add this.
* [WasmGC] OptimizeInstructions: Reorder externalize/internalize operations with ref.as_non_null (#7004) | Alon Zakai | 2024-10-14 | 1 | -5/+34
    (any.convert_extern/extern.convert_any (ref.as_non_null ..))
      =>
    (ref.as_non_null (any.convert_extern/extern.convert_any ..))

  This then allows the RefAsNonNull to be combined with parents in some cases (whereas the reverse allows nothing).
* [Wasm EH] Optimize values flowing out of TryTable (#6997) | Alon Zakai | 2024-10-10 | 1 | -21/+47
  This allows

    (block $out (result i32)
      (try_table (catch..)
        ..
        (br $out
          (i32.const 42)
        )
      )
    )

    =>

    (block $out (result i32)
      (try_table (result i32) (catch..) ;; add a result
        ..
        (i32.const 42)                  ;; remove the br around the value
      )
    )
* ReFinalize in MergeBlocks so we can optimize unreachable instructions too (#6994) | Alon Zakai | 2024-10-10 | 2 | -11/+5
  In #6984 we optimized dropped blocks even if they had unreachable code. In #6988 that part was reverted, and blocks with unreachable code were ignored once more. However, I realized that the check was not actually for unreachable code, but for having an unreachable child, so it would miss things like this:

    (block
      (block
        ..
        (br $somewhere) ;; unreachable type, but no unreachable code
      )
    )

  But it is useful to merge such blocks: we don't need the inner block here. To fix this, just run ReFinalize if we change anything, which will propagate unreachability as needed. I think MergeBlocks was written before we had that utility, so it didn't use it...
  This is not only useful for itself but will unblock an EH optimization in a later PR that has code in this form. It also simplifies the code by removing the hasUnreachableChild checks.
* Fix BranchUtils::operateOnScopeNameUsesAndSentValues() on BrOn (#6995) | Alon Zakai | 2024-10-10 | 1 | -9/+54
  BrOn does not always send a value. This is an odd asymmetry in the wasm spec, where br_on_null does not send the null on the branch (which makes sense, but the asymmetry does mean we need to special-case it).
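  An example of the asymmetry (label and local are illustrative):

    (block $l
      (drop
        (br_on_null $l      ;; branches to $l but does NOT send the null
          (local.get $x))))
    ;; by contrast, br_on_cast does send the cast value to its target, so the
    ;; utility must distinguish the two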
* Fix flow reset during throw => break opts in RemoveUnusedBrs (#6993) | Alon Zakai | 2024-10-08 | 1 | -0/+40
  #6980 was missing the logic to reset flows after replacing a throw. The process of replacing the throw introduces new code, and in particular a drop, which blocks branches from flowing to their targets. In the testcase here, the br was turned into a nop before this fix.
* Fix a misoptimization with mixed Try/TryTable in RemoveUnusedBrs (#6991) | Alon Zakai | 2024-10-07 | 1 | -0/+34
  We ignored legacy Trys in #6980, but they can also catch.
* Fix a fuzz issue with #6984 (#6988) | Alon Zakai | 2024-10-07 | 2 | -5/+11
  When I refactored the optimizeDroppedBlock logic in #6982, I didn't move the unreachability check with that code, which was wrong. When that function was called from another place in #6984, the fuzzer found an issue. Diff without whitespace is smaller.
  This reverts almost all the test updates from #6984 - those changes were on blocks with unreachable children. The change was safe on them, but in general removing a block value in the presence of unreachable code is tricky, so it's best to avoid it. The testcase is a little bizarre, but it's the one the fuzzer found and I can't find a way to generate a better one (other than to reduce it, which I did).
* MergeBlocks: Optimize all dropped blocks (#6984) | Alon Zakai | 2024-10-04 | 3 | -12/+34
  Just call optimizeDroppedBlock from visitDrop to handle that. Followup to #6982. This optimizes the new testcase added there. Some older tests also improve.
* RemoveUnusedBrs: Generalize jump threading optimizations to all branches (#6983) | Alon Zakai | 2024-10-04 | 3 | -1/+81
  This change is NFC on all things we previously optimized, but also makes us optimize TryTable, BrOn, etc., by replacing hard-coded logic for Break with generic code. Also simplify the code there a little - we didn't really need ControlFlowWalker.
* [NFC] Refactor out the dropped-block optimization code in MergeBlocks (#6982) | Alon Zakai | 2024-10-03 | 1 | -3/+24
  This just moves the code out into a function. A later PR will use it in another place. Add a test that shows the motivation for that later PR: we fail to optimize away a block return value at the top level of a function. Fixing that will involve calling the new function here from another place.
* [Wasm EH] Optimize throws caught by TryTable into breaks (#6980) | Alon Zakai | 2024-10-03 | 2 | -2/+328
  E.g.

    (try_table (catch_all $catch)
      (throw $e)
    )
    =>
    (try_table (catch_all $catch)
      (br $catch)
    )

  This can then allow other passes to remove the TryTable, if no throwing things remain.
* Source Maps: Support 5 segment mappings (#6795) | Ömer Sinan Ağacan | 2024-10-01 | 1 | -0/+36
  Support 5-segment source mappings, which add a name. Reference: https://github.com/tc39/source-map/blob/main/source-map-rev3.md#proposed-format
* Fix the type of reused RefFunc in Precompute (#6976) | Alon Zakai | 2024-09-30 | 1 | -3/+32
  When we precompute something, we try to avoid allocating a new copy. That's important to avoid many allocations each time we run Precompute - otherwise, each time we see a br we'd allocate a fresh one, and for its values. But we had a bug where we reused a RefFunc as the value of a br without updating the type. It's actually tricky to reach a situation where we find a RefFunc to reuse and it is different from the actual one we want, but the fuzzer found one. Fixes the fuzz bug reported on #6845 (but unrelated to that PR).
* [NFC-ish] Stop creating unneeded blocks around calls when inlining (#6969) | Alon Zakai | 2024-09-26 | 9 | -1760/+1430
  Inlining was careful about nested calls like this:

    (call $a
      (call $b)
    )

  If we inlined the outer call first, we'd have

    (block $inlined-code-from-a
      ..code..
      (call $b)
    )

  After that, the inner call is a child of a block, not of a call. That is, we've moved the inner call to another parent. To replace that inner call when we inline, we'd need to update the new parent, which would require work. To avoid that work, the pass simply created a block in the middle:

    (call $a
      (block
        (call $b)
      )
    )

  Now the inner call's immediate parent will not change when we inline the outer call.
  However, it turns out that this was entirely unnecessary. We find the calls using a post-order traversal, and we store the actions in a vector that we traverse in order, so we only ever process things in the optimal order of children before parents. And in that order there is no problem: inlining the inner call first leads to

    (call $a
      (block $inlined-code-from-b
        (..code..)
      )
    )

  That does not affect the outer call's parent.
  This PR removes the creation of the unnecessary blocks. This doesn't improve the final output as optimizations remove the unneeded blocks later anyhow, but it does make the code simpler and a little faster. It also makes debugging less confusing. But this is not truly NFC because --inlining (but not --inlining-optimizing) will actually emit fewer blocks now (but only --inlining-optimizing is used by default in production).
  The diff on tests here is very small when ignoring whitespace. The remaining differences are just emitting fewer obviously-unneeded blocks. There is also one test that needed manual changes, inlining-eh-legacy, because it tested that we do Pop fixups, but after emitting one fewer block, those fixups were not needed. I added a new test there with two nested calls, which does end up needing those fixups. I also added such a test in inlining_all-features so that we have coverage for such nested calls (we might remove the eh-legacy file some day, and other existing tests with nested calls that I found were more complex).
* [NFC-ish] Avoid repeated ReFinalize etc. when inlining (#6967) | Alon Zakai | 2024-09-24 | 2 | -25/+24
  We may inline multiple times into a single function. Previously, if we did so, we did the "fixups" such as ReFinalize and non-nullable local fixes once per such inlining. But that is wasteful as each ReFinalize etc. scans the whole function, and could be done after we copy all the code from all the inlinings, which is what this PR does: it splits doInlining() into one function that inlines code and one that does the updates after, and the update is done after all inlinings.
  This turns out to be very important, a 5x speedup on two large real-world wasm files I am looking at. The reason is that we actually inline more than once in half the cases, and sometimes far more - in one case we inline over 1,000 times into a function! (and ReFinalized 1,000 times too many)
  This is practically NFC, but it turns out that there are some tiny noticeable differences between running ReFinalize once at the end vs. once after each inlining. These differences are not really functional or observable in the behavior of the code, and optimizations would remove them anyhow, but they are noticeable in two tests here. The changes to tests are, in order:
  * Different block names, just because the counter we use sees more things.
  * In a testcase with unreachable code, we inline twice into a function, and the first inlining brings in an unreachable, and ReFinalizing early will lead to it propagating differently than if we wait to ReFinalize. (It actually leads to another cycle of inlining in that case, as a fluke.)
* Improve types for null accesses and remove hacks (#6954) | Thomas Lively | 2024-09-18 | 1 | -1/+1
  When a struct.get or array.get is optimized to have a null reference operand, its return type loses meaning since the operation will always trap. Previously when refinalizing such expressions, we just left their return type unchanged since there was no longer an associated struct or array type to calculate it from. However, this could lead to a strange setup where the stale return type was the last remaining use of some heap type in the module. That heap type would never be emitted in the binary, but it was still used in the IR, so type optimizations would have to keep updating it. Our type collecting logic went out of its way to include the return types of struct.get and array.get expressions to account for this strange possibility, even though it otherwise collected only types that would appear in binaries. In principle, all of this should have applied to `call_ref` as well, but the type collection logic did not have the necessary special case, so there was probably a latent bug there.
  Get rid of these special cases in the type collection logic and make it impossible for the IR to use a stale type that no longer appears in the binary by updating such stale types during finalization. One possibility would have been to make the return types of null accessors unreachable, but this violates the usual invariant that unreachable instructions must either have unreachable children or be branches or `(unreachable)`. Instead, refine the return types to be uninhabitable non-nullable references to bottom, which is nearly as good as refining them directly to unreachable. We can consider refining them to `unreachable` in the future, but another problem with that is that it would currently allow the parsers to admit more invalid modules with arbitrary junk after null accessor instructions.
* Fix selects of packed fields in GlobalStructOptimization (#6947) | Alon Zakai | 2024-09-17 | 1 | -0/+49
  We emit a select between two objects when only two objects exist of a particular type. However, if the field is packed, we did not handle truncating the written values.
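  A rough sketch, assuming a type $T whose only instances live in two globals and whose field 0 is a packed i8 (all names and constants here are illustrative):

    (global $a (ref $T) (struct.new $T (i32.const 0x180)))
    (global $b (ref $T) (struct.new $T (i32.const 0x40)))
    ;; a (struct.get_u $T 0 ..) can become a select between the two stored
    ;; values, but the written values must first be truncated to 8 bits
    ;; (0x180 -> 0x80), which is the step that was missing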
* Require string-style identifiers to be UTF-8 (#6941) | Thomas Lively | 2024-09-16 | 3 | -32/+38
  In the WebAssembly text format, strings can generally be arbitrary bytes, but identifiers must be valid UTF-8. Check for UTF-8 validity when parsing string-style identifiers in the lexer. Update StringLowering to generate valid UTF-8 global names even for strings that may not be valid UTF-8, and test that text round tripping works correctly after StringLowering. Fixes #6937.