author | Thomas Lively <tlively@google.com> | 2022-10-11 11:16:14 -0500 |
---|---|---|
committer | GitHub <noreply@github.com> | 2022-10-11 16:16:14 +0000 |
commit | b83450ed1fd98cec4453024f57f892b31851ea50 (patch) | |
tree | bf0467d96c9966d0f4699ea0afcdf25905b4098c /src/support/name.h | |
parent | 6d4ac3162c290e32a98de349d49e26e904a40414 (diff) | |
Make `Name` a pointer, length pair (#5122)
With the goal of supporting null characters (i.e. zero bytes) in strings,
rewrite the underlying interned `IString` to store a `std::string_view` rather
than a `const char*`, reduce the number of map lookups necessary to intern a
string, and present a more immutable interface.
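To illustrate what interning by `std::string_view` means, here is a minimal sketch of an interner keyed on pointer, length views. It is not Binaryen's implementation: the `Interner` class and its members are hypothetical names, and it makes no attempt to minimize lookups the way the commit describes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <string_view>
#include <unordered_set>
#include <vector>

// Hypothetical interner: returns one canonical view per distinct byte
// sequence. Embedded zero bytes are ordinary content here, because
// hashing and equality use the view's length rather than a terminator.
class Interner {
  // Canonical views; each one points into a string owned by `storage`.
  std::unordered_set<std::string_view> table;
  // Heap-allocated strings, so the views stay valid as the vector grows.
  std::vector<std::unique_ptr<std::string>> storage;

public:
  std::string_view intern(std::string_view s) {
    auto it = table.find(s);
    if (it != table.end()) {
      return *it; // Already interned: reuse the canonical storage.
    }
    storage.push_back(std::make_unique<std::string>(s));
    std::string_view owned = *storage.back();
    table.insert(owned);
    return owned;
  }
};
```

Two equal byte sequences intern to the same `data()` pointer, so interned values can be compared by pointer and length alone.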
Most importantly, replace the `c_str()` method that returned a `const char*`
with a `toString()` method that returns a `std::string`. This new method can
correctly handle strings containing null characters. A `const char*` can still
be had by calling `data()` on the `std::string_view`, although this usage should
be discouraged.
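The difference between the two accessors can be sketched as follows. `NameView`, `cStringLength`, and `makeExample` are hypothetical stand-ins written for this illustration, not Binaryen types; only `toString()` and `data()` are named in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical stand-in for an interned name: a pointer, length pair.
struct NameView {
  std::string_view str;
  // Copies every byte in the view, including embedded zero bytes.
  std::string toString() const { return std::string(str); }
};

// Length as a C-string consumer would see it: stops at the first zero byte.
// Assumes the underlying buffer is null-terminated, as it is in makeExample.
inline std::size_t cStringLength(const NameView& n) {
  return std::strlen(n.str.data());
}

// A three-byte name 'a', 0, 'b' backed by a static, null-terminated array.
inline NameView makeExample() {
  static const char bytes[] = "a\0b";
  return NameView{std::string_view(bytes, 3)};
}
```

For the value returned by `makeExample()`, `toString()` yields all three bytes, while anything that treats `data()` as a C string sees only the first one. This is why using `data()` as a `const char*` is discouraged.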
This change is NFC in spirit, although not in practice. It does not intend to
support any particular new functionality, but it is probably now possible to use
strings containing null characters in at least some cases. At least one parser
bug is also incidentally fixed. Follow-on PRs will explicitly support and test
strings containing nulls for particular use cases.
The C API still uses `const char*` to represent strings. As strings containing
nulls become better supported by the rest of Binaryen, this will no longer be
sufficient. Updating the C and JS APIs to use pointer, length pairs is left as
future work.
Diffstat (limited to 'src/support/name.h')
-rw-r--r-- | src/support/name.h | 30 |
1 files changed, 17 insertions, 13 deletions
diff --git a/src/support/name.h b/src/support/name.h
index 615740e09..a22461d5d 100644
--- a/src/support/name.h
+++ b/src/support/name.h
@@ -17,9 +17,7 @@
 #ifndef wasm_support_name_h
 #define wasm_support_name_h
 
-#include <string>
-
-#include "emscripten-optimizer/istring.h"
+#include "support/istring.h"
 
 namespace wasm {
 
@@ -33,14 +31,19 @@ namespace wasm {
 // TODO: as an optimization, IString values < some threshold could be considered
 // numerical indices directly.
 
-struct Name : public cashew::IString {
-  Name() : cashew::IString() {}
-  Name(const char* str) : cashew::IString(str, false) {}
-  Name(cashew::IString str) : cashew::IString(str) {}
-  Name(const std::string& str) : cashew::IString(str.c_str(), false) {}
+struct Name : public IString {
+  Name() : IString() {}
+  Name(std::string_view str) : IString(str, false) {}
+  Name(const char* str) : IString(str, false) {}
+  Name(IString str) : IString(str) {}
+  Name(const std::string& str) : IString(str) {}
+
+  // String literals do not need to be copied. Note: Not safe to construct from
+  // temporary char arrays! Take their address first.
+  template<size_t N> Name(const char (&str)[N]) : IString(str) {}
 
   friend std::ostream& operator<<(std::ostream& o, Name name) {
-    if (name.str) {
+    if (name) {
       return o << name.str;
     } else {
       return o << "(null Name)";
@@ -48,11 +51,12 @@ struct Name : public cashew::IString {
   }
 
   static Name fromInt(size_t i) {
-    return cashew::IString(std::to_string(i).c_str(), false);
+    return IString(std::to_string(i).c_str(), false);
   }
 
-  bool hasSubstring(cashew::IString substring) {
-    return strstr(c_str(), substring.c_str()) != nullptr;
+  bool hasSubstring(IString substring) {
+    // TODO: Use C++23 `contains`.
+    return str.find(substring.str) != std::string_view::npos;
   }
 };
 
@@ -60,7 +64,7 @@ struct Name : public cashew::IString {
 
 namespace std {
 
-template<> struct hash<wasm::Name> : hash<wasm::Name> : hash<cashew::IString> {};
+template<> struct hash<wasm::Name> : hash<wasm::IString> {};
 
 } // namespace std
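The `hasSubstring` change also alters behavior for names with embedded zero bytes, which a standalone sketch can show. The free functions `hasSubstringC` and `hasSubstringView` are hypothetical helpers that mirror the old and new method bodies; they are not part of Binaryen.

```cpp
#include <cassert>
#include <cstring>
#include <string_view>

// Mirrors the old body: strstr stops scanning at the first zero byte.
inline bool hasSubstringC(const char* haystack, const char* needle) {
  return std::strstr(haystack, needle) != nullptr;
}

// Mirrors the new body: find searches the full length of the view.
inline bool hasSubstringView(std::string_view haystack,
                             std::string_view needle) {
  return haystack.find(needle) != std::string_view::npos;
}

// A five-byte haystack 'a', 'b', 0, 'c', 'd' used to compare the two.
inline const char* exampleBytes() {
  static const char bytes[] = "ab\0cd";
  return bytes;
}
```

Searching the five-byte haystack for `"cd"`, the `strstr` version sees only `"ab"` and reports no match, while the `std::string_view` version finds the match at offset 3.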