gcc/libstdc++-v3/testsuite/ext/unicode/view.cc

140 lines
4.3 KiB
C++
Raw Permalink Normal View History

libstdc++: Add Unicode-aware width estimation for std::format This implements the requirements in the following proposals, which dictate how std::format deals with non-ASCII strings: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1868r1.html https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2572r1.html https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2675r1.pdf There are two parts to this. The width estimation for strings must only count the width of the first character in an extended grapheme cluster. That requires implementing the algorithm for detecting cluster breaks, which requires a number of lookup tables of the grapheme cluster break properties (and Indic_Conjunct_Break and Extended_Pictographic properties) of every code point. Additionally, some characters have a field width of 2, which requires another lookup table of field widths for every code point. The tables added in this commit do not contain entries for every code point from 0 to 0x10FFFF as that would be very inefficient and use too much memory. Instead the tables only contain the code points that form an "edge" for a property, omitting all the code points that have the same property as the preceding one. We can use a binary search to find the closest code point in the table that is not greater than the one we're looking for. The tables are generated by a new Python script added to the contrib/unicode directory, and a new data file downloaded from the Unicode Consortium website. The rules for extended grapheme cluster breaking are implemented for the latest Unicode standard, version 15.1.0. libstdc++-v3/ChangeLog: * include/Makefile.am: Add new headers. * include/Makefile.in: Regenerate. * include/bits/unicode.h: New file. * include/bits/unicode-data.h: New file. * include/std/format: Include <bits/unicode.h>. (__literal_encoding_is_utf8): Move to <bits/unicode.h>. (_Spec::_M_fill): Change type to char32_t. (_Spec::_M_parse_fill_and_align): Read a Unicode scalar value instead of a single character. (__write_padded): Change __fill_char parameter to char32_t and encode it into the output. (__formatter_str::format): Use new __unicode::__field_width and __unicode::__truncate functions. * include/std/ostream: Adjust namespace qualification for __literal_encoding_is_utf8. * include/std/print: Likewise. * src/c++23/print.cc: Add [[unlikely]] attribute to error path. * testsuite/ext/unicode/view.cc: New test. * testsuite/std/format/functions/format.cc: Add missing examples from the standard demonstrating alignment with non-ASCII characters. Add examples checking correct handling of extended grapheme clusters. contrib/ChangeLog: * unicode/README: Add notes about generating libstdc++ tables. * unicode/GraphemeBreakProperty.txt: New file. * unicode/emoji-data.txt: New file. * unicode/gen_libstdcxx_unicode_data.py: New file.
2023-12-16 23:30:20 +00:00
// { dg-do run { target c++20 } }
#include <format>
#include <string_view>
#include <ranges>
#include <testsuite_hooks.h>
namespace uc = std::__unicode;
using namespace std::string_view_literals;
constexpr void
test_utf8_to_utf8()
{
const auto s8 = u8"£🇬🇧 €🇪🇺 æбçδé ♠♥♦♣ 🤡"sv;
libstdc++: Add Unicode-aware width estimation for std::format This implements the requirements in the following proposals, which dictate how std::format deals with non-ASCII strings: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1868r1.html https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2572r1.html https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2675r1.pdf There are two parts to this. The width estimation for strings must only count the width of the first character in an extended grapheme cluster. That requires implementing the algorithm for detecting cluster breaks, which requires a number of lookup tables of the grapheme cluster break properties (and Indic_Conjunct_Break and Extended_Pictographic properties) of every code point. Additionally, some characters have a field width of 2, which requires another lookup table of field widths for every code point. The tables added in this commit do not contain entries for every code point from 0 to 0x10FFFF as that would be very inefficient and use too much memory. Instead the tables only contain the code points that form an "edge" for a property, omitting all the code points that have the same property as the preceding one. We can use a binary search to find the closest code point in the table that is not greater than the one we're looking for. The tables are generated by a new Python script added to the contrib/unicode directory, and a new data file downloaded from the Unicode Consortium website. The rules for extended grapheme cluster breaking are implemented for the latest Unicode standard, version 15.1.0. libstdc++-v3/ChangeLog: * include/Makefile.am: Add new headers. * include/Makefile.in: Regenerate. * include/bits/unicode.h: New file. * include/bits/unicode-data.h: New file. * include/std/format: Include <bits/unicode.h>. (__literal_encoding_is_utf8): Move to <bits/unicode.h>. (_Spec::_M_fill): Change type to char32_t. (_Spec::_M_parse_fill_and_align): Read a Unicode scalar value instead of a single character. (__write_padded): Change __fill_char parameter to char32_t and encode it into the output. (__formatter_str::format): Use new __unicode::__field_width and __unicode::__truncate functions. * include/std/ostream: Adjust namespace qualification for __literal_encoding_is_utf8. * include/std/print: Likewise. * src/c++23/print.cc: Add [[unlikely]] attribute to error path. * testsuite/ext/unicode/view.cc: New test. * testsuite/std/format/functions/format.cc: Add missing examples from the standard demonstrating alignment with non-ASCII characters. Add examples checking correct handling of extended grapheme clusters. contrib/ChangeLog: * unicode/README: Add notes about generating libstdc++ tables. * unicode/GraphemeBreakProperty.txt: New file. * unicode/emoji-data.txt: New file. * unicode/gen_libstdcxx_unicode_data.py: New file.
2023-12-16 23:30:20 +00:00
uc::_Utf8_view v(s8);
VERIFY( std::ranges::distance(v) == s8.size() );
VERIFY( std::ranges::equal(v, s8) );
}
constexpr void
test_utf8_to_utf16()
{
const auto s8 = u8"£🇬🇧 €🇪🇺 æбçδé ♠♥♦♣ 🤡"sv;
libstdc++: Add Unicode-aware width estimation for std::format This implements the requirements in the following proposals, which dictate how std::format deals with non-ASCII strings: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1868r1.html https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2572r1.html https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2675r1.pdf There are two parts to this. The width estimation for strings must only count the width of the first character in an extended grapheme cluster. That requires implementing the algorithm for detecting cluster breaks, which requires a number of lookup tables of the grapheme cluster break properties (and Indic_Conjunct_Break and Extended_Pictographic properties) of every code point. Additionally, some characters have a field width of 2, which requires another lookup table of field widths for every code point. The tables added in this commit do not contain entries for every code point from 0 to 0x10FFFF as that would be very inefficient and use too much memory. Instead the tables only contain the code points that form an "edge" for a property, omitting all the code points that have the same property as the preceding one. We can use a binary search to find the closest code point in the table that is not greater than the one we're looking for. The tables are generated by a new Python script added to the contrib/unicode directory, and a new data file downloaded from the Unicode Consortium website. The rules for extended grapheme cluster breaking are implemented for the latest Unicode standard, version 15.1.0. libstdc++-v3/ChangeLog: * include/Makefile.am: Add new headers. * include/Makefile.in: Regenerate. * include/bits/unicode.h: New file. * include/bits/unicode-data.h: New file. * include/std/format: Include <bits/unicode.h>. (__literal_encoding_is_utf8): Move to <bits/unicode.h>. (_Spec::_M_fill): Change type to char32_t. (_Spec::_M_parse_fill_and_align): Read a Unicode scalar value instead of a single character. (__write_padded): Change __fill_char parameter to char32_t and encode it into the output. (__formatter_str::format): Use new __unicode::__field_width and __unicode::__truncate functions. * include/std/ostream: Adjust namespace qualification for __literal_encoding_is_utf8. * include/std/print: Likewise. * src/c++23/print.cc: Add [[unlikely]] attribute to error path. * testsuite/ext/unicode/view.cc: New test. * testsuite/std/format/functions/format.cc: Add missing examples from the standard demonstrating alignment with non-ASCII characters. Add examples checking correct handling of extended grapheme clusters. contrib/ChangeLog: * unicode/README: Add notes about generating libstdc++ tables. * unicode/GraphemeBreakProperty.txt: New file. * unicode/emoji-data.txt: New file. * unicode/gen_libstdcxx_unicode_data.py: New file.
2023-12-16 23:30:20 +00:00
const std::u16string_view s16 = u"£🇬🇧 €🇪🇺 æбçδé ♠♥♦♣ 🤡";
uc::_Utf16_view v(s8);
VERIFY( std::ranges::distance(v) == s16.size() );
VERIFY( std::ranges::equal(v, s16) );
}
constexpr void
test_utf8_to_utf32()
{
const auto s8 = u8"£🇬🇧 €🇪🇺 æбçδé ♠♥♦♣ 🤡"sv;
const auto s32 = U"£🇬🇧 €🇪🇺 æбçδé ♠♥♦♣ 🤡"sv;
uc::_Utf32_view v(s8);
VERIFY( std::ranges::distance(v) == s32.size() );
VERIFY( std::ranges::equal(v, s32) );
}
constexpr void
test_illformed_utf8()
{
uc::_Utf32_view v("\xa3 10.99 \xee \xdd"sv);
VERIFY( std::ranges::equal(v, U"\uFFFD 10.99 \uFFFD \uFFFD"sv) );
uc::_Utf16_view v2(" \xf8\x80\x80\x80 "sv);
VERIFY( std::ranges::distance(v2) == 6 );
VERIFY( std::ranges::equal(v2, U" \uFFFD\uFFFD\uFFFD\uFFFD "sv) );
// Examples of U+FFFD substitution from Unicode standard.
uc::_Utf8_view v3("\xc0\xaf\xe0\x80\xbf\xf0\x81\x82\x41"sv); // Table 3-8
VERIFY( std::ranges::equal(v3, u8"\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\x41"sv) );
uc::_Utf8_view v4("\xed\xa0\x80\xed\xbf\xbf\xed\xaf\x41"sv); // Table 3-9
VERIFY( std::ranges::equal(v4, u8"\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\x41"sv) );
uc::_Utf8_view v5("\xf4\x91\x92\x93\xff\x41\x80\xbf\x42"sv); // Table 3-10
VERIFY( std::ranges::equal(v5, u8"\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\x41\uFFFD\uFFFD\x42"sv) );
uc::_Utf8_view v6("\xe1\x80\xe2\xf0\x91\x92\xf1\xbf\x41"sv); // Table 3-11
VERIFY( std::ranges::equal(v6, u8"\uFFFD\uFFFD\uFFFD\uFFFD\x41"sv) );
uc::_Utf32_view v7("\xe1\x80"sv);
VERIFY( std::ranges::equal(v7, U"\uFFFD"sv) );
uc::_Utf32_view v8("\xf1\x80"sv);
VERIFY( std::ranges::equal(v8, U"\uFFFD"sv) );
uc::_Utf32_view v9("\xf1\x80\x80"sv);
VERIFY( std::ranges::equal(v9, U"\uFFFD"sv) );
libstdc++: Add Unicode-aware width estimation for std::format This implements the requirements in the following proposals, which dictate how std::format deals with non-ASCII strings: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1868r1.html https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2572r1.html https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2675r1.pdf There are two parts to this. The width estimation for strings must only count the width of the first character in an extended grapheme cluster. That requires implementing the algorithm for detecting cluster breaks, which requires a number of lookup tables of the grapheme cluster break properties (and Indic_Conjunct_Break and Extended_Pictographic properties) of every code point. Additionally, some characters have a field width of 2, which requires another lookup table of field widths for every code point. The tables added in this commit do not contain entries for every code point from 0 to 0x10FFFF as that would be very inefficient and use too much memory. Instead the tables only contain the code points that form an "edge" for a property, omitting all the code points that have the same property as the preceding one. We can use a binary search to find the closest code point in the table that is not greater than the one we're looking for. The tables are generated by a new Python script added to the contrib/unicode directory, and a new data file downloaded from the Unicode Consortium website. The rules for extended grapheme cluster breaking are implemented for the latest Unicode standard, version 15.1.0. libstdc++-v3/ChangeLog: * include/Makefile.am: Add new headers. * include/Makefile.in: Regenerate. * include/bits/unicode.h: New file. * include/bits/unicode-data.h: New file. * include/std/format: Include <bits/unicode.h>. (__literal_encoding_is_utf8): Move to <bits/unicode.h>. (_Spec::_M_fill): Change type to char32_t. (_Spec::_M_parse_fill_and_align): Read a Unicode scalar value instead of a single character. (__write_padded): Change __fill_char parameter to char32_t and encode it into the output. (__formatter_str::format): Use new __unicode::__field_width and __unicode::__truncate functions. * include/std/ostream: Adjust namespace qualification for __literal_encoding_is_utf8. * include/std/print: Likewise. * src/c++23/print.cc: Add [[unlikely]] attribute to error path. * testsuite/ext/unicode/view.cc: New test. * testsuite/std/format/functions/format.cc: Add missing examples from the standard demonstrating alignment with non-ASCII characters. Add examples checking correct handling of extended grapheme clusters. contrib/ChangeLog: * unicode/README: Add notes about generating libstdc++ tables. * unicode/GraphemeBreakProperty.txt: New file. * unicode/emoji-data.txt: New file. * unicode/gen_libstdcxx_unicode_data.py: New file.
2023-12-16 23:30:20 +00:00
}
constexpr void
test_illformed_utf16()
{
std::u16string_view s = u"\N{CLOWN FACE}";
std::u16string_view r = u"\uFFFD";
VERIFY( std::ranges::equal(uc::_Utf16_view(s.substr(0, 1)), r) );
VERIFY( std::ranges::equal(uc::_Utf16_view(s.substr(1, 1)), r) );
std::array s2{ s[0], s[0] };
VERIFY( std::ranges::equal(uc::_Utf16_view(s2), u"\uFFFD\uFFFD"sv) );
std::array s3{ s[0], s[0], s[1] };
VERIFY( std::ranges::equal(uc::_Utf16_view(s3), u"\uFFFD\N{CLOWN FACE}"sv) );
std::array s4{ s[1], s[0] };
VERIFY( std::ranges::equal(uc::_Utf16_view(s4), u"\uFFFD\uFFFD"sv) );
std::array s5{ s[1], s[0], s[1] };
VERIFY( std::ranges::equal(uc::_Utf16_view(s5), u"\uFFFD\N{CLOWN FACE}"sv) );
}
constexpr void
test_illformed_utf32()
{
std::u32string_view s = U"\x110000";
VERIFY( std::ranges::equal(uc::_Utf32_view(s), U"\uFFFD"sv) );
s = U"\xFFFFFF";
VERIFY( std::ranges::equal(uc::_Utf32_view(s), U"\uFFFD"sv) );
s = U"\xFFFFFFF0";
VERIFY( std::ranges::equal(uc::_Utf32_view(s), U"\uFFFD"sv) );
}
libstdc++: Fix Unicode property detection functions Fix some copy & pasted logic in __is_extended_pictographic. This function should yield false for the values before the first edge, not true. Also add a missing boundary condition check in __incb_property. Also Fix an off-by-one error in _Utf_iterator::operator++() that would make dereferencing a past-the-end iterator undefined (where the intended design is that the iterator is always incrementable and dereferenceable, for better memory safety). Also simplify the grapheme view iterator, which still contained some remnants of an earlier design I was experimenting with. Slightly tweak the gen_libstdcxx_unicode_data.py script so that the _Gcb_property enumerators are in the order we encounter them in the data file, instead of sorting them alphabetically. Start with the "Other" property at value 0, because that's the default property for anything not in the file. This makes no practical difference, but seems cleaner. It causes the values in the __gcb_edges table to change, so can only be done now before anybody is using this code yet. The enumerator values and table entries become ABI artefacts for the function using them. contrib/ChangeLog: * unicode/gen_libstdcxx_unicode_data.py: Print out Gcb_property enumerators in the order they're seen, not alphabetical order. libstdc++-v3/ChangeLog: * include/bits/unicode-data.h: Regenerate. * include/bits/unicode.h (_Utf_iterator::operator++()): Fix off by one error. (__incb_property): Add missing check for values before the first edge. (__is_extended_pictographic): Invert return values to fix copy&pasted logic. (_Grapheme_cluster_view::_Iterator): Remove second iterator member and find end of cluster lazily. * testsuite/ext/unicode/grapheme_view.cc: New test. * testsuite/ext/unicode/properties.cc: New test. * testsuite/ext/unicode/view.cc: New test.
2024-01-09 14:43:40 +00:00
constexpr void
test_past_the_end()
{
const auto s8 = u8"1234"sv;
uc::_Utf32_view v(s8);
auto iter = v.begin();
std::advance(iter, 4);
VERIFY( iter == v.end() );
// Incrementing past the end has well-defined behaviour.
++iter;
VERIFY( iter == v.end() );
VERIFY( *iter == U'4' ); // Still dereferenceable.
++iter;
VERIFY( iter == v.end() );
VERIFY( *iter == U'4' );
iter++;
VERIFY( iter == v.end() );
VERIFY( *iter == U'4' );
std::string_view empty;
uc::_Utf32_view v2(empty);
auto iter2 = v2.begin();
VERIFY( iter2 == v2.end() );
VERIFY( *iter2 == U'\0' );
iter++;
VERIFY( iter2 == v2.end() );
VERIFY( *iter2 == U'\0' );
}
libstdc++: Add Unicode-aware width estimation for std::format This implements the requirements in the following proposals, which dictate how std::format deals with non-ASCII strings: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1868r1.html https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2572r1.html https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2675r1.pdf There are two parts to this. The width estimation for strings must only count the width of the first character in an extended grapheme cluster. That requires implementing the algorithm for detecting cluster breaks, which requires a number of lookup tables of the grapheme cluster break properties (and Indic_Conjunct_Break and Extended_Pictographic properties) of every code point. Additionally, some characters have a field width of 2, which requires another lookup table of field widths for every code point. The tables added in this commit do not contain entries for every code point from 0 to 0x10FFFF as that would be very inefficient and use too much memory. Instead the tables only contain the code points that form an "edge" for a property, omitting all the code points that have the same property as the preceding one. We can use a binary search to find the closest code point in the table that is not greater than the one we're looking for. The tables are generated by a new Python script added to the contrib/unicode directory, and a new data file downloaded from the Unicode Consortium website. The rules for extended grapheme cluster breaking are implemented for the latest Unicode standard, version 15.1.0. libstdc++-v3/ChangeLog: * include/Makefile.am: Add new headers. * include/Makefile.in: Regenerate. * include/bits/unicode.h: New file. * include/bits/unicode-data.h: New file. * include/std/format: Include <bits/unicode.h>. (__literal_encoding_is_utf8): Move to <bits/unicode.h>. (_Spec::_M_fill): Change type to char32_t. (_Spec::_M_parse_fill_and_align): Read a Unicode scalar value instead of a single character. (__write_padded): Change __fill_char parameter to char32_t and encode it into the output. (__formatter_str::format): Use new __unicode::__field_width and __unicode::__truncate functions. * include/std/ostream: Adjust namespace qualification for __literal_encoding_is_utf8. * include/std/print: Likewise. * src/c++23/print.cc: Add [[unlikely]] attribute to error path. * testsuite/ext/unicode/view.cc: New test. * testsuite/std/format/functions/format.cc: Add missing examples from the standard demonstrating alignment with non-ASCII characters. Add examples checking correct handling of extended grapheme clusters. contrib/ChangeLog: * unicode/README: Add notes about generating libstdc++ tables. * unicode/GraphemeBreakProperty.txt: New file. * unicode/emoji-data.txt: New file. * unicode/gen_libstdcxx_unicode_data.py: New file.
2023-12-16 23:30:20 +00:00
int main()
{
auto run_tests = []{
test_utf8_to_utf8();
test_utf8_to_utf16();
test_utf8_to_utf32();
test_illformed_utf8();
test_illformed_utf16();
test_illformed_utf32();
libstdc++: Fix Unicode property detection functions Fix some copy & pasted logic in __is_extended_pictographic. This function should yield false for the values before the first edge, not true. Also add a missing boundary condition check in __incb_property. Also Fix an off-by-one error in _Utf_iterator::operator++() that would make dereferencing a past-the-end iterator undefined (where the intended design is that the iterator is always incrementable and dereferenceable, for better memory safety). Also simplify the grapheme view iterator, which still contained some remnants of an earlier design I was experimenting with. Slightly tweak the gen_libstdcxx_unicode_data.py script so that the _Gcb_property enumerators are in the order we encounter them in the data file, instead of sorting them alphabetically. Start with the "Other" property at value 0, because that's the default property for anything not in the file. This makes no practical difference, but seems cleaner. It causes the values in the __gcb_edges table to change, so can only be done now before anybody is using this code yet. The enumerator values and table entries become ABI artefacts for the function using them. contrib/ChangeLog: * unicode/gen_libstdcxx_unicode_data.py: Print out Gcb_property enumerators in the order they're seen, not alphabetical order. libstdc++-v3/ChangeLog: * include/bits/unicode-data.h: Regenerate. * include/bits/unicode.h (_Utf_iterator::operator++()): Fix off by one error. (__incb_property): Add missing check for values before the first edge. (__is_extended_pictographic): Invert return values to fix copy&pasted logic. (_Grapheme_cluster_view::_Iterator): Remove second iterator member and find end of cluster lazily. * testsuite/ext/unicode/grapheme_view.cc: New test. * testsuite/ext/unicode/properties.cc: New test. * testsuite/ext/unicode/view.cc: New test.
2024-01-09 14:43:40 +00:00
test_past_the_end();
libstdc++: Add Unicode-aware width estimation for std::format This implements the requirements in the following proposals, which dictate how std::format deals with non-ASCII strings: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1868r1.html https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2572r1.html https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2675r1.pdf There are two parts to this. The width estimation for strings must only count the width of the first character in an extended grapheme cluster. That requires implementing the algorithm for detecting cluster breaks, which requires a number of lookup tables of the grapheme cluster break properties (and Indic_Conjunct_Break and Extended_Pictographic properties) of every code point. Additionally, some characters have a field width of 2, which requires another lookup table of field widths for every code point. The tables added in this commit do not contain entries for every code point from 0 to 0x10FFFF as that would be very inefficient and use too much memory. Instead the tables only contain the code points that form an "edge" for a property, omitting all the code points that have the same property as the preceding one. We can use a binary search to find the closest code point in the table that is not greater than the one we're looking for. The tables are generated by a new Python script added to the contrib/unicode directory, and a new data file downloaded from the Unicode Consortium website. The rules for extended grapheme cluster breaking are implemented for the latest Unicode standard, version 15.1.0. libstdc++-v3/ChangeLog: * include/Makefile.am: Add new headers. * include/Makefile.in: Regenerate. * include/bits/unicode.h: New file. * include/bits/unicode-data.h: New file. * include/std/format: Include <bits/unicode.h>. (__literal_encoding_is_utf8): Move to <bits/unicode.h>. (_Spec::_M_fill): Change type to char32_t. (_Spec::_M_parse_fill_and_align): Read a Unicode scalar value instead of a single character. (__write_padded): Change __fill_char parameter to char32_t and encode it into the output. (__formatter_str::format): Use new __unicode::__field_width and __unicode::__truncate functions. * include/std/ostream: Adjust namespace qualification for __literal_encoding_is_utf8. * include/std/print: Likewise. * src/c++23/print.cc: Add [[unlikely]] attribute to error path. * testsuite/ext/unicode/view.cc: New test. * testsuite/std/format/functions/format.cc: Add missing examples from the standard demonstrating alignment with non-ASCII characters. Add examples checking correct handling of extended grapheme clusters. contrib/ChangeLog: * unicode/README: Add notes about generating libstdc++ tables. * unicode/GraphemeBreakProperty.txt: New file. * unicode/emoji-data.txt: New file. * unicode/gen_libstdcxx_unicode_data.py: New file.
2023-12-16 23:30:20 +00:00
return true;
};
VERIFY( run_tests() );
static_assert( run_tests() );
}