gcc/libstdc++-v3/testsuite/27_io/print/2.cc

157 lines
4.8 KiB
C++

libstdc++: Implement C++23 <print> header [PR107760]

This adds the C++23 std::print functions, which use std::format to write to a FILE stream or std::ostream (defaulting to stdout).

The new extern symbols are in the libstdc++exp.a archive, so we aren't committing to stable symbols in the DSO yet. There's a UTF-8 validating and transcoding function added by this change. That can certainly be optimized, but it's internal to libstdc++exp.a so can be tweaked later at leisure.

Currently the external symbols work for all targets, but are only actually used for Windows, where it's necessary to transcode to UTF-16 to write to the console. The standard seems to encourage us to also diagnose invalid UTF-8 for non-Windows targets when writing to a terminal (and only when writing to a terminal), but I'm reliably informed that that wasn't the intent of the wording. Checking for invalid UTF-8 sequences only needs to happen for Windows, which is good as checking for a terminal requires a call to isatty, and on Linux that uses an ioctl syscall, which would make std::print ten times slower!

Testing the std::print behaviour is difficult if it depends on whether the output stream is connected to a Windows console or not, as we can't (as far as I know) do that non-interactively in DejaGNU. One of the new tests uses the internal __write_to_terminal function directly. That allows us to verify its UTF-8 error handling on POSIX targets, even though that's not actually used by std::print. For Windows, that __write_to_terminal function transcodes to UTF-16 but then uses WriteConsoleW which fails unless it really is writing to the console. That means the 27_io/print/2.cc test FAILs on Windows. The UTF-16 transcoding has been manually tested using mingw-w64 and Wine, and appears to work.

libstdc++-v3/ChangeLog:

PR libstdc++/107760
* include/Makefile.am: Add new header.
* include/Makefile.in: Regenerate.
* include/bits/version.def (__cpp_lib_print): Define.
* include/bits/version.h: Regenerate.
* include/std/format (__literal_encoding_is_utf8): New function.
(_Seq_sink::view()): New member function.
* include/std/ostream (vprintf_nonunicode, vprintf_unicode)
(print, println): New functions.
* include/std/print: New file.
* src/c++23/Makefile.am: Add new source file.
* src/c++23/Makefile.in: Regenerate.
* src/c++23/print.cc: New file.
* testsuite/27_io/basic_ostream/print/1.cc: New test.
* testsuite/27_io/print/1.cc: New test.
* testsuite/27_io/print/2.cc: New test.
2023-12-14 23:23:34 +00:00
// { dg-options "-lstdc++exp" }
// { dg-do run { target c++23 } }
// { dg-require-fileio "" }
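#include <print>
#include <system_error>
#include <climits>
#include <cstdio>
#include <cstring>
#include <fstream>   // std::ifstream
#include <iterator>  // std::istreambuf_iterator
#include <span>      // std::span
#include <string>    // std::string, std::u16string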
#include <testsuite_hooks.h>
#include <testsuite_fs.h>
#ifdef _WIN32
#include <io.h>
#endif
namespace std
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
  // This is an internal implementation detail that must not be used directly.
  // We need to declare it here to test its UTF-8 validation and transcoding.
  error_code __write_to_terminal(void*, span<char>);
_GLIBCXX_END_NAMESPACE_VERSION
}
// Test the internal __write_to_terminal function that vprintf_unicode uses.
// The string parameter will be written to a file, then the bytes of the file
// will be read back again. On Windows those bytes will be a UTF-16 string.
// Returns true if the string was valid UTF-8.
bool
as_printed_to_terminal(std::string& s)
{
  __gnu_test::scoped_file f;
  FILE* strm = std::fopen(f.path.string().c_str(), "w");
  VERIFY( strm );
#ifdef _WIN32
  void* handle = (void*)_get_osfhandle(_fileno(strm));
  const auto ec = std::__write_to_terminal(handle, s);
#else
  const auto ec = std::__write_to_terminal(strm, s);
#endif
  // Invalid UTF-8 is an expected outcome; any other error is a test failure.
  if (ec && ec != std::make_error_code(std::errc::illegal_byte_sequence))
    {
      std::println("Failed to write to terminal: {}", ec.message());
      VERIFY(!ec);
    }
  std::fclose(strm);
  std::ifstream in(f.path);
  s.assign(std::istreambuf_iterator<char>(in), {});
  return !ec;
}
void
test_utf8_validation()
{
#ifndef _WIN32
  std::string s = (const char*)u8"£🇬🇧 €🇪🇺";
  const std::string s2 = s;
  VERIFY( as_printed_to_terminal(s) );
  VERIFY( s == s2 );

  s += " \xa3 10.99 \xee \xdd";
  const std::string s3 = s;
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( s != s3 );
  std::string repl = (const char*)u8"\uFFFD";
  const std::string s4 = s2 + " " + repl + " 10.99 " + repl + " " + repl;
  VERIFY( s == s4 );

  s = "\xc0\x80";
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( s == repl + repl );
  s = "\xc0\xae";
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( s == repl + repl );

  // Examples of U+FFFD substitution from Unicode standard.
  std::string r4 = repl + repl + repl + repl;
  s = "\xc0\xaf\xe0\x80\xbf\xf0\x81\x82\x41"; // Table 3-8
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( s == r4 + r4 + "\x41" );
  s = "\xed\xa0\x80\xed\xbf\xbf\xed\xaf\x41"; // Table 3-9
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( s == r4 + r4 + "\x41" );
  s = "\xf4\x91\x92\x93\xff\x41\x80\xbf\x42"; // Table 3-10
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( s == r4 + repl + "\x41" + repl + repl + "\x42" );
  s = "\xe1\x80\xe2\xf0\x91\x92\xf1\xbf\x41"; // Table 3-11
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( s == r4 + "\x41" );
#endif
}
// Create a std::u16string from the bytes in a std::string.
std::u16string
utf16_from_bytes(const std::string& s)
{
  std::u16string u16;
  // s should have an even number of bytes. If it doesn't, we'll copy its
  // null terminator into the result, which will not match the expected value.
  const auto len = (s.size() + 1) / 2;
  u16.resize_and_overwrite(len, [&s](char16_t* p, size_t n) {
    std::memcpy(p, s.data(), n * sizeof(char16_t));
    return n;
  });
  return u16;
}
void
test_utf16_transcoding()
{
#ifdef _WIN32
  // FIXME: We can't test __write_to_terminal for Windows, because it
  // returns an INVALID_HANDLE Windows error when writing to a normal file.

  std::string s = (const char*)u8"£🇬🇧 €🇪🇺";
  const std::u16string s2 = u"£🇬🇧 €🇪🇺";
  VERIFY( as_printed_to_terminal(s) );
  VERIFY( utf16_from_bytes(s) == s2 );

  s = (const char*)u8"£🇬🇧 €🇪🇺";
  s += " \xa3 10.99 \xee\xdd";
  VERIFY( ! as_printed_to_terminal(s) );
  std::u16string repl = u"\uFFFD";
  const std::u16string s3 = s2 + u" " + repl + u" 10.99 " + repl + repl;
  VERIFY( utf16_from_bytes(s) == s3 );

  s = "\xc0\x80";
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( utf16_from_bytes(s) == repl + repl );
  s = "\xc0\xae";
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( utf16_from_bytes(s) == repl + repl );

  // Examples of U+FFFD substitution from Unicode standard.
  std::u16string r4 = repl + repl + repl + repl;
  s = "\xc0\xaf\xe0\x80\xbf\xf0\x81\x82\x41"; // Table 3-8
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( utf16_from_bytes(s) == r4 + r4 + u"\x41" );
  s = "\xed\xa0\x80\xed\xbf\xbf\xed\xaf\x41"; // Table 3-9
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( utf16_from_bytes(s) == r4 + r4 + u"\x41" );
  s = "\xf4\x91\x92\x93\xff\x41\x80\xbf\x42"; // Table 3-10
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( utf16_from_bytes(s) == r4 + repl + u"\x41" + repl + repl + u"\x42" );
  s = "\xe1\x80\xe2\xf0\x91\x92\xf1\xbf\x41"; // Table 3-11
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( utf16_from_bytes(s) == r4 + u"\x41" );
#endif
}
int main()
{
  test_utf8_validation();
  test_utf16_transcoding();
}