libstdc++: Implement C++23 <print> header [PR107760]

This adds the C++23 std::print functions, which use std::format to write
to a FILE stream or std::ostream (defaulting to stdout).
The new extern symbols are in the libstdc++exp.a archive, so we aren't
committing to stable symbols in the DSO yet. There's a UTF-8 validating
and transcoding function added by this change. That can certainly be
optimized, but it's internal to libstdc++exp.a so can be tweaked later
at leisure.
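
As an illustration of the kind of UTF-8 validation described above, here is a minimal sketch. It is not the code in src/c++23/print.cc; the names sanitized and sanitize_utf8 are hypothetical. It assumes the Unicode "maximal subpart" practice, where each ill-formed subsequence is replaced by a single U+FFFD.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical result type: the cleaned text plus a validity flag.
struct sanitized
{
  std::string text;
  bool was_valid;
};

// Hypothetical helper (an assumption, not a libstdc++ API): copy well-formed
// UTF-8 through unchanged and replace each maximal ill-formed subsequence
// with U+FFFD.
sanitized
sanitize_utf8(std::string_view in)
{
  static const char repl[] = "\xEF\xBF\xBD"; // U+FFFD encoded as UTF-8
  sanitized out{std::string(), true};
  std::size_t i = 0;
  while (i < in.size())
    {
      const unsigned char b = in[i];
      std::size_t need = 0;               // continuation bytes expected
      unsigned char lo = 0x80, hi = 0xBF; // range for the first continuation
      if (b <= 0x7F)
	need = 0;
      else if (b >= 0xC2 && b <= 0xDF)
	need = 1;
      else if (b == 0xE0)
	{ need = 2; lo = 0xA0; } // reject overlong 3-byte forms
      else if (b == 0xED)
	{ need = 2; hi = 0x9F; } // reject UTF-16 surrogates
      else if (b >= 0xE1 && b <= 0xEF)
	need = 2;
      else if (b == 0xF0)
	{ need = 3; lo = 0x90; } // reject overlong 4-byte forms
      else if (b == 0xF4)
	{ need = 3; hi = 0x8F; } // reject values above U+10FFFF
      else if (b >= 0xF1 && b <= 0xF3)
	need = 3;
      else
	{
	  // This byte can never start a well-formed sequence.
	  out.text += repl;
	  out.was_valid = false;
	  ++i;
	  continue;
	}

      // j counts the bytes accepted so far, including the lead byte.
      std::size_t j = 1;
      for (; j <= need && i + j < in.size(); ++j)
	{
	  const unsigned char c = in[i + j];
	  if (c < lo || c > hi)
	    break;
	  lo = 0x80; // later continuation bytes use the full range
	  hi = 0xBF;
	}

      if (j > need)
	out.text.append(in.substr(i, need + 1)); // complete sequence
      else
	{
	  out.text += repl; // truncated or ill-formed sequence
	  out.was_valid = false;
	}
      i += j;
    }
  return out;
}
```

With this sketch, sanitize_utf8("\xc0\x80") yields two U+FFFD characters and reports the input as invalid, matching the substitutions that the new 27_io/print/2.cc test expects.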
Currently the external symbols work for all targets, but are only
actually used for Windows, where it's necessary to transcode to UTF-16
to write to the console. The standard seems to encourage us to also
diagnose invalid UTF-8 for non-Windows targets when writing to a
terminal (and only when writing to a terminal), but I'm reliably
informed that that wasn't the intent of the wording. Checking for
invalid UTF-8 sequences only needs to happen for Windows, which is good
as checking for a terminal requires a call to isatty, and on Linux that
uses an ioctl syscall, which would make std::print ten times slower!
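
For reference, a terminal check on a POSIX target could look roughly like the sketch below. The is_terminal helper is hypothetical and is not part of this change. It only shows the isatty call whose cost is discussed above.

```cpp
#include <stdio.h>  // FILE, fileno (POSIX)
#include <unistd.h> // isatty (POSIX)

// Hypothetical helper (an assumption, not a libstdc++ function): report
// whether a stdio stream is connected to a terminal.
bool
is_terminal(FILE* f)
{
  const int fd = fileno(f); // file descriptor behind the stream
  return fd >= 0 && isatty(fd) == 1;
}
```

Calling this on every std::print call is exactly the overhead that the change avoids on non-Windows targets.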
Testing the std::print behaviour is difficult if it depends on whether
the output stream is connected to a Windows console or not, as we can't
(as far as I know) do that non-interactively in DejaGNU. One of the new
tests uses the internal __write_to_terminal function directly. That
allows us to verify its UTF-8 error handling on POSIX targets, even
though that's not actually used by std::print. For Windows, that
__write_to_terminal function transcodes to UTF-16 but then uses
WriteConsoleW which fails unless it really is writing to the console.
That means the 27_io/print/2.cc test FAILs on Windows. The UTF-16
transcoding has been manually tested using mingw-w64 and Wine, and
appears to work.
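
The UTF-16 side of the transcoding can be illustrated with the standard surrogate pair arithmetic. This is a sketch only: append_utf16 is a hypothetical name, and the input is assumed to be a valid Unicode scalar value that has already been decoded from UTF-8.

```cpp
#include <string>

// Hypothetical helper (an assumption, not the libstdc++ implementation):
// append one Unicode scalar value to a UTF-16 string.
void
append_utf16(std::u16string& out, char32_t cp)
{
  if (cp < 0x10000)
    out += static_cast<char16_t>(cp); // BMP: one code unit
  else
    {
      cp -= 0x10000;                                      // 20 bits remain
      out += static_cast<char16_t>(0xD800 + (cp >> 10));  // high surrogate
      out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF)); // low surrogate
    }
}
```

For example, U+00A3 (the pound sign) becomes a single code unit, while U+1F1EC (a regional indicator used in the flag emoji in the test) becomes the pair 0xD83C 0xDDEC.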
libstdc++-v3/ChangeLog:

	PR libstdc++/107760
	* include/Makefile.am: Add new header.
	* include/Makefile.in: Regenerate.
	* include/bits/version.def (__cpp_lib_print): Define.
	* include/bits/version.h: Regenerate.
	* include/std/format (__literal_encoding_is_utf8): New function.
	(_Seq_sink::view()): New member function.
	* include/std/ostream (vprintf_nonunicode, vprintf_unicode)
	(print, println): New functions.
	* include/std/print: New file.
	* src/c++23/Makefile.am: Add new source file.
	* src/c++23/Makefile.in: Regenerate.
	* src/c++23/print.cc: New file.
	* testsuite/27_io/basic_ostream/print/1.cc: New test.
	* testsuite/27_io/print/1.cc: New test.
	* testsuite/27_io/print/2.cc: New test.
2023-12-14 23:23:34 +00:00

// { dg-options "-lstdc++exp" }
// { dg-do run { target c++23 } }
// { dg-require-fileio "" }

#include <print>
#include <system_error>
#include <climits>
#include <cstdio>
#include <cstring>
#include <fstream>   // std::ifstream
#include <iterator>  // std::istreambuf_iterator
#include <string>
#include <testsuite_hooks.h>
#include <testsuite_fs.h>

#ifdef _WIN32
#include <io.h>
#endif

namespace std
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
// This is an internal implementation detail that must not be used directly.
// We need to declare it here so that we can test its behaviour.
error_code __write_to_terminal(void*, span<char>);
_GLIBCXX_END_NAMESPACE_VERSION
}

// Test the internal __write_to_terminal function that vprintf_unicode uses.
// The string parameter will be written to a file, then the bytes of the file
// will be read back again. On Windows those bytes will be a UTF-16 string.
// Returns true if the string was valid UTF-8.
bool
as_printed_to_terminal(std::string& s)
{
  __gnu_test::scoped_file f;
  FILE* strm = std::fopen(f.path.string().c_str(), "w");
  VERIFY( strm );
#ifdef _WIN32
  void* handle = (void*)_get_osfhandle(_fileno(strm));
  const auto ec = std::__write_to_terminal(handle, s);
#else
  const auto ec = std::__write_to_terminal(strm, s);
#endif
2023-12-15 12:58:37 +00:00
  if (ec && ec != std::make_error_code(std::errc::illegal_byte_sequence))
    {
      std::println("Failed to write to terminal: {}", ec.message());
      VERIFY(!ec);
    }
  std::fclose(strm);
  std::ifstream in(f.path);
  s.assign(std::istreambuf_iterator<char>(in), {});
  return !ec;
}

void
test_utf8_validation()
{
#ifndef _WIN32
  std::string s = (const char*)u8"£🇬🇧 €🇪🇺";
  const std::string s2 = s;
  VERIFY( as_printed_to_terminal(s) );
  VERIFY( s == s2 );

  s += " \xa3 10.99 \xee \xdd";
  const std::string s3 = s;
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( s != s3 );
  std::string repl = (const char*)u8"\uFFFD";
  const std::string s4 = s2 + " " + repl + " 10.99 " + repl + " " + repl;
  VERIFY( s == s4 );

  s = "\xc0\x80";
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( s == repl + repl );
  s = "\xc0\xae";
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( s == repl + repl );

  // Examples of U+FFFD substitution from Unicode standard.
  std::string r4 = repl + repl + repl + repl;
  s = "\xc0\xaf\xe0\x80\xbf\xf0\x81\x82\x41"; // Table 3-8
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( s == r4 + r4 + "\x41" );
  s = "\xed\xa0\x80\xed\xbf\xbf\xed\xaf\x41"; // Table 3-9
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( s == r4 + r4 + "\x41" );
  s = "\xf4\x91\x92\x93\xff\x41\x80\xbf\x42"; // Table 3-10
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( s == r4 + repl + "\x41" + repl + repl + "\x42" );
  s = "\xe1\x80\xe2\xf0\x91\x92\xf1\xbf\x41"; // Table 3-11
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( s == r4 + "\x41" );
#endif
}

// Create a std::u16string from the bytes in a std::string.
std::u16string
utf16_from_bytes(const std::string& s)
{
  std::u16string u16;
  // s should have an even number of bytes. If it doesn't, we'll copy its
  // null terminator into the result, which will not match the expected value.
  const auto len = (s.size() + 1) / 2;
  u16.resize_and_overwrite(len, [&s](char16_t* p, size_t n) {
    std::memcpy(p, s.data(), n * sizeof(char16_t));
    return n;
  });
  return u16;
}

void
test_utf16_transcoding()
{
#ifdef _WIN32
  // FIXME: We can't test __write_to_terminal for Windows, because it
  // returns an INVALID_HANDLE Windows error when writing to a normal file.

  std::string s = (const char*)u8"£🇬🇧 €🇪🇺";
  const std::u16string s2 = u"£🇬🇧 €🇪🇺";
  VERIFY( as_printed_to_terminal(s) );
  VERIFY( utf16_from_bytes(s) == s2 );

  s = (const char*)u8"£🇬🇧 €🇪🇺";
  s += " \xa3 10.99 \xee\xdd";
  VERIFY( ! as_printed_to_terminal(s) );
  std::u16string repl = u"\uFFFD";
  const std::u16string s3 = s2 + u" " + repl + u" 10.99 " + repl + repl;
  VERIFY( utf16_from_bytes(s) == s3 );

  s = "\xc0\x80";
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( utf16_from_bytes(s) == repl + repl );
  s = "\xc0\xae";
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( utf16_from_bytes(s) == repl + repl );

  // Examples of U+FFFD substitution from Unicode standard.
  std::u16string r4 = repl + repl + repl + repl;
  s = "\xc0\xaf\xe0\x80\xbf\xf0\x81\x82\x41"; // Table 3-8
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( utf16_from_bytes(s) == r4 + r4 + u"\x41" );
  s = "\xed\xa0\x80\xed\xbf\xbf\xed\xaf\x41"; // Table 3-9
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( utf16_from_bytes(s) == r4 + r4 + u"\x41" );
  s = "\xf4\x91\x92\x93\xff\x41\x80\xbf\x42"; // Table 3-10
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( utf16_from_bytes(s) == r4 + repl + u"\x41" + repl + repl + u"\x42" );
  s = "\xe1\x80\xe2\xf0\x91\x92\xf1\xbf\x41"; // Table 3-11
  VERIFY( ! as_printed_to_terminal(s) );
  VERIFY( utf16_from_bytes(s) == r4 + u"\x41" );
#endif
}

int main()
{
  test_utf8_validation();
  test_utf16_transcoding();
}