Find a file
Jonathan Wakely df0a668b78 libstdc++: Implement C++26 std::text_encoding (P1885R12) [PR113318]
This is another C++26 change, approved in Varna 2023. We require a new
static array of data that is extracted from the IANA Character Sets
database. A new Python script to generate a header from the IANA CSV
file is added.

The text_encoding class is basically just a pointer to an {ID,name} pair
in the static array. The aliases view is also just the same pointer (or
empty), and the view's iterator moves forwards and backwards in the
array while the array elements have the same ID (or to one element
further, for a past-the-end iterator).

Because those iterators refer to a global array that never goes out of
scope, there's no reason they should every produce undefined behaviour
or indeterminate values.  They should either have well-defined
behaviour, or abort. The overhead of ensuring those properties is pretty
low, so seems worth it.

This means that an aliases_view iterator should never be able to access
out-of-bounds. A non-value-initialized iterator always points to an
element of the static array even when not dereferenceable (the array has
unreachable entries at the start and end, which means that even a
past-the-end iterator for the last encoding in the array still points to
valid memory).  Dereferencing an iterator can always return a valid
array element, or "" for a non-dereferenceable iterator (but doing so
will abort when assertions are enabled).  In the language being proposed
for C++26, dereferencing an invalid iterator erroneously returns "".
Attempting to increment/decrement past the last/first element in the
view is erroneously a no-op, so aborts when assertions are enabled, and
doesn't change value otherwise.

Similarly, constructing a std::text_encoding with an invalid id (one
that doesn't have the value of an enumerator) erroneously behaves the
same as constructing with id::unknown, or aborts with assertions
enabled.

libstdc++-v3/ChangeLog:

	PR libstdc++/113318
	* acinclude.m4 (GLIBCXX_CONFIGURE): Add c++26 directory.
	(GLIBCXX_CHECK_TEXT_ENCODING): Define.
	* config.h.in: Regenerate.
	* configure: Regenerate.
	* configure.ac: Use GLIBCXX_CHECK_TEXT_ENCODING.
	* include/Makefile.am: Add new headers.
	* include/Makefile.in: Regenerate.
	* include/bits/locale_classes.h (locale::encoding): Declare new
	member function.
	* include/bits/unicode.h (__charset_alias_match): New function.
	* include/bits/text_encoding-data.h: New file.
	* include/bits/version.def (text_encoding): Define.
	* include/bits/version.h: Regenerate.
	* include/std/text_encoding: New file.
	* src/Makefile.am: Add new subdirectory.
	* src/Makefile.in: Regenerate.
	* src/c++26/Makefile.am: New file.
	* src/c++26/Makefile.in: New file.
	* src/c++26/text_encoding.cc: New file.
	* src/experimental/Makefile.am: Include c++26 convenience
	library.
	* src/experimental/Makefile.in: Regenerate.
	* python/libstdcxx/v6/printers.py (StdTextEncodingPrinter): New
	printer.
	* scripts/gen_text_encoding_data.py: New file.
	* testsuite/22_locale/locale/encoding.cc: New test.
	* testsuite/ext/unicode/charset_alias_match.cc: New test.
	* testsuite/std/text_encoding/cons.cc: New test.
	* testsuite/std/text_encoding/members.cc: New test.
	* testsuite/std/text_encoding/requirements.cc: New test.

Reviewed-by: Ulrich Drepper <drepper.fsp@gmail.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
2024-01-17 11:49:11 +00:00
.github Minor formatting fix for newly-added file from previous commit 2023-11-01 19:28:56 -04:00
c++tools Update copyright years. 2024-01-03 12:19:35 +01:00
config Daily bump. 2024-01-12 00:17:54 +00:00
contrib Daily bump. 2024-01-12 00:17:54 +00:00
fixincludes Daily bump. 2023-11-23 00:18:14 +00:00
gcc testsuite: Add testcase for already fixed PR [PR110251] 2024-01-17 11:33:14 +01:00
gnattools Update Copyright year in ChangeLog files 2024-01-03 11:35:18 +01:00
gotools Daily bump. 2023-11-04 00:16:45 +00:00
include Daily bump. 2024-01-14 00:17:47 +00:00
INSTALL
libada Update copyright years. 2024-01-03 12:19:35 +01:00
libatomic Update copyright years. 2024-01-03 12:19:35 +01:00
libbacktrace Update copyright years. 2024-01-05 08:54:28 +01:00
libcc1 Daily bump. 2024-01-10 00:18:30 +00:00
libcody Update Copyright year in ChangeLog files 2024-01-03 11:35:18 +01:00
libcpp Daily bump. 2024-01-05 00:18:48 +00:00
libdecnumber Update copyright years. 2024-01-03 12:19:35 +01:00
libffi Daily bump. 2023-10-27 00:17:12 +00:00
libgcc Daily bump. 2024-01-13 00:18:48 +00:00
libgfortran Daily bump. 2024-01-16 00:18:46 +00:00
libgm2 Daily bump. 2024-01-06 00:18:04 +00:00
libgo libgo: update configure.ac to upstream GCC 2023-11-30 13:23:53 -08:00
libgomp openmp: Add OpenMP _BitInt support [PR113409] 2024-01-17 10:47:31 +01:00
libgrust Daily bump. 2024-01-17 00:21:29 +00:00
libiberty Daily bump. 2024-01-14 00:17:47 +00:00
libitm Daily bump. 2024-01-04 00:18:45 +00:00
libobjc Update copyright years. 2024-01-03 12:19:35 +01:00
libphobos Update copyright years. 2024-01-05 08:54:28 +01:00
libquadmath Daily bump. 2024-01-04 00:18:45 +00:00
libsanitizer Sanitizer/MIPS: Use $t9 for preemptible function call 2024-01-17 17:03:08 +08:00
libssp Update copyright years. 2024-01-03 12:19:35 +01:00
libstdc++-v3 libstdc++: Implement C++26 std::text_encoding (P1885R12) [PR113318] 2024-01-17 11:49:11 +00:00
libvtv Update copyright years. 2024-01-03 12:19:35 +01:00
lto-plugin Update copyright years. 2024-01-03 12:19:35 +01:00
maintainer-scripts Daily bump. 2023-11-14 12:23:39 +00:00
zlib Daily bump. 2023-10-23 00:16:43 +00:00
.dir-locals.el
.gitattributes
.gitignore *: add modern gettext 2023-11-14 00:47:11 +01:00
ABOUT-NLS
ar-lib
ChangeLog Daily bump. 2024-01-17 00:21:29 +00:00
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in LoongArch: Reimplement multilib build option handling. 2023-09-15 10:42:12 +08:00
config.guess
config.rpath
config.sub
configure build: Add libgrust as compilation modules 2023-12-14 13:58:57 +01:00
configure.ac build: Add libgrust as compilation modules 2023-12-14 13:58:57 +01:00
COPYING
COPYING.LIB
COPYING.RUNTIME
COPYING3
COPYING3.LIB
depcomp
install-sh
libtool-ldflags
libtool.m4 Build: fix error in fixinclude configure 2023-11-22 11:54:33 +01:00
ltgcc.m4
ltmain.sh
ltoptions.m4
ltsugar.m4
ltversion.m4
lt~obsolete.m4
MAINTAINERS Add myself to the DCO section 2024-01-15 14:20:11 -08:00
Makefile.def gccrs: Fix missing build dependency 2024-01-16 16:23:02 +01:00
Makefile.in gccrs: Fix missing build dependency 2024-01-16 16:23:02 +01:00
Makefile.tpl Pass GUILE down to subdirectories 2024-01-09 08:02:31 -07:00
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
SECURITY.txt SECURITY.txt: Drop "exploitable" in reference to hardening issues 2024-01-09 10:49:01 -05:00
symlink-tree
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.