gcc/libcpp
Jakub Jelinek 194825f206 c++: Implement C++26 P1854R4 - Making non-encodable string literals ill-formed [PR110341]
This paper, voted in as a DR, makes some multi-character literals ill-formed.
'abcd' stays valid, but e.g. 'á' is newly invalid with a UTF-8 execution
character set while remaining valid e.g. with ISO-8859-1, because it is a
single character that needs 2 bytes to be encoded in UTF-8.

The following patch implements that check only pedantically, especially
because it is a DR.  When we'd emit a -Wmultichar warning because the
character constant has more than one byte in it, the patch checks whether
the number of source characters is equal to the number of bytes in the
multichar string.  If it is, it is a normal multi-character literal and is
diagnosed normally with -Wmultichar; otherwise at least one of the c-chars
in the sequence was encoded as 2+ bytes and the literal is diagnosed with
a pedwarn.
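
The comparison described above can be sketched in isolation.  This is an
illustration only, not the code in charset.cc: the names
count_utf8_code_points, CharConstKind and classify are hypothetical, and
the sketch assumes the execution-charset bytes are valid UTF-8, whereas the
patch itself counts characters by interpreting the token as CPP_STRING32.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not libcpp's count_source_chars): counts code points
// in a byte sequence assumed to be valid UTF-8 by counting every byte that
// is not a continuation byte (bit pattern 10xxxxxx).
inline std::size_t count_utf8_code_points(const std::string &bytes) {
    std::size_t n = 0;
    for (unsigned char c : bytes)
        if ((c & 0xC0) != 0x80)
            ++n;
    return n;
}

// Hypothetical classification of a narrow character constant.
enum class CharConstKind {
    Single,       // at most one byte, nothing to diagnose
    MultiChar,    // every c-char is one byte: plain -Wmultichar case
    NonEncodable  // some c-char needed 2+ bytes: the P1854R4 case
};

// Classify the execution-charset bytes of a narrow character constant by
// comparing the number of bytes with the number of source characters.
inline CharConstKind classify(const std::string &bytes) {
    const std::size_t chars = count_utf8_code_points(bytes);
    // A code point never occupies less than one byte.
    assert(chars <= bytes.size());
    if (bytes.size() <= 1)
        return CharConstKind::Single;
    return chars == bytes.size() ? CharConstKind::MultiChar
                                 : CharConstKind::NonEncodable;
}
```

With these assumptions, the bytes of 'abcd' classify as MultiChar (4 bytes,
4 characters), while the UTF-8 bytes of 'á' (0xC3 0xA1) classify as
NonEncodable (2 bytes, 1 character).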

2023-11-14  Jakub Jelinek  <jakub@redhat.com>

	PR c++/110341
libcpp/
	* charset.cc: Implement C++26 P1854R4 - Making non-encodable string
	literals ill-formed.
	(one_count_chars, convert_count_chars, count_source_chars): New
	functions.
	(narrow_str_to_charconst): Change last arg type from cpp_ttype to
	const cpp_token *.  For C++, if pedantic and i > 1 in CPP_CHAR,
	interpret the token also as CPP_STRING32 and if the number of characters
	in the CPP_STRING32 is smaller than the number of bytes in CPP_CHAR,
	pedwarn on it.  Make the diagnostics more detailed.
	(wide_str_to_charconst): Change last arg type from cpp_ttype to
	const cpp_token *.  Make the diagnostics more detailed.
	(cpp_interpret_charconst): Adjust narrow_str_to_charconst and
	wide_str_to_charconst callers.
gcc/testsuite/
	* g++.dg/cpp26/literals1.C: New test.
	* g++.dg/cpp26/literals2.C: New test.
	* g++.dg/cpp23/wchar-multi1.C: Adjust expected diagnostic wordings.
	* g++.dg/cpp23/wchar-multi2.C: Likewise.
	* gcc.dg/c23-utf8char-3.c: Likewise.
	* gcc.dg/cpp/charconst-4.c: Likewise.
	* gcc.dg/cpp/charconst.c: Likewise.
	* gcc.dg/cpp/if-2.c: Likewise.
	* gcc.dg/utf16-4.c: Likewise.
	* gcc.dg/utf32-4.c: Likewise.
	* g++.dg/cpp1z/utf8-neg.C: Likewise.
	* g++.dg/cpp2a/ucn2.C: Likewise.
	* g++.dg/ext/utf16-4.C: Likewise.
	* g++.dg/ext/utf32-4.C: Likewise.
2023-11-14 18:28:34 +01:00
include diagnostics: cleanups to diagnostic-show-locus.cc 2023-11-09 17:22:52 -05:00
po Daily bump. 2023-05-10 00:17:49 +00:00
aclocal.m4 *: add modern gettext 2023-11-14 00:47:11 +01:00
ChangeLog Daily bump. 2023-11-14 12:23:39 +00:00
ChangeLog.jit
charset.cc c++: Implement C++26 P1854R4 - Making non-encodable string literals ill-formed [PR110341] 2023-11-14 18:28:34 +01:00
combining-chars.inc diagnostics: add support for "text art" diagrams 2023-06-21 21:49:00 -04:00
config.in libcpp: Regenerate config.in 2023-11-14 01:02:22 +01:00
configure *: add modern gettext 2023-11-14 00:47:11 +01:00
configure.ac configure: Implement --enable-host-pie 2023-06-15 16:51:27 -04:00
directives.cc c: Refer more consistently to C23 not C2X 2023-11-07 14:20:30 +00:00
errors.cc Update copyright years. 2023-01-16 11:52:17 +01:00
expr.cc c: Refer more consistently to C23 not C2X 2023-11-07 14:20:30 +00:00
files.cc libcpp: Fix ICE on #include after a line marker directive [PR61474] 2023-09-20 16:44:24 -04:00
generated_cpp_wcwidth.h libcpp: Update cpp_wcwidth() to Unicode 15 2023-03-13 07:40:50 -04:00
identifiers.cc libcpp: Improve the diagnostic for poisoned identifiers [PR36887] 2023-10-23 18:35:26 -04:00
init.cc c: Refer more consistently to C23 not C2X 2023-11-07 14:20:30 +00:00
internal.h libcpp: Improve the diagnostic for poisoned identifiers [PR36887] 2023-10-23 18:35:26 -04:00
lex.cc c: Refer more consistently to C23 not C2X 2023-11-07 14:20:30 +00:00
line-map.cc diagnostics: cleanups to diagnostic-show-locus.cc 2023-11-09 17:22:52 -05:00
location-example.txt
macro.cc c: Refer more consistently to C23 not C2X 2023-11-07 14:20:30 +00:00
Makefile.in Update copyright years. 2023-01-16 11:52:17 +01:00
makeucnid.cc libcpp: Update Unicode copyright years 2023-03-16 10:19:04 +01:00
makeuname2c.cc libcpp: Update Unicode copyright years 2023-03-16 10:19:04 +01:00
mkdeps.cc p1689r5: initial support 2023-09-19 17:32:23 -04:00
pch.cc libcpp: Improve location for macro names [PR66290] 2023-06-20 16:58:12 -04:00
printable-chars.inc diagnostics: add support for "text art" diagrams 2023-06-21 21:49:00 -04:00
symtab.cc Update copyright years. 2023-01-16 11:52:17 +01:00
system.h Update copyright years. 2023-01-16 11:52:17 +01:00
traditional.cc Update copyright years. 2023-01-16 11:52:17 +01:00
ucnid.h libcpp: Update Unicode copyright years 2023-03-16 10:19:04 +01:00
ucnid.tab Update copyright years. 2023-01-16 11:52:17 +01:00
uname2c.h libcpp: Update Unicode copyright years 2023-03-16 10:19:04 +01:00