This patch implements support for [P1689R5][] to communicate to a build
system the C++20 module dependencies to build systems so that they may
build `.gcm` files in the proper order.
Support is communicated through the following three new flags:
- `-fdeps-format=` specifies the format for the output. Currently named
`p1689r5`.
- `-fdeps-file=` specifies the path to the file to write the format to.
- `-fdeps-target=` specifies the `.o` that will be written for the TU
that is scanned. This is required so that the build system can
correlate the dependency output with the actual compilation that will
occur.
CMake supports this format as of 17 Jun 2022 (to be part of 3.25.0)
using an experimental feature selection (to allow for future usage
evolution without committing to how it works today). While it remains
experimental, docs may be found in CMake's documentation for
experimental features.
Future work may include using this format for Fortran module
dependencies as well, however this is still pending work.
[P1689R5]: https://isocpp.org/files/papers/P1689R5.html
[cmake-experimental]: https://gitlab.kitware.com/cmake/cmake/-/blob/master/Help/dev/experimental.rst
TODO:
- header-unit information fields
Header units (including the standard library headers) are 100%
unsupported right now because the `-E` mechanism wants to import their
BMIs. A new mode (i.e., something more workable than existing `-E`
behavior) that mocks up header units as if they were imported purely
from their path and content would be required.
- non-utf8 paths
The current standard says that paths that are not unambiguously
represented using UTF-8 are not supported (because these cases are rare
and the extra complication is not worth it at this time). Future
versions of the format might have ways of encoding non-UTF-8 paths. For
now, this patch just doesn't support non-UTF-8 paths (ignoring the
"unambiguously representable in UTF-8" case).
- figure out why junk gets placed at the end of the file
Sometimes it seems like the file gets a lot of `NUL` bytes appended to
it. It happens rarely and seems to be the result of some
`ftruncate`-style call which results in extra padding in the contents.
Noting it here as an observation at least.
libcpp/
* include/cpplib.h: Add cpp_fdeps_format enum.
(cpp_options): Add fdeps_format field
(cpp_finish): Add structured dependency fdeps_stream parameter.
* include/mkdeps.h (deps_add_module_target): Add flag for
whether a module is exported or not.
(fdeps_add_target): Add function.
(deps_write_p1689r5): Add function.
* init.cc (cpp_finish): Add new preprocessor parameter used for C++
module tracking.
* mkdeps.cc (mkdeps): Implement P1689R5 output.
gcc/
* doc/invoke.texi: Document -fdeps-format=, -fdeps-file=, and
-fdeps-target= flags.
* gcc.cc: add defaults for -fdeps-target= and -fdeps-file= when
only -fdeps-format= is specified.
* json.h: Add a TODO item to refactor out to share with
`libcpp/mkdeps.cc`.
gcc/c-family/
* c-opts.cc (c_common_handle_option): Add fdeps_file variable and
-fdeps-format=, -fdeps-file=, and -fdeps-target= parsing.
* c.opt: Add -fdeps-format=, -fdeps-file=, and -fdeps-target=
flags.
gcc/cp/
* module.cc (preprocessed_module): Pass whether the module is
exported to dependency tracking.
gcc/testsuite/
* g++.dg/modules/depflags-f-MD.C: New test.
* g++.dg/modules/depflags-f.C: New test.
* g++.dg/modules/depflags-fi.C: New test.
* g++.dg/modules/depflags-fj-MD.C: New test.
* g++.dg/modules/depflags-fj.C: New test.
* g++.dg/modules/depflags-fjo-MD.C: New test.
* g++.dg/modules/depflags-fjo.C: New test.
* g++.dg/modules/depflags-fo-MD.C: New test.
* g++.dg/modules/depflags-fo.C: New test.
* g++.dg/modules/depflags-j-MD.C: New test.
* g++.dg/modules/depflags-j.C: New test.
* g++.dg/modules/depflags-jo-MD.C: New test.
* g++.dg/modules/depflags-jo.C: New test.
* g++.dg/modules/depflags-o-MD.C: New test.
* g++.dg/modules/depflags-o.C: New test.
* g++.dg/modules/p1689-1.C: New test.
* g++.dg/modules/p1689-1.exp.ddi: New test expectation.
* g++.dg/modules/p1689-2.C: New test.
* g++.dg/modules/p1689-2.exp.ddi: New test expectation.
* g++.dg/modules/p1689-3.C: New test.
* g++.dg/modules/p1689-3.exp.ddi: New test expectation.
* g++.dg/modules/p1689-4.C: New test.
* g++.dg/modules/p1689-4.exp.ddi: New test expectation.
* g++.dg/modules/p1689-5.C: New test.
* g++.dg/modules/p1689-5.exp.ddi: New test expectation.
* g++.dg/modules/modules.exp: Load new P1689 library routines.
* g++.dg/modules/test-p1689.py: New tool for validating P1689 output.
* lib/modules.exp: Support for validating P1689 outputs.
Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
This patch adds the C FE support, c-family support, small libcpp change
so that 123wb and 42uwb suffixes are handled plus glimits.h change
to define BITINT_MAXWIDTH macro.
The previous patches really do nothing without this, which enables
all the support.
2023-09-06 Jakub Jelinek <jakub@redhat.com>
PR c/102989
gcc/
* glimits.h (BITINT_MAXWIDTH): Define if __BITINT_MAXWIDTH__ is
predefined.
gcc/c-family/
* c-common.cc (c_common_reswords): Add _BitInt as keyword.
(unsafe_conversion_p): Handle BITINT_TYPE like INTEGER_TYPE.
(c_common_signed_or_unsigned_type): Handle BITINT_TYPE.
(c_common_truthvalue_conversion, c_common_get_alias_set,
check_builtin_function_arguments): Handle BITINT_TYPE like
INTEGER_TYPE.
(sync_resolve_size): Add ORIG_FORMAT argument. If
FETCH && !ORIG_FORMAT, type is BITINT_TYPE, return -1 if size isn't
one of 1, 2, 4, 8 or 16 or if it is 16 but TImode is not supported.
(atomic_bitint_fetch_using_cas_loop): New function.
(resolve_overloaded_builtin): Adjust sync_resolve_size caller. If
-1 is returned, use atomic_bitint_fetch_using_cas_loop to lower it.
Formatting fix.
(keyword_begins_type_specifier): Handle RID_BITINT.
* c-common.h (enum rid): Add RID_BITINT enumerator.
* c-cppbuiltin.cc (c_cpp_builtins): For C call
targetm.c.bitint_type_info and predefine __BITINT_MAXWIDTH__
and for -fbuilding-libgcc also __LIBGCC_BITINT_LIMB_WIDTH__ and
__LIBGCC_BITINT_ORDER__ macros if _BitInt is supported.
* c-lex.cc (interpret_integer): Handle CPP_N_BITINT.
* c-pretty-print.cc (c_pretty_printer::simple_type_specifier,
c_pretty_printer::direct_abstract_declarator,
c_pretty_printer::direct_declarator, c_pretty_printer::declarator):
Handle BITINT_TYPE.
(pp_c_integer_constant): Handle printing of large precision wide_ints
which would buffer overflow digit_buffer.
* c-warn.cc (conversion_warning, warnings_for_convert_and_check,
warnings_for_convert_and_check): Handle BITINT_TYPE like
INTEGER_TYPE.
gcc/c/
* c-convert.cc (c_convert): Handle BITINT_TYPE like INTEGER_TYPE.
* c-decl.cc (check_bitfield_type_and_width): Allow BITINT_TYPE
bit-fields.
(finish_struct): Prefer to use BITINT_TYPE for BITINT_TYPE bit-fields
if possible.
(declspecs_add_type): Formatting fixes. Handle cts_bitint. Adjust
for added union in *specs. Handle RID_BITINT.
(finish_declspecs): Handle cts_bitint. Adjust for added union
in *specs.
* c-parser.cc (c_keyword_starts_typename, c_token_starts_declspecs,
c_parser_declspecs, c_parser_gnu_attribute_any_word): Handle
RID_BITINT.
(c_parser_omp_clause_schedule): Handle BITINT_TYPE like INTEGER_TYPE.
* c-tree.h (enum c_typespec_keyword): Mention _BitInt in comment.
Add cts_bitint enumerator.
(struct c_declspecs): Move int_n_idx and floatn_nx_idx into a union
and add bitint_prec there as well.
* c-typeck.cc (c_common_type, comptypes_internal):
Handle BITINT_TYPE.
(perform_integral_promotions): Promote BITINT_TYPE bit-fields to
their declared type.
(build_array_ref, build_unary_op, build_conditional_expr,
build_c_cast, convert_for_assignment, digest_init, build_binary_op):
Handle BITINT_TYPE.
* c-fold.cc (c_fully_fold_internal): Handle BITINT_TYPE like
INTEGER_TYPE.
* c-aux-info.cc (gen_type): Handle BITINT_TYPE.
libcpp/
* expr.cc (interpret_int_suffix): Handle wb and WB suffixes.
* include/cpplib.h (CPP_N_BITINT): Define.
It seems prudent to add C++26 now that the first C++26 papers have been
approved. I followed commit r11-6920 as well as r8-3237.
Since C++23 is essentially finished and its __cplusplus value has
settled to 202302L, I've updated cpp_init_builtins and marked
-std=c++2b Undocumented and made -std=c++23 no longer Undocumented.
As for __cplusplus, I've chosen 202400L:
$ xg++ -std=c++26 -dM -E -x c++ - < /dev/null | grep cplusplus
#define __cplusplus 202400L
I've verified the patch with a simple test, exercising the new
directives. Don't forget to update your GXX_TESTSUITE_STDS!
This patch does not add -Wc++26-extensions.
gcc/c-family/ChangeLog:
* c-common.h (cxx_dialect): Add cxx26 as a dialect.
* c-opts.cc (set_std_cxx26): New.
(c_common_handle_option): Set options when -std={c,gnu}++2{c,6} is
enabled.
(c_common_post_options): Adjust comments.
* c.opt: Add options for -std=c++26, std=c++2c, -std=gnu++26,
and -std=gnu++2c.
(std=c++2b): Mark as Undocumented.
(std=c++23): No longer Undocumented.
gcc/ChangeLog:
* doc/cpp.texi (__cplusplus): Document value for -std=c++26 and
-std=gnu++26. Document that for C++23, its value is 202302L.
* doc/invoke.texi: Document -std=c++26 and -std=gnu++26.
* dwarf2out.cc (highest_c_language): Handle GNU C++26.
(gen_compile_unit_die): Likewise.
libcpp/ChangeLog:
* include/cpplib.h (c_lang): Add CXX26 and GNUCXX26.
* init.cc (lang_defaults): Add rows for CXX26 and GNUCXX26.
(cpp_init_builtins): Set __cplusplus to 202400L for C++26.
Set __cplusplus to 202302L for C++23.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp (check_effective_target_c++23): Return
1 also if check_effective_target_c++26.
(check_effective_target_c++23_down): New.
(check_effective_target_c++26_only): New.
(check_effective_target_c++26): New.
* g++.dg/cpp23/cplusplus.C: Adjust expected value.
* g++.dg/cpp26/cplusplus.C: New test.
Existing text output in GCC has to be implemented by writing
sequentially to a pretty_printer instance. This makes it
hard to implement some kinds of diagnostic output (see e.g.
diagnostic-show-locus.cc).
This patch adds more flexible ways of creating text output:
- a canvas class, which can be "painted" to via random-access (rather
that sequentially)
- a table class for 2D grid layout, supporting items that span
multiple rows/columns
- a widget class for organizing diagrams hierarchically.
The patch also expands GCC's diagnostics subsystem so that diagnostics
can have "text art" diagrams - think ASCII art, but potentially
including some Unicode characters, such as box-drawing chars.
The new code is in a new "gcc/text-art" subdirectory and "text_art"
namespace.
The patch adds a new "-fdiagnostics-text-art-charset=VAL" option, with
values:
- "none": don't emit diagrams (added to -fdiagnostics-plain-output)
- "ascii": use pure ASCII in diagrams
- "unicode": allow for conservative use of unicode drawing characters
(such as box-drawing characters).
- "emoji" (the default): as "unicode", but potentially allow for
conservative use of emoji in the output (such as U+26A0 WARNING SIGN).
I made it possible to disable emoji separately from unicode as I believe
there's a generation gap in acceptance of these characters (some older
programmers have a visceral reaction against them, whereas younger
programmers may have no problem with them).
Diagrams are emitted to stderr by default. With SARIF output they are
captured as a location in "relatedLocations", with the diagram as a
code block in Markdown within a "markdown" property of a message.
This patch doesn't add any such diagram usage to GCC, saving that for
followups, apart from adding a plugin to the test suite to exercise the
functionality.
contrib/ChangeLog:
* unicode/gen-box-drawing-chars.py: New file.
* unicode/gen-combining-chars.py: New file.
* unicode/gen-printable-chars.py: New file.
gcc/ChangeLog:
* Makefile.in (OBJS-libcommon): Add text-art/box-drawing.o,
text-art/canvas.o, text-art/ruler.o, text-art/selftests.o,
text-art/style.o, text-art/styled-string.o, text-art/table.o,
text-art/theme.o, and text-art/widget.o.
* color-macros.h (COLOR_FG_BRIGHT_BLACK): New.
(COLOR_FG_BRIGHT_RED): New.
(COLOR_FG_BRIGHT_GREEN): New.
(COLOR_FG_BRIGHT_YELLOW): New.
(COLOR_FG_BRIGHT_BLUE): New.
(COLOR_FG_BRIGHT_MAGENTA): New.
(COLOR_FG_BRIGHT_CYAN): New.
(COLOR_FG_BRIGHT_WHITE): New.
(COLOR_BG_BRIGHT_BLACK): New.
(COLOR_BG_BRIGHT_RED): New.
(COLOR_BG_BRIGHT_GREEN): New.
(COLOR_BG_BRIGHT_YELLOW): New.
(COLOR_BG_BRIGHT_BLUE): New.
(COLOR_BG_BRIGHT_MAGENTA): New.
(COLOR_BG_BRIGHT_CYAN): New.
(COLOR_BG_BRIGHT_WHITE): New.
* common.opt (fdiagnostics-text-art-charset=): New option.
(diagnostic-text-art.h): New SourceInclude.
(diagnostic_text_art_charset) New Enum and EnumValues.
* configure: Regenerate.
* configure.ac (gccdepdir): Add text-art to loop.
* diagnostic-diagram.h: New file.
* diagnostic-format-json.cc (json_emit_diagram): New.
(diagnostic_output_format_init_json): Wire it up to
context->m_diagrams.m_emission_cb.
* diagnostic-format-sarif.cc: Include "diagnostic-diagram.h" and
"text-art/canvas.h".
(sarif_result::on_nested_diagnostic): Move code to...
(sarif_result::add_related_location): ...this new function.
(sarif_result::on_diagram): New.
(sarif_builder::emit_diagram): New.
(sarif_builder::make_message_object_for_diagram): New.
(sarif_emit_diagram): New.
(diagnostic_output_format_init_sarif): Set
context->m_diagrams.m_emission_cb to sarif_emit_diagram.
* diagnostic-text-art.h: New file.
* diagnostic.cc: Include "diagnostic-text-art.h",
"diagnostic-diagram.h", and "text-art/theme.h".
(diagnostic_initialize): Initialize context->m_diagrams and
call diagnostics_text_art_charset_init.
(diagnostic_finish): Clean up context->m_diagrams.m_theme.
(diagnostic_emit_diagram): New.
(diagnostics_text_art_charset_init): New.
* diagnostic.h (text_art::theme): New forward decl.
(class diagnostic_diagram): Likewise.
(diagnostic_context::m_diagrams): New field.
(diagnostic_emit_diagram): New decl.
* doc/invoke.texi (Diagnostic Message Formatting Options): Add
-fdiagnostics-text-art-charset=.
(-fdiagnostics-plain-output): Add
-fdiagnostics-text-art-charset=none.
* gcc.cc: Include "diagnostic-text-art.h".
(driver_handle_option): Handle OPT_fdiagnostics_text_art_charset_.
* opts-common.cc (decode_cmdline_options_to_array): Add
"-fdiagnostics-text-art-charset=none" to expanded_args for
-fdiagnostics-plain-output.
* opts.cc: Include "diagnostic-text-art.h".
(common_handle_option): Handle OPT_fdiagnostics_text_art_charset_.
* pretty-print.cc (pp_unicode_character): New.
* pretty-print.h (pp_unicode_character): New decl.
* selftest-run-tests.cc: Include "text-art/selftests.h".
(selftest::run_tests): Call text_art_tests.
* text-art/box-drawing-chars.inc: New file, generated by
contrib/unicode/gen-box-drawing-chars.py.
* text-art/box-drawing.cc: New file.
* text-art/box-drawing.h: New file.
* text-art/canvas.cc: New file.
* text-art/canvas.h: New file.
* text-art/ruler.cc: New file.
* text-art/ruler.h: New file.
* text-art/selftests.cc: New file.
* text-art/selftests.h: New file.
* text-art/style.cc: New file.
* text-art/styled-string.cc: New file.
* text-art/table.cc: New file.
* text-art/table.h: New file.
* text-art/theme.cc: New file.
* text-art/theme.h: New file.
* text-art/types.h: New file.
* text-art/widget.cc: New file.
* text-art/widget.h: New file.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/diagnostic-test-text-art-ascii-bw.c: New test.
* gcc.dg/plugin/diagnostic-test-text-art-ascii-color.c: New test.
* gcc.dg/plugin/diagnostic-test-text-art-none.c: New test.
* gcc.dg/plugin/diagnostic-test-text-art-unicode-bw.c: New test.
* gcc.dg/plugin/diagnostic-test-text-art-unicode-color.c: New test.
* gcc.dg/plugin/diagnostic_plugin_test_text_art.c: New test plugin.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add them.
libcpp/ChangeLog:
* charset.cc (get_cppchar_property): New function template, based
on...
(cpp_wcwidth): ...this function. Rework to use the above.
Include "combining-chars.inc".
(cpp_is_combining_char): New function
Include "printable-chars.inc".
(cpp_is_printable_char): New function
* combining-chars.inc: New file, generated by
contrib/unicode/gen-combining-chars.py.
* include/cpplib.h (cpp_is_combining_char): New function decl.
(cpp_is_printable_char): New function decl.
* printable-chars.inc: New file, generated by
contrib/unicode/gen-printable-chars.py.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
PR analyzer/109098 notes that the SARIF spec mandates that .sarif
files are UTF-8 encoded, but -fdiagnostics-format=sarif-file naively
assumes that the source files are UTF-8 encoded when quoting source
artefacts in the .sarif output, which can lead to us writing out
.sarif files with non-UTF-8 bytes in them (which break my reporting
scripts).
The root cause is that sarif_builder::maybe_make_artifact_content_object
was using maybe_read_file to load the file content as bytes, and
assuming they were UTF-8 encoded.
This patch reworks both overloads of this function (one used for the
whole file, the other for snippets of quoted lines) so that they go
through input.cc's file cache, which attempts to decode the input files
according to the input charset, and then encode as UTF-8. They also
check that the result actually is UTF-8, for cases where the input
charset is missing, or incorrectly specified, and omit the quoted
source for such awkward cases.
Doing so fixes all of the cases I've encountered.
The patch adds a new:
{ dg-final { verify-sarif-file } }
directive to all SARIF test cases in the test suite, which verifies
that the output is UTF-8 encoded, and is valid JSON. In particular
it verifies that when we complain about encoding problems, the .sarif
report we emit is itself correctly encoded.
gcc/ChangeLog:
PR analyzer/109098
* diagnostic-format-sarif.cc (read_until_eof): Delete.
(maybe_read_file): Delete.
(sarif_builder::maybe_make_artifact_content_object): Use
get_source_file_content rather than maybe_read_file.
Reject it if it's not valid UTF-8.
* input.cc (file_cache_slot::get_full_file_content): New.
(get_source_file_content): New.
(selftest::check_cpp_valid_utf8_p): New.
(selftest::test_cpp_valid_utf8_p): New.
(selftest::input_cc_tests): Call selftest::test_cpp_valid_utf8_p.
* input.h (get_source_file_content): New prototype.
gcc/testsuite/ChangeLog:
PR analyzer/109098
* c-c++-common/diagnostic-format-sarif-file-1.c: Add
verify-sarif-file directive.
* c-c++-common/diagnostic-format-sarif-file-2.c: Likewise.
* c-c++-common/diagnostic-format-sarif-file-3.c: Likewise.
* c-c++-common/diagnostic-format-sarif-file-4.c: Likewise.
* c-c++-common/diagnostic-format-sarif-file-Wbidi-chars.c: New
test case, adapted from Wbidi-chars-1.c.
* c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c:
New test case.
* c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-2.c:
New test case.
* c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-3.c:
New test case, adapted from cpp/Winvalid-utf8-1.c.
* c-c++-common/diagnostic-format-sarif-file-valid-CP850.c: New
test case, adapted from gcc.dg/diagnostic-input-charset-1.c.
* gcc.dg/plugin/crash-test-ice-sarif.c: Add verify-sarif-file
directive.
* gcc.dg/plugin/crash-test-write-though-null-sarif.c: Likewise.
* gcc.dg/plugin/diagnostic-test-paths-5.c: Likewise.
* lib/scansarif.exp (verify-sarif-file): New procedure.
* lib/verify-sarif-file.py: New support script.
libcpp/ChangeLog:
PR analyzer/109098
* charset.cc (cpp_valid_utf8_p): New function.
* include/cpplib.h (cpp_valid_utf8_p): New prototype.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
When a GTY'ed struct is streamed to PCH, any plain char* pointers it contains
(whether they live in GC-controlled memory or not) will be marked for PCH
output by the routine gt_pch_note_object in ggc-common.cc. This routine
special-cases plain char* strings, and in particular it uses strlen() to get
their length. Thus it does not handle strings with embedded null bytes, but it
is possible for something PCH cares about (such as a string literal token in a
macro definition) to contain such embedded nulls. To fix that up, add a new
GTY option "string_length" so that gt_pch_note_object can be informed the
actual length it ought to use, and use it in the relevant libcpp structs
(cpp_string and ht_identifier) accordingly.
gcc/ChangeLog:
* gengtype.cc (output_escaped_param): Add missing const.
(get_string_option): Add missing check for option type.
(walk_type): Support new "string_length" GTY option.
(write_types_process_field): Likewise.
* ggc-common.cc (gt_pch_note_object): Add optional length argument.
* ggc.h (gt_pch_note_object): Adjust prototype for new argument.
(gt_pch_n_S2): Declare...
* stringpool.cc (gt_pch_n_S2): ...new function.
* doc/gty.texi: Document new GTY((string_length)) option.
libcpp/ChangeLog:
* include/cpplib.h (struct cpp_string): Use new "string_length" GTY.
* include/symtab.h (struct ht_identifier): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/pch/pch-string-nulls.C: New test.
* g++.dg/pch/pch-string-nulls.Hs: New test.
C2x has, like C++, adopted rules for identifiers based directly on an
unversioned normative reference to Unicode. Make libcpp follow those
rules for c2x / gnu2x standards (this involves bringing back a flag
separate from the C++ one for whether to use these identifier rules,
but this time enabled for all C++ language versions since that was the
conclusion adopted for C++ identifier handling).
There is one change here that affects C++. I believe the new
normative requirement for NFC only applies to identifiers, not to the
use of identifier-continue characters in pp-numbers, where there is no
such requirement and so the diagnostic ought to be a warning not a
pedwarn in pp-numbers, and that this is the case for both C and C++.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
libcpp/
* charset.cc (ucn_valid_in_identifier): Check xid_identifiers not
cplusplus to determine whether to use CXX23 and NXX23 flags.
* include/cpplib.h (struct cpp_options): Add xid_identifiers.
* init.cc (struct lang_flags, lang_defaults): Add xid_identifiers.
(cpp_set_lang): Set xid_identifiers.
* lex.cc (warn_about_normalization): Add parameter identifier.
Only pedwarn about non-NFC for identifiers, not pp-numbers.
(_cpp_lex_direct): Update calls to warn_about_normalization.
gcc/testsuite/
* gcc.dg/cpp/c2x-ucnid-1-utf8.c, gcc.dg/cpp/c2x-ucnid-1.c: New
tests.
Here is a complete patch to add std::bfloat16_t support on
x86 (AArch64 and ARM left for later). Almost no BFmode optabs
are added by the patch, so for binops/unops it extends to SFmode
first and then truncates back to BFmode.
For {HF,SF,DF,XF,TF}mode -> BFmode conversions libgcc has implementations
of all those conversions so that we avoid double rounding, for
BFmode -> {DF,XF,TF}mode conversions to avoid growing libgcc too much
it emits BFmode -> SFmode conversion first and then converts to the even
wider mode, neither step should be imprecise.
For BFmode -> HFmode, it first emits a precise BFmode -> SFmode conversion
and then SFmode -> HFmode, because neither format is subset or superset
of the other, while SFmode is superset of both.
expr.cc then contains a -ffast-math optimization of the BF -> SF and
SF -> BF conversions if we don't optimize for space (and for the latter
if -frounding-math isn't enabled either).
For x86, perhaps truncsfbf2 optab could be defined for TARGET_AVX512BF16
but IMNSHO should FAIL if !flag_finite_math || flag_rounding_math
|| !flag_unsafe_math_optimizations, because I think the insn doesn't
raise on sNaNs, hardcodes round to nearest and flushes denormals to zero.
By default (unless x86 -fexcess-precision=16) we use float excess
precision for BFmode, so truncate only on explicit casts and assignments.
The patch introduces a single __bf16 builtin - __builtin_nansf16b,
because (__bf16) __builtin_nansf ("") will drop the sNaN into qNaN,
and uses f16b suffix instead of bf16 because there would be ambiguity on
log vs. logb - __builtin_logbf16 could be either log with bf16 suffix
or logb with f16 suffix. In other cases libstdc++ should mostly use
__builtin_*f for std::bfloat16_t overloads (we have a problem with
std::nextafter though but that one we have also for std::float16_t).
2022-10-14 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree-core.h (enum tree_index): Add TI_BFLOAT16_TYPE.
* tree.h (bfloat16_type_node): Define.
* tree.cc (excess_precision_type): Promote bfloat16_type_mode
like float16_type_mode.
(build_common_tree_nodes): Initialize bfloat16_type_node if
BFmode is supported.
* expmed.h (maybe_expand_shift): Declare.
* expmed.cc (maybe_expand_shift): No longer static.
* expr.cc (convert_mode_scalar): Don't ICE on BF -> HF or HF -> BF
conversions. If there is no optab, handle BF -> {DF,XF,TF,HF}
conversions as separate BF -> SF -> {DF,XF,TF,HF} conversions, add
-ffast-math generic implementation for BF -> SF and SF -> BF
conversions.
* builtin-types.def (BT_BFLOAT16, BT_FN_BFLOAT16_CONST_STRING): New.
* builtins.def (BUILT_IN_NANSF16B): New builtin.
* fold-const-call.cc (fold_const_call): Handle CFN_BUILT_IN_NANSF16B.
* config/i386/i386.cc (classify_argument): Handle E_BCmode.
(ix86_libgcc_floating_mode_supported_p): Also return true for BFmode
for -msse2.
(ix86_mangle_type): Mangle BFmode as DF16b.
(ix86_invalid_conversion, ix86_invalid_unary_op,
ix86_invalid_binary_op): Remove.
(TARGET_INVALID_CONVERSION, TARGET_INVALID_UNARY_OP,
TARGET_INVALID_BINARY_OP): Don't redefine.
* config/i386/i386-builtins.cc (ix86_bf16_type_node): Remove.
(ix86_register_bf16_builtin_type): Use bfloat16_type_node rather than
ix86_bf16_type_node, only create it if still NULL.
* config/i386/i386-builtin-types.def (BFLOAT16): Likewise.
* config/i386/i386.md (cbranchbf4, cstorebf4): New expanders.
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): If bfloat16_type_node,
predefine __BFLT16_*__ macros and for C++23 also
__STDCPP_BFLOAT16_T__. Predefine bfloat16_type_node related
macros for -fbuilding-libgcc.
* c-lex.cc (interpret_float): Handle CPP_N_BFLOAT16.
gcc/c/
* c-typeck.cc (convert_arguments): Don't promote __bf16 to
double.
gcc/cp/
* cp-tree.h (extended_float_type_p): Return true for
bfloat16_type_node.
* typeck.cc (cp_compare_floating_point_conversion_ranks): Set
extended{1,2} if mv{1,2} is bfloat16_type_node. Adjust comment.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_bfloat16,
check_effective_target_bfloat16_runtime, add_options_for_bfloat16):
New.
* gcc.dg/torture/bfloat16-basic.c: New test.
* gcc.dg/torture/bfloat16-builtin.c: New test.
* gcc.dg/torture/bfloat16-builtin-issignaling-1.c: New test.
* gcc.dg/torture/bfloat16-complex.c: New test.
* gcc.dg/torture/builtin-issignaling-1.c: Allow to be includable
from bfloat16-builtin-issignaling-1.c.
* gcc.dg/torture/floatn-basic.h: Allow to be includable from
bfloat16-basic.c.
* gcc.target/i386/vect-bfloat16-typecheck_2.c: Adjust expected
diagnostics.
* gcc.target/i386/sse2-bfloat16-scalar-typecheck.c: Likewise.
* gcc.target/i386/vect-bfloat16-typecheck_1.c: Likewise.
* g++.target/i386/bfloat_cpp_typecheck.C: Likewise.
libcpp/
* include/cpplib.h (CPP_N_BFLOAT16): Define.
* expr.cc (interpret_float_suffix): Handle bf16 and BF16 suffixes for
C++.
libgcc/
* config/i386/t-softfp (softfp_extensions): Add bfsf.
(softfp_truncations): Add tfbf xfbf dfbf sfbf hfbf.
(CFLAGS-extendbfsf2.c, CFLAGS-truncsfbf2.c, CFLAGS-truncdfbf2.c,
CFLAGS-truncxfbf2.c, CFLAGS-trunctfbf2.c, CFLAGS-trunchfbf2.c): Add
-msse2.
* config/i386/libgcc-glibc.ver (GCC_13.0.0): Export
__extendbfsf2 and __trunc{s,d,x,t,h}fbf2.
* config/i386/sfp-machine.h (_FP_NANSIGN_B): Define.
* config/i386/64/sfp-machine.h (_FP_NANFRAC_B): Define.
* config/i386/32/sfp-machine.h (_FP_NANFRAC_B): Define.
* soft-fp/brain.h: New file.
* soft-fp/truncsfbf2.c: New file.
* soft-fp/truncdfbf2.c: New file.
* soft-fp/truncxfbf2.c: New file.
* soft-fp/trunctfbf2.c: New file.
* soft-fp/trunchfbf2.c: New file.
* soft-fp/truncbfhf2.c: New file.
* soft-fp/extendbfsf2.c: New file.
libiberty/
* cp-demangle.h (D_BUILTIN_TYPE_COUNT): Increment.
* cp-demangle.c (cplus_demangle_builtin_types): Add std::bfloat16_t
entry.
(cplus_demangle_type): Demangle DF16b.
* testsuite/demangle-expected (_Z3xxxDF16b): New test.
C2x follows C++ in making alignas, alignof, bool, false,
static_assert, thread_local and true keywords; implement this
accordingly. This implementation makes them normal keywords in C2x
mode just like any other keyword (C2x leaves open the possibility of
implementation using predefined macros instead - thus, there aren't
any testcases asserting that they aren't macros). As in C++ and
previous versions of C, true and false are handled like signed 1 and 0
in #if (there was an intermediate state in some C2x drafts where they
had different macro expansions that were unsigned in #if).
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
As with the removal of unprototyped functions, this change has a high
risk of breaking some old code and people doing GNU/Linux distribution
builds may wish to see how much is broken in a build with a -std=gnu2x
default.
gcc/
* ginclude/stdalign.h [defined __STDC_VERSION__ &&
__STDC_VERSION__ > 201710L]: Disable all content.
* ginclude/stdbool.h [defined __STDC_VERSION__ && __STDC_VERSION__
> 201710L] (bool, true, false): Do not define.
gcc/c-family/
* c-common.cc (c_common_reswords): Use D_C2X instead of D_CXXONLY
for alignas, alignof, bool, false, static_assert, thread_local and
true.
gcc/c/
* c-parser.cc (c_parser_static_assert_declaration_no_semi)
(c_parser_alignas_specifier, c_parser_alignof_expression): Allow
for C2x spellings of keywords.
(c_parser_postfix_expression): Handle RID_TRUE and RID_FALSE.
gcc/testsuite/
* gcc.dg/c11-keywords-1.c, gcc.dg/c2x-align-1.c,
gcc.dg/c2x-align-6.c, gcc.dg/c2x-bool-2.c,
gcc.dg/c2x-static-assert-3.c, gcc.dg/c2x-static-assert-4.c,
gcc.dg/c2x-thread-local-1.c: New tests.
* gcc.dg/c2x-bool-1.c: Update expectations.
libcpp/
* include/cpplib.h (struct cpp_options): Add true_false.
* expr.cc (eval_token): Check true_false not cplusplus to
determine whether to handle true and false keywords.
* init.cc (struct lang_flags): Add true_false.
(lang_defaults): Update.
(cpp_set_lang): Set true_false.
On Tue, Aug 30, 2022 at 09:10:37PM +0000, Joseph Myers wrote:
> I'm seeing build failures of glibc for powerpc64, as illustrated by the
> following C code:
>
> #if 0
> \NARG
> #endif
>
> (the actual sysdeps/powerpc/powerpc64/sysdep.h code is inside #ifdef
> __ASSEMBLER__).
>
> This shows some problems with this feature - and with delimited escape
> sequences - as it affects C. It's fine to accept it as an extension
> inside string and character literals, because \N or \u{...} would be
> invalid in the absence of the feature (i.e. the syntax for such literals
> fails to match, meaning that the rule about undefined behavior for a
> single ' or " as a pp-token applies). But outside string and character
> literals, the usual lexing rules apply, the \ is a pp-token on its own and
> the code is valid at the preprocessing level, and with expansion of macros
> appearing before or after the \ (e.g. u defined as a macro in the \u{...}
> case) it may be valid code at the language level as well. I don't know
> what older C++ versions say about this, but for C this means e.g.
>
> #define z(x) 0
> #define a z(
> int x = a\NARG);
>
> needs to be accepted as expanding to "int x = 0;", not interpreted as
> using the \N feature in an identifier and produce an error.
The following patch changes this, so that:
1) outside of string/character literals, \N without following { is never
treated as an error nor warning, it is silently treated as \ separate
token followed by whatever is after it
2) \u{123} and \N{LATIN SMALL LETTER A WITH ACUTE} are not handled as
extension at all outside of string/character literals in the strict
standard modes (-std=c*) except for -std=c++{23,2b}, only in the
-std=gnu* modes, because it changes behavior on valid sources, e.g.
#define z(x) 0
#define a z(
int x = a\u{123});
int y = a\N{LATIN SMALL LETTER A WITH ACUTE});
3) introduces -Wunicode warning (on by default) and warns for cases
of what looks like invalid delimited escape sequence or named
universal character escape outside of string/character literals
and is treated as separate tokens
2022-09-07 Jakub Jelinek <jakub@redhat.com>
libcpp/
* include/cpplib.h (struct cpp_options): Add cpp_warn_unicode member.
(enum cpp_warning_reason): Add CPP_W_UNICODE.
* init.cc (cpp_create_reader): Initialize cpp_warn_unicode.
* charset.cc (_cpp_valid_ucn): In possible identifier contexts, don't
handle \u{ or \N{ specially in -std=c* modes except -std=c++2{3,b}.
In possible identifier contexts, don't emit an error and punt
if \N isn't followed by {, or if \N{} surrounds some lower case
letters or _. In possible identifier contexts when not C++23, don't
emit an error but warning about unknown character names and treat as
separate tokens. When treating as separate tokens \u{ or \N{, emit
warnings.
gcc/
* doc/invoke.texi (-Wno-unicode): Document.
gcc/c-family/
* c.opt (Winvalid-utf8): Use ObjC instead of objC. Remove
" in comments" from description.
(Wunicode): New option.
gcc/testsuite/
* c-c++-common/cpp/delimited-escape-seq-4.c: New test.
* c-c++-common/cpp/delimited-escape-seq-5.c: New test.
* c-c++-common/cpp/delimited-escape-seq-6.c: New test.
* c-c++-common/cpp/delimited-escape-seq-7.c: New test.
* c-c++-common/cpp/named-universal-char-escape-5.c: New test.
* c-c++-common/cpp/named-universal-char-escape-6.c: New test.
* c-c++-common/cpp/named-universal-char-escape-7.c: New test.
* g++.dg/cpp23/named-universal-char-escape1.C: New test.
* g++.dg/cpp23/named-universal-char-escape2.C: New test.
PR c/90885 notes various places in real-world code where people have
written C/C++ code that uses ^ (exclusive or) where presumbably they
meant exponentiation.
For example
https://codesearch.isocpp.org/cgi-bin/cgi_ppsearch?q=2%5E32&search=Search
currently finds 11 places using "2^32", and all of them appear to be
places where the user means 2 to the power of 32, rather than 2
exclusive-orred with 32 (which is 34).
This patch adds a new -Wxor-used-as-pow warning to the C and C++
frontends to complain about ^ when the left-hand side is the decimal
constant 2 or the decimal constant 10.
This is the same name as the corresponding clang warning:
https://clang.llvm.org/docs/DiagnosticsReference.html#wxor-used-as-pow
As per the clang warning, the warning suggests converting the left-hand
side to a hexadecimal constant if you really mean xor, which suppresses
the warning (though this patch implements a fix-it hint for that, whereas
the clang implementation only has a fix-it hint for the initial
suggestion of exponentiation).
I initially tried implementing this without checking for decimals, but
this version had lots of false positives. Checking for decimals
requires extending the lexer to capture whether or not a CPP_NUMBER
token was decimal. I added a new DECIMAL_INT flag to cpplib.h for this.
Unfortunately, c_token and cp_tokens both have only an unsigned char for
their flags (as captured by c_lex_with_flags), whereas this would add
the 12th flag to cpp_tokens. Of the first 8 flags, all but BOL are used
in the C or C++ frontends, but BOL is not, so I moved that to a higher
position, using its old value for the new DECIMAL_INT flag, so that it
is representable within an unsigned char.
Example output:
demo.c:5:13: warning: result of '2^8' is 10; did you mean '1 << 8' (256)? [-Wxor-used-as-pow]
5 | int t2_8 = 2^8;
| ^
| --
| 1<<
demo.c:5:12: note: you can silence this warning by using a hexadecimal constant (0x2 rather than 2)
5 | int t2_8 = 2^8;
| ^
| 0x2
demo.c:21:15: warning: result of '10^6' is 12; did you mean '1e6'? [-Wxor-used-as-pow]
21 | int t10_6 = 10^6;
| ^
| ---
| 1e
demo.c:21:13: note: you can silence this warning by using a hexadecimal constant (0xa rather than 10)
21 | int t10_6 = 10^6;
| ^~
| 0xa
gcc/c-family/ChangeLog:
PR c/90885
* c-common.h (check_for_xor_used_as_pow): New decl.
* c-lex.cc (c_lex_with_flags): Add DECIMAL_INT to flags as appropriate.
* c-warn.cc (check_for_xor_used_as_pow): New.
* c.opt (Wxor-used-as-pow): New.
gcc/c/ChangeLog:
PR c/90885
* c-parser.cc (c_parser_string_literal): Clear ret.m_decimal.
(c_parser_expr_no_commas): Likewise.
(c_parser_conditional_expression): Likewise.
(c_parser_binary_expression): Clear m_decimal when popping the
stack.
(c_parser_unary_expression): Clear ret.m_decimal.
(c_parser_has_attribute_expression): Likewise for result.
(c_parser_predefined_identifier): Likewise for expr.
(c_parser_postfix_expression): Likewise for expr.
Set expr.m_decimal when handling a CPP_NUMBER that was a decimal
token.
* c-tree.h (c_expr::m_decimal): New bitfield.
* c-typeck.cc (parser_build_binary_op): Clear result.m_decimal.
(parser_build_binary_op): Call check_for_xor_used_as_pow.
gcc/cp/ChangeLog:
PR c/90885
* cp-tree.h (class cp_expr): Add bitfield m_decimal. Clear it in
existing ctors. Add ctor that allows specifying its value.
(cp_expr::decimal_p): New accessor.
* parser.cc (cp_parser_expression_stack_entry::flags): New field.
(cp_parser_primary_expression): Set m_decimal of cp_expr when
handling numbers.
(cp_parser_binary_expression): Extract flags from token when
populating stack. Call check_for_xor_used_as_pow.
gcc/ChangeLog:
PR c/90885
* doc/invoke.texi (Warning Options): Add -Wxor-used-as-pow.
gcc/testsuite/ChangeLog:
PR c/90885
* c-c++-common/Wxor-used-as-pow-1.c: New test.
* c-c++-common/Wxor-used-as-pow-fixits.c: New test.
* g++.dg/parse/expr3.C: Convert 2 to 0x2 to suppress
-Wxor-used-as-pow.
* g++.dg/warn/Wparentheses-10.C: Likewise.
* g++.dg/warn/Wparentheses-18.C: Likewise.
* g++.dg/warn/Wparentheses-19.C: Likewise.
* g++.dg/warn/Wparentheses-9.C: Likewise.
* g++.dg/warn/Wxor-used-as-pow-named-op.C: New test.
* gcc.dg/Wparentheses-6.c: Convert 2 to 0x2 to suppress
-Wxor-used-as-pow.
* gcc.dg/Wparentheses-7.c: Likewise.
* gcc.dg/precedence-1.c: Likewise.
libcpp/ChangeLog:
PR c/90885
* include/cpplib.h (BOL): Move macro to 1 << 12 since it is
not used by C/C++'s unsigned char token flags.
(DECIMAL_INT): New, using 1 << 6, so that it is visible as
part of C/C++'s 8 bits of token flags.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
The following patch introduces a new warning - -Winvalid-utf8 similarly
to what clang now has - to diagnose invalid UTF-8 byte sequences in
comments, but not just in those, but also in string/character literals
and outside of them.
The warning is on by default when explicit -finput-charset=UTF-8 is
used and C++23 compilation is requested and if -{,W}pedantic or
-pedantic-errors it is actually a pedwarn.
The reason it is on by default only for -finput-charset=UTF-8 is
that the sources often are UTF-8, but sometimes could be some ASCII
compatible single byte encoding where non-ASCII characters only
appear in comments. So having the warning off by default
is IMO desirable. The C++23 pedantic mode for when the source code
is UTF-8 is -std=c++23 -pedantic-errors -finput-charset=UTF-8.
2022-09-01 Jakub Jelinek <jakub@redhat.com>
PR c++/106655
libcpp/
* include/cpplib.h (struct cpp_options): Implement C++23
P2295R6 - Support for UTF-8 as a portable source file encoding.
Add cpp_warn_invalid_utf8 and cpp_input_charset_explicit fields.
(enum cpp_warning_reason): Add CPP_W_INVALID_UTF8 enumerator.
* init.cc (cpp_create_reader): Initialize cpp_warn_invalid_utf8
and cpp_input_charset_explicit.
* charset.cc (_cpp_valid_utf8): Adjust function comment.
* lex.cc (UCS_LIMIT): Define.
(utf8_continuation): New const variable.
(utf8_signifier): Move earlier in the file.
(_cpp_warn_invalid_utf8, _cpp_handle_multibyte_utf8): New functions.
(_cpp_skip_block_comment): Handle -Winvalid-utf8 warning.
(skip_line_comment): Likewise.
(lex_raw_string, lex_string): Likewise.
(_cpp_lex_direct): Likewise.
gcc/
* doc/invoke.texi (-Winvalid-utf8): Document it.
gcc/c-family/
* c.opt (-Winvalid-utf8): New warning.
* c-opts.cc (c_common_handle_option) <case OPT_finput_charset_>:
Set cpp_opts->cpp_input_charset_explicit.
(c_common_post_options): If -finput-charset=UTF-8 is explicit
in C++23, enable -Winvalid-utf8 by default and if -pedantic
or -pedantic-errors, make it a pedwarn.
gcc/testsuite/
* c-c++-common/cpp/Winvalid-utf8-1.c: New test.
* c-c++-common/cpp/Winvalid-utf8-2.c: New test.
* c-c++-common/cpp/Winvalid-utf8-3.c: New test.
* g++.dg/cpp23/Winvalid-utf8-1.C: New test.
* g++.dg/cpp23/Winvalid-utf8-2.C: New test.
* g++.dg/cpp23/Winvalid-utf8-3.C: New test.
* g++.dg/cpp23/Winvalid-utf8-4.C: New test.
* g++.dg/cpp23/Winvalid-utf8-5.C: New test.
* g++.dg/cpp23/Winvalid-utf8-6.C: New test.
* g++.dg/cpp23/Winvalid-utf8-7.C: New test.
* g++.dg/cpp23/Winvalid-utf8-8.C: New test.
* g++.dg/cpp23/Winvalid-utf8-9.C: New test.
* g++.dg/cpp23/Winvalid-utf8-10.C: New test.
* g++.dg/cpp23/Winvalid-utf8-11.C: New test.
* g++.dg/cpp23/Winvalid-utf8-12.C: New test.
The following patch implements the C++23 P2290R3 paper.
2022-08-20 Jakub Jelinek <jakub@redhat.com>
PR c++/106645
libcpp/
* include/cpplib.h (struct cpp_options): Implement
P2290R3 - Delimited escape sequences. Add delimite_escape_seqs
member.
* init.cc (struct lang_flags): Likewise.
(lang_defaults): Add delim column.
(cpp_set_lang): Copy over delimite_escape_seqs.
* charset.cc (extend_char_range): New function.
(_cpp_valid_ucn): Use it. Handle delimited escape sequences.
(convert_hex): Likewise.
(convert_oct): Likewise.
(convert_ucn): Use extend_char_range.
(convert_escape): Call convert_oct even for \o.
(_cpp_interpret_identifier): Handle delimited escape sequences.
* lex.cc (get_bidi_ucn_1): Likewise. Add end argument, fill it in.
(get_bidi_ucn): Adjust get_bidi_ucn_1 caller. Use end argument to
compute num_bytes.
gcc/testsuite/
* c-c++-common/cpp/delimited-escape-seq-1.c: New test.
* c-c++-common/cpp/delimited-escape-seq-2.c: New test.
* c-c++-common/cpp/delimited-escape-seq-3.c: New test.
* c-c++-common/Wbidi-chars-24.c: New test.
* gcc.dg/cpp/delimited-escape-seq-1.c: New test.
* gcc.dg/cpp/delimited-escape-seq-2.c: New test.
* g++.dg/cpp/delimited-escape-seq-1.C: New test.
* g++.dg/cpp/delimited-escape-seq-2.C: New test.
ISO C2x standardizes the existing #warning extension. Arrange
accordingly for it not to be diagnosed with -std=c2x -pedantic, but to
be diagnosed with -Wc11-c2x-compat.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
gcc/testsuite/
* gcc.dg/cpp/c11-warning-1.c, gcc.dg/cpp/c11-warning-2.c,
gcc.dg/cpp/c11-warning-3.c, gcc.dg/cpp/c11-warning-4.c,
gcc.dg/cpp/c2x-warning-1.c, gcc.dg/cpp/c2x-warning-2.c,
gcc.dg/cpp/gnu11-warning-1.c, gcc.dg/cpp/gnu11-warning-2.c,
gcc.dg/cpp/gnu11-warning-3.c, gcc.dg/cpp/gnu11-warning-4.c,
gcc.dg/cpp/gnu2x-warning-1.c, gcc.dg/cpp/gnu2x-warning-2.c: New
tests.
libcpp/
* include/cpplib.h (struct cpp_options): Add warning_directive.
* init.cc (struct lang_flags, lang_defaults): Add
warning_directive.
* directives.cc (DIRECTIVE_TABLE): Mark #warning as STDC2X not
EXTENSION.
(directive_diagnostics): Diagnose #warning with -Wc11-c2x-compat,
or with -pedantic for a standard not supporting #warning.
Gcc's '#pragma GCC diagnostic' directives are processed in "early mode"
(see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
require that the target diagnostic option be enabled for the preprocessor
(see c_option_is_from_cpp_diagnostics). This change modifies the
-Wc++20-compat option definition to register it as a preprocessor option
so that its associated diagnostics can be suppressed. The changes also
implicitly disable the option in C++20 and later modes. These changes
are consistent with the definition of the -Wc++11-compat option.
This support is motivated by the need to suppress the following diagnostic
otherwise issued in C++17 and earlier modes due to the char8_t typedef
present in the uchar.h header file in glibc 2.36.
warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat]
Tests are added to validate suppression of both -Wc++11-compat and
-Wc++20-compat related diagnostics (fixes were only needed for the C++20
case).
PR c++/106423
gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Disable -Wc++20-compat
diagnostics in C++20 and later.
* c.opt (Wc++20-compat): Enable hooks for the preprocessor.
gcc/cp/ChangeLog:
* parser.cc (cp_lexer_saving_tokens): Add comment regarding
diagnostic requirements.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/keywords2.C: New test.
* g++.dg/cpp2a/keywords2.C: New test.
libcpp/ChangeLog:
* include/cpplib.h (cpp_warning_reason): Add CPP_W_CXX20_COMPAT.
* init.cc (cpp_create_reader): Add cpp_warn_cxx20_compat.
This patch corrects handling of UTF-8 character literals in preprocessing
directives so that they are treated as unsigned types in char8_t enabled
C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously,
UTF-8 character literals were always treated as having the same type as
ordinary character literals (signed or unsigned dependent on target or use
of the -fsigned-char or -funsigned char options).
PR preprocessor/106426
gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Assign cpp_opts->unsigned_utf8char
subject to -fchar8_t, -fsigned-char, and/or -funsigned-char.
gcc/testsuite/ChangeLog:
* g++.dg/ext/char8_t-char-literal-1.C: Check signedness of u8 literals.
* g++.dg/ext/char8_t-char-literal-2.C: Check signedness of u8 literals.
libcpp/ChangeLog:
* charset.cc (narrow_str_to_charconst): Set signedness of CPP_UTF8CHAR
literals based on unsigned_utf8char.
* include/cpplib.h (cpp_options): Add unsigned_utf8char.
* init.cc (cpp_create_reader): Initialize unsigned_utf8char.
Stephan Bergmann reported that our -Wbidi-chars breaks the build
of LibreOffice because we warn about UCNs even when their usage
is correct: LibreOffice constructs strings piecewise, as in:
aText = u"\u202D" + aText;
and warning about that is overzealous. Since no editor (AFAIK)
interprets UCNs to show them as Unicode characters, there's less
risk in misinterpreting them, and so perhaps we shouldn't warn
about them by default. However, identifiers containing UCNs or
programs generating other programs could still cause confusion,
so I'm keeping the UCN checking. To turn it on, you just need
to use -Wbidi-chars=unpaired,ucn or -Wbidi-chars=any,ucn.
The implementation is done by using the new EnumSet feature.
PR preprocessor/104030
gcc/c-family/ChangeLog:
* c.opt (Wbidi-chars): Mark as EnumSet. Also accept =ucn.
gcc/ChangeLog:
* doc/invoke.texi: Update documentation for -Wbidi-chars.
libcpp/ChangeLog:
* include/cpplib.h (enum cpp_bidirectional_level): Add
bidirectional_ucn. Set values explicitly.
* internal.h (cpp_reader): Adjust warn_bidi_p.
* lex.cc (maybe_warn_bidi_on_close): Don't warn about UCNs
unless UCN checking is on.
(maybe_warn_bidi_on_char): Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/Wbidi-chars-10.c: Turn on UCN checking.
* c-c++-common/Wbidi-chars-11.c: Likewise.
* c-c++-common/Wbidi-chars-14.c: Likewise.
* c-c++-common/Wbidi-chars-16.c: Likewise.
* c-c++-common/Wbidi-chars-17.c: Likewise.
* c-c++-common/Wbidi-chars-4.c: Likewise.
* c-c++-common/Wbidi-chars-5.c: Likewise.
* c-c++-common/Wbidi-chars-6.c: Likewise.
* c-c++-common/Wbidi-chars-7.c: Likewise.
* c-c++-common/Wbidi-chars-8.c: Likewise.
* c-c++-common/Wbidi-chars-9.c: Likewise.
* c-c++-common/Wbidi-chars-ranges.c: Likewise.
* c-c++-common/Wbidi-chars-18.c: New test.
* c-c++-common/Wbidi-chars-19.c: New test.
* c-c++-common/Wbidi-chars-20.c: New test.
* c-c++-common/Wbidi-chars-21.c: New test.
* c-c++-common/Wbidi-chars-22.c: New test.
* c-c++-common/Wbidi-chars-23.c: New test.
On AIX, stat will store inodes in 32bit even when using LARGE_FILES.
If the inode is larger, it will return -1 in st_ino.
Thus, in incpath.c when comparing include directories, if several
of them have 64bit inodes, they will be considered as duplicated.
gcc/ChangeLog:
2022-01-12 Clément Chigot <clement.chigot@atos.net>
* configure.ac: Check sizeof ino_t and dev_t.
(HOST_STAT_FOR_64BIT_INODES): New AC_DEFINE to provide stat
syscall being able to handle 64bit inodes.
* config.in: Regenerate.
* configure: Regenerate.
* incpath.c (HOST_STAT_FOR_64BIT_INODES): New define.
(remove_duplicates): Use it.
libcpp/ChangeLog:
2022-01-12 Clément Chigot <clement.chigot@atos.net>
* configure.ac: Check sizeof ino_t and dev_t.
* config.in: Regenerate.
* configure: Regenerate.
* include/cpplib.h (INO_T_CPP): Change for AIX.
(DEV_T_CPP): New macro.
(struct cpp_dir): Use it.
On Mon, Nov 29, 2021 at 05:53:58PM -0500, Jason Merrill wrote:
> I'm inclined to go ahead and change C++98 as well; I doubt anyone is relying
> on the particular C++98 extended character set rules, and we already accept
> the union of the different sets when not pedantic.
Ok, here is an incremental patch to do that also for -std={c,gnu}++98.
2021-12-01 Jakub Jelinek <jakub@redhat.com>
PR c++/100977
* init.c (struct lang_flags): Remove cxx23_identifiers.
(lang_defaults): Remove cxx23_identifiers initializers.
(cpp_set_lang): Don't copy cxx23_identifiers.
* include/cpplib.h (struct cpp_options): Adjust comment about
c11_identifiers. Remove cxx23_identifiers field.
* lex.c (warn_about_normalization): Use cplusplus instead of
cxx23_identifiers.
* charset.c (ucn_valid_in_identifier): Likewise.
* g++.dg/cpp/ucnid-1.C: Adjust expected diagnostics.
* g++.dg/cpp/ucnid-1-utf8.C: Likewise.
From a link below:
"An issue was discovered in the Bidirectional Algorithm in the Unicode
Specification through 14.0. It permits the visual reordering of
characters via control sequences, which can be used to craft source code
that renders different logic than the logical ordering of tokens
ingested by compilers and interpreters. Adversaries can leverage this to
encode source code for compilers accepting Unicode such that targeted
vulnerabilities are introduced invisibly to human reviewers."
More info:
https://nvd.nist.gov/vuln/detail/CVE-2021-42574https://trojansource.codes/
This is not a compiler bug. However, to mitigate the problem, this patch
implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
misleading Unicode bidirectional control characters the preprocessor may
encounter.
The default is =unpaired, which warns about improperly terminated
bidirectional control characters; e.g. a LRE without its corresponding PDF.
The level =any warns about any use of bidirectional control characters.
This patch handles both UCNs and UTF-8 characters. UCNs designating
bidi characters in identifiers are accepted since r204886. Then r217144
enabled -fextended-identifiers by default. Extended characters in C/C++
identifiers have been accepted since r275979. However, this patch still
warns about mixing UTF-8 and UCN bidi characters; there seems to be no
good reason to allow mixing them.
We warn in different contexts: comments (both C and C++-style), string
literals, character constants, and identifiers. Expectedly, UCNs are ignored
in comments and raw string literals. The bidirectional control characters
can nest so this patch handles that as well.
I have not included nor tested this at all with Fortran (which also has
string literals and line comments).
Dave M. posted patches improving diagnostic involving Unicode characters.
This patch does not make use of this new infrastructure yet.
PR preprocessor/103026
gcc/c-family/ChangeLog:
* c.opt (Wbidi-chars, Wbidi-chars=): New option.
gcc/ChangeLog:
* doc/invoke.texi: Document -Wbidi-chars.
libcpp/ChangeLog:
* include/cpplib.h (enum cpp_bidirectional_level): New.
(struct cpp_options): Add cpp_warn_bidirectional.
(enum cpp_warning_reason): Add CPP_W_BIDIRECTIONAL.
* internal.h (struct cpp_reader): Add warn_bidi_p member
function.
* init.c (cpp_create_reader): Set cpp_warn_bidirectional.
* lex.c (bidi): New namespace.
(get_bidi_utf8): New function.
(get_bidi_ucn): Likewise.
(maybe_warn_bidi_on_close): Likewise.
(maybe_warn_bidi_on_char): Likewise.
(_cpp_skip_block_comment): Implement warning about bidirectional
control characters.
(skip_line_comment): Likewise.
(forms_identifier_p): Likewise.
(lex_identifier): Likewise.
(lex_string): Likewise.
(lex_raw_string): Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/Wbidi-chars-1.c: New test.
* c-c++-common/Wbidi-chars-2.c: New test.
* c-c++-common/Wbidi-chars-3.c: New test.
* c-c++-common/Wbidi-chars-4.c: New test.
* c-c++-common/Wbidi-chars-5.c: New test.
* c-c++-common/Wbidi-chars-6.c: New test.
* c-c++-common/Wbidi-chars-7.c: New test.
* c-c++-common/Wbidi-chars-8.c: New test.
* c-c++-common/Wbidi-chars-9.c: New test.
* c-c++-common/Wbidi-chars-10.c: New test.
* c-c++-common/Wbidi-chars-11.c: New test.
* c-c++-common/Wbidi-chars-12.c: New test.
* c-c++-common/Wbidi-chars-13.c: New test.
* c-c++-common/Wbidi-chars-14.c: New test.
* c-c++-common/Wbidi-chars-15.c: New test.
* c-c++-common/Wbidi-chars-16.c: New test.
* c-c++-common/Wbidi-chars-17.c: New test.
This patch adds support to GCC's diagnostic subsystem for escaping certain
bytes and Unicode characters when quoting source code.
Specifically, this patch adds a new flag rich_location::m_escape_on_output
which is a hint from a diagnostic that non-ASCII bytes in the pertinent
lines of the user's source code should be escaped when printed.
The patch sets this for the following diagnostics:
- when complaining about stray bytes in the program (when these
are non-printable)
- when complaining about "null character(s) ignored");
- for -Wnormalized= (and generate source ranges for such warnings)
The escaping is controlled by a new option:
-fdiagnostics-escape-format=[unicode|bytes]
For example, consider a diagnostic involing a source line containing the
string "before" followed by the Unicode character U+03C0 ("GREEK SMALL
LETTER PI", with UTF-8 encoding 0xCF 0x80) followed by the byte 0xBF
(a stray UTF-8 trailing byte), followed by the string "after", where the
diagnostic highlights the U+03C0 character.
By default, this line will be printed verbatim to the user when
reporting a diagnostic at it, as:
beforeπXafter
^
(using X for the stray byte to avoid putting invalid UTF-8 in this
commit message)
If the diagnostic sets the "escape" flag, it will be printed as:
before<U+03C0><BF>after
^~~~~~~~
with -fdiagnostics-escape-format=unicode (the default), or as:
before<CF><80><BF>after
^~~~~~~~
if the user supplies -fdiagnostics-escape-format=bytes.
This only affects how the source is printed; it does not affect
how column numbers that are printed (as per -fdiagnostics-column-unit=
and -fdiagnostics-column-origin=).
gcc/c-family/ChangeLog:
* c-lex.c (c_lex_with_flags): When complaining about non-printable
CPP_OTHER tokens, set the "escape on output" flag.
gcc/ChangeLog:
* common.opt (fdiagnostics-escape-format=): New.
(diagnostics_escape_format): New enum.
(DIAGNOSTICS_ESCAPE_FORMAT_UNICODE): New enum value.
(DIAGNOSTICS_ESCAPE_FORMAT_BYTES): Likewise.
* diagnostic-format-json.cc (json_end_diagnostic): Add
"escape-source" attribute.
* diagnostic-show-locus.c
(exploc_with_display_col::exploc_with_display_col): Replace
"tabstop" param with a cpp_char_column_policy and add an "aspect"
param. Use these to compute m_display_col accordingly.
(struct char_display_policy): New struct.
(layout::m_policy): New field.
(layout::m_escape_on_output): New field.
(def_policy): New function.
(make_range): Update for changes to exploc_with_display_col ctor.
(default_print_decoded_ch): New.
(width_per_escaped_byte): New.
(escape_as_bytes_width): New.
(escape_as_bytes_print): New.
(escape_as_unicode_width): New.
(escape_as_unicode_print): New.
(make_policy): New.
(layout::layout): Initialize new fields. Update m_exploc ctor
call for above change to ctor.
(layout::maybe_add_location_range): Update for changes to
exploc_with_display_col ctor.
(layout::calculate_x_offset_display): Update for change to
cpp_display_width.
(layout::print_source_line): Pass policy
to cpp_display_width_computation. Capture cpp_decoded_char when
calling process_next_codepoint. Move printing of source code to
m_policy.m_print_cb.
(line_label::line_label): Pass in policy rather than context.
(layout::print_any_labels): Update for change to line_label ctor.
(get_affected_range): Pass in policy rather than context, updating
calls to location_compute_display_column accordingly.
(get_printed_columns): Likewise, also for cpp_display_width.
(correction::correction): Pass in policy rather than tabstop.
(correction::compute_display_cols): Pass m_policy rather than
m_tabstop to cpp_display_width.
(correction::m_tabstop): Replace with...
(correction::m_policy): ...this.
(line_corrections::line_corrections): Pass in policy rather than
context.
(line_corrections::m_context): Replace with...
(line_corrections::m_policy): ...this.
(line_corrections::add_hint): Update to use m_policy rather than
m_context.
(line_corrections::add_hint): Likewise.
(layout::print_trailing_fixits): Likewise.
(selftest::test_display_widths): New.
(selftest::test_layout_x_offset_display_utf8): Update to use
policy rather than tabstop.
(selftest::test_one_liner_labels_utf8): Add test of escaping
source lines.
(selftest::test_diagnostic_show_locus_one_liner_utf8): Update to
use policy rather than tabstop.
(selftest::test_overlapped_fixit_printing): Likewise.
(selftest::test_overlapped_fixit_printing_utf8): Likewise.
(selftest::test_overlapped_fixit_printing_2): Likewise.
(selftest::test_tab_expansion): Likewise.
(selftest::test_escaping_bytes_1): New.
(selftest::test_escaping_bytes_2): New.
(selftest::diagnostic_show_locus_c_tests): Call the new tests.
* diagnostic.c (diagnostic_initialize): Initialize
context->escape_format.
(convert_column_unit): Update to use default character width policy.
(selftest::test_diagnostic_get_location_text): Likewise.
* diagnostic.h (enum diagnostics_escape_format): New enum.
(diagnostic_context::escape_format): New field.
* doc/invoke.texi (-fdiagnostics-escape-format=): New option.
(-fdiagnostics-format=): Add "escape-source" attribute to examples
of JSON output, and document it.
* input.c (location_compute_display_column): Pass in "policy"
rather than "tabstop", passing to
cpp_byte_column_to_display_column.
(selftest::test_cpp_utf8): Update to use cpp_char_column_policy.
* input.h (class cpp_char_column_policy): New forward decl.
(location_compute_display_column): Pass in "policy" rather than
"tabstop".
* opts.c (common_handle_option): Handle
OPT_fdiagnostics_escape_format_.
* selftest.c (temp_source_file::temp_source_file): New ctor
overload taking a size_t.
* selftest.h (temp_source_file::temp_source_file): Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/diagnostic-format-json-1.c: Add regexp to consume
"escape-source" attribute.
* c-c++-common/diagnostic-format-json-2.c: Likewise.
* c-c++-common/diagnostic-format-json-3.c: Likewise.
* c-c++-common/diagnostic-format-json-4.c: Likewise, twice.
* c-c++-common/diagnostic-format-json-5.c: Likewise.
* gcc.dg/cpp/warn-normalized-4-bytes.c: New test.
* gcc.dg/cpp/warn-normalized-4-unicode.c: New test.
* gcc.dg/encoding-issues-bytes.c: New test.
* gcc.dg/encoding-issues-unicode.c: New test.
* gfortran.dg/diagnostic-format-json-1.F90: Add regexp to consume
"escape-source" attribute.
* gfortran.dg/diagnostic-format-json-2.F90: Likewise.
* gfortran.dg/diagnostic-format-json-3.F90: Likewise.
libcpp/ChangeLog:
* charset.c (convert_escape): Use encoding_rich_location when
complaining about nonprintable unknown escape sequences.
(cpp_display_width_computation::::cpp_display_width_computation):
Pass in policy rather than tabstop.
(cpp_display_width_computation::process_next_codepoint): Add "out"
param and populate *out if non-NULL.
(cpp_display_width_computation::advance_display_cols): Pass NULL
to process_next_codepoint.
(cpp_byte_column_to_display_column): Pass in policy rather than
tabstop. Pass NULL to process_next_codepoint.
(cpp_display_column_to_byte_column): Pass in policy rather than
tabstop.
* errors.c (cpp_diagnostic_get_current_location): New function,
splitting out the logic from...
(cpp_diagnostic): ...here.
(cpp_warning_at): New function.
(cpp_pedwarning_at): New function.
* include/cpplib.h (cpp_warning_at): New decl for rich_location.
(cpp_pedwarning_at): Likewise.
(struct cpp_decoded_char): New.
(struct cpp_char_column_policy): New.
(cpp_display_width_computation::cpp_display_width_computation):
Replace "tabstop" param with "policy".
(cpp_display_width_computation::process_next_codepoint): Add "out"
param.
(cpp_display_width_computation::m_tabstop): Replace with...
(cpp_display_width_computation::m_policy): ...this.
(cpp_byte_column_to_display_column): Replace "tabstop" param with
"policy".
(cpp_display_width): Likewise.
(cpp_display_column_to_byte_column): Likewise.
* include/line-map.h (rich_location::escape_on_output_p): New.
(rich_location::set_escape_on_output): New.
(rich_location::m_escape_on_output): New.
* internal.h (cpp_diagnostic_get_current_location): New decl.
(class encoding_rich_location): New.
* lex.c (skip_whitespace): Use encoding_rich_location when
complaining about null characters.
(warn_about_normalization): Generate a source range when
complaining about improperly normalized tokens, rather than just a
point, and use encoding_rich_location so that the source code
is escaped on printing.
* line-map.c (rich_location::rich_location): Initialize
m_escape_on_output.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Both #pragma and _Pragma ended up as CPP_PRAGMA. Presumably since
r131819 (2008, GCC 4.3) for PR34692, pragmas are not expanded in
macro arguments but are output as is before. From the old bug report,
that was to fix usage like
FOO (
#pragma GCC diagnostic
)
However, that change also affected _Pragma such that
BAR (
"1";
_Pragma("omp ..."); )
yielded
#pragma omp ...
followed by what BAR expanded too, possibly including '"1";'.
This commit adds a flag, PRAGMA_OP, to tokens to make the two
distinguishable - and include again _Pragma in the expanded arguments.
libcpp/ChangeLog:
PR c++/102409
* directives.c (destringize_and_run): Add PRAGMA_OP to the
CPP_PRAGMA token's flags to mark is as coming from _Pragma.
* include/cpplib.h (PRAGMA_OP): #define, to be used with token flags.
* macro.c (collect_args): Only handle CPP_PRAGMA special if PRAGMA_OP
is set.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/pragma-1.c: New test.
* c-c++-common/gomp/pragma-2.c: New test.
The following patch implements the
P1949R7 - C++ Identifier Syntax using Unicode Standard Annex 31
paper. We already allow UTF-8 characters in the source, so that part
is already implemented, so IMHO all we need to do is pedwarn instead of
just warn for the (default) -Wnormalize=nfc (or for -Wnormalize={id,nkfc})
if the character is not in NFC and to use the unicode XID_Start and
XID_Continue derived code properties to find out what characters are allowed
(the standard actually adds U+005F to XID_Start, but we are handling the
ASCII compatible characters differently already and they aren't allowed
in UCNs in identifiers). Instead of hardcoding the large tables
in ucnid.tab, this patch makes makeucnid.c read them from the Unicode
tables (13.0.0 version at this point).
For non-pedantic mode, we accept as 2nd+ char in identifiers a union
of valid characters in all supported modes, but for the 1st char it
was actually pedantically requiring that it is not any of the characters
that may not appear in the currently chosen standard as the first character.
This patch changes it such that also what is allowed at the start of an
identifier is a union of characters valid at the start of an identifier
in any of the pedantic modes.
2021-09-01 Jakub Jelinek <jakub@redhat.com>
PR c++/100977
libcpp/
* include/cpplib.h (struct cpp_options): Add cxx23_identifiers.
* charset.c (CXX23, NXX23): New enumerators.
(CID, NFC, NKC, CTX): Renumber.
(ucn_valid_in_identifier): Implement P1949R7 - use CXX23 and
NXX23 flags for cxx23_identifiers. For start character in
non-pedantic mode, allow characters that are allowed as start
characters in any of the supported language modes, rather than
disallowing characters allowed only as non-start characters in
current mode but for characters from other language modes allowing
them even if they are never allowed at start.
* init.c (struct lang_flags): Add cxx23_identifiers.
(lang_defaults): Add cxx23_identifiers column.
(cpp_set_lang): Initialize CPP_OPTION (pfile, cxx23_identifiers).
* lex.c (warn_about_normalization): If cxx23_identifiers, use
cpp_pedwarning_with_line instead of cpp_warning_with_line for
"is not in NFC" diagnostics.
* makeucnid.c: Adjust usage comment.
(CXX23, NXX23): New enumerators.
(all_languages): Add CXX23.
(not_NFC, not_NFKC, maybe_not_NFC): Renumber.
(read_derivedcore): New function.
(write_table): Print also CXX23 and NXX23 columns.
(main): Require 5 arguments instead of 4, call read_derivedcore.
* ucnid.h: Regenerated using Unicode 13.0.0 files.
gcc/testsuite/
* g++.dg/cpp23/normalize1.C: New test.
* g++.dg/cpp23/normalize2.C: New test.
* g++.dg/cpp23/normalize3.C: New test.
* g++.dg/cpp23/normalize4.C: New test.
* g++.dg/cpp23/normalize5.C: New test.
* g++.dg/cpp23/normalize6.C: New test.
* g++.dg/cpp23/normalize7.C: New test.
* g++.dg/cpp23/ucnid-1-utf8.C: New test.
* g++.dg/cpp23/ucnid-2-utf8.C: New test.
* gcc.dg/cpp/ucnid-4.c: Don't expect
"not valid at the start of an identifier" errors.
* gcc.dg/cpp/ucnid-4-utf8.c: Likewise.
* gcc.dg/cpp/ucnid-5-utf8.c: New test.
Adds the logic to handle -finput-charset in layout_get_source_line(), so that
source lines are converted from their input encodings prior to being output by
diagnostics machinery. Also adds the ability to strip a UTF-8 BOM similarly.
gcc/c-family/ChangeLog:
PR other/93067
* c-opts.c (c_common_input_charset_cb): New function.
(c_common_post_options): Call new function
diagnostic_initialize_input_context().
gcc/d/ChangeLog:
PR other/93067
* d-lang.cc (d_input_charset_callback): New function.
(d_init): Call new function
diagnostic_initialize_input_context().
gcc/fortran/ChangeLog:
PR other/93067
* cpp.c (gfc_cpp_post_options): Call new function
diagnostic_initialize_input_context().
gcc/ChangeLog:
PR other/93067
* coretypes.h (typedef diagnostic_input_charset_callback): Declare.
* diagnostic.c (diagnostic_initialize_input_context): New function.
* diagnostic.h (diagnostic_initialize_input_context): Declare.
* input.c (default_charset_callback): New function.
(file_cache::initialize_input_context): New function.
(file_cache_slot::create): Added ability to convert the input
according to the input context.
(file_cache::file_cache): Initialize the new input context.
(class file_cache_slot): Added new m_alloc_offset member.
(file_cache_slot::file_cache_slot): Initialize the new member.
(file_cache_slot::~file_cache_slot): Handle potentially offset buffer.
(file_cache_slot::maybe_grow): Likewise.
(file_cache_slot::needs_read_p): Handle NULL fp, which is now possible.
(file_cache_slot::get_next_line): Likewise.
* input.h (class file_cache): Added input context member.
libcpp/ChangeLog:
PR other/93067
* charset.c (init_iconv_desc): Adapt to permit PFILE argument to
be NULL.
(_cpp_convert_input): Likewise. Also move UTF-8 BOM logic to...
(cpp_check_utf8_bom): ...here. New function.
(cpp_input_conversion_is_trivial): New function.
* files.c (read_file_guts): Allow PFILE argument to be NULL. Add
INPUT_CHARSET argument as an alternate source of this information.
(read_file): Pass the new argument to read_file_guts.
(cpp_get_converted_source): New function.
* include/cpplib.h (struct cpp_converted_source): Declare.
(cpp_get_converted_source): Declare.
(cpp_input_conversion_is_trivial): Declare.
(cpp_check_utf8_bom): Declare.
gcc/testsuite/ChangeLog:
PR other/93067
* gcc.dg/diagnostic-input-charset-1.c: New test.
* gcc.dg/diagnostic-input-utf8-bom.c: New test.
The toolchain provided by ST for stm32 has had support for
__FILENAME__ for a while, but clang/llvm has recently implemented
support for __FILE_NAME__, so it seems better to use the same macro
name in GCC.
It happens that the ST patch is similar to the one proposed in PR
c/42579.
Given these input files:
::::::::::::::
mydir/myinc.h
::::::::::::::
char* mystringh_file = __FILE__;
char* mystringh_filename = __FILE_NAME__;
char* mystringh_base_file = __BASE_FILE__;
::::::::::::::
mydir/mysrc.c
::::::::::::::
char* mystring_file = __FILE__;
char* mystring_filename = __FILE_NAME__;
char* mystring_base_file = __BASE_FILE__;
we produce:
$ gcc mydir/mysrc.c -I . -E
char* mystringh_file = "./mydir/myinc.h";
char* mystringh_filename = "myinc.h";
char* mystringh_base_file = "mydir/mysrc.c";
char* mystring_file = "mydir/mysrc.c";
char* mystring_filename = "mysrc.c";
char* mystring_base_file = "mydir/mysrc.c";
2021-05-20 Christophe Lyon <christophe.lyon@linaro.org>
Torbjörn Svensson <torbjorn.svensson@st.com>
PR c/42579
libcpp/
* include/cpplib.h (cpp_builtin_type): Add BT_FILE_NAME entry.
* init.c (builtin_array): Likewise.
* macro.c (_cpp_builtin_macro_text): Add support for BT_FILE_NAME.
gcc/
* doc/cpp.texi (Common Predefined Macros): Document __FILE_NAME__.
gcc/testsuite/
* c-c++-common/spellcheck-reserved.c: Add tests for __FILE_NAME__.
* c-c++-common/cpp/file-name-1.c: New test.
C2X adds #elifdef and #elifndef preprocessor directives; these have
also been proposed for C++. Implement these directives in libcpp
accordingly.
In this implementation, #elifdef and #elifndef are treated as
non-directives for any language version other than c2x and gnu2x (if
the feature is accepted for C++, it can trivially be enabled for
relevant C++ versions). In strict conformance modes for prior
language versions, this is required, as illustrated by the
c11-elifdef-1.c test added.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
libcpp/
* include/cpplib.h (struct cpp_options): Add elifdef.
* init.c (struct lang_flags): Add elifdef.
(lang_defaults): Update to include elifdef initializers.
(cpp_set_lang): Set elifdef for pfile based on language.
* directives.c (STDC2X, ELIFDEF): New macros.
(EXTENSION): Increase value to 3.
(DIRECTIVE_TABLE): Add #elifdef and #elifndef.
(_cpp_handle_directive): Do not treat ELIFDEF directives as
directives for language versions without the #elifdef feature.
(do_elif): Handle #elifdef and #elifndef.
(do_elifdef, do_elifndef): New functions.
gcc/testsuite/
* gcc.dg/cpp/c11-elifdef-1.c, gcc.dg/cpp/c2x-elifdef-1.c,
gcc.dg/cpp/c2x-elifdef-2.c: New tests.
This defect really required building header-units and include translation
of pieces of the standard library. This adds smarts to the modules
test harness to do that -- accept .X files as the source file, but
provide '-x c++-system-header $HDR' in the options. The .X file will
be considered by the driver to be a linker script and ignored (with a
warning).
Using this we can add 2 tests that end up building list_initializer
and iostream, along with a test that iostream's build
include-translates list_initializer's #include. That discovered a set
of issues with the -flang-info-include-translate=HDR handling, also
fixed and documented here.
PR c++/99023
gcc/cp/
* module.cc (canonicalize_header_name): Use
cpp_probe_header_unit.
(maybe_translate_include): Fix note_includes comparison.
(init_modules): Fix note_includes string termination.
libcpp/
* include/cpplib.h (cpp_find_header_unit): Rename to ...
(cpp_probe_header_unit): ... this.
* internal.h (_cp_find_header_unit): Declare.
* files.c (cpp_find_header_unit): Break apart to ..
(test_header_unit): ... this, and ...
(_cpp_find_header_unit): ... and, or and ...
(cpp_probe_header_unit): ... this.
* macro.c (cpp_get_token_1): Call _cpp_find_header_unit.
gcc/
* doc/invoke.texi (flang-info-include-translate): Document header
lookup behaviour.
gcc/testsuite/
* g++.dg/modules/modules.exp: Bail on cross-testing. Add support
for .X files.
* g++.dg/modules/pr99023_a.X: New.
* g++.dg/modules/pr99023_b.X: New.
Integer literal suffixes for signed size ('z') and unsigned size
(some permutation od 'zu') are provided as a language addition.
gcc/c-family/ChangeLog:
* c-cppbuiltin.c (c_cpp_builtins): Define __cpp_size_t_suffix.
* c-lex.c (interpret_integer): Set node type for size literal.
libcpp/ChangeLog:
* expr.c (interpret_int_suffix): Detect 'z' integer suffix.
(cpp_classify_number): Compat warning for use of 'z' suffix.
* include/cpplib.h (struct cpp_options): New flag.
(enum cpp_warning_reason): New flag.
(CPP_N_USERDEF): Comment C++0x -> C++11.
(CPP_N_SIZE_T): New flag for cpp_classify_number.
* init.c (cpp_set_lang): Initialize new flag.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/udlit-shadow-neg.C: Test for 'z' and 'zu' shadowing.
* g++.dg/cpp23/feat-cxx2b.C: New test.
* g++.dg/cpp23/size_t-literals.C: New test.
* g++.dg/warn/Wsize_t-literals.C: New test.
Derived from the changes that added C++2a support in 2017.
r8-3237-g026a79f70cf33f836ea5275eda72d4870a3041e5
No C++23 features are added here.
Use of -std=c++23 sets __cplusplus to 202100L.
$ g++ -std=c++23 -dM -E -x c++ - < /dev/null | grep cplusplus
#define __cplusplus 202100L
gcc/
* doc/cpp.texi (__cplusplus): Document value for -std=c++23
or -std=gnu++23.
* doc/invoke.texi: Document -std=c++23 and -std=gnu++23.
* dwarf2out.c (highest_c_language): Recognise C++20 and C++23.
(gen_compile_unit_die): Recognise C++23.
gcc/c-family/
* c-common.h (cxx_dialect): Add cxx23 as a dialect.
* c.opt: Add options for -std=c++23, std=c++2b, -std=gnu++23
and -std=gnu++2b
* c-opts.c (set_std_cxx23): New.
(c_common_handle_option): Set options when -std=c++23 is enabled.
(c_common_post_options): Adjust comments.
(set_std_cxx20): Likewise.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_c++2a):
Check for C++2a or C++23.
(check_effective_target_c++20_down): New.
(check_effective_target_c++23_only): New.
(check_effective_target_c++23): New.
* g++.dg/cpp23/cplusplus.C: New.
libcpp/
* include/cpplib.h (c_lang): Add CXX23 and GNUCXX23.
* init.c (lang_defaults): Add rows for CXX23 and GNUCXX23.
(cpp_init_builtins): Set __cplusplus to 202100L for C++23.
For deferred macros we also need a new field on the macro itself, so
that the module machinery can determine the macro was imported. Also
the documentation for the hashnode's deferred field was incomplete.
libcpp/
* include/cpplib.h (struct cpp_macro): Add imported_p field.
(struct cpp_hashnode): Tweak deferred field documentation.
* macro.c (_cpp_new_macro): Clear new field.
(cpp_get_deferred_macro, get_deferred_or_lazy_macro): Assert
more.
Deferred macros are needed for C++ modules. Header units may export
macro definitions and undefinitions. These are resolved lazily at the
point of (potential) use. (The language specifies that, it's not just
a useful optimization.) Thus, identifier nodes grow a 'deferred'
field, which fortunately doesn't expand the structure on 64-bit
systems as there was padding there. This is non-zero on NT_MACRO
nodes, if the macro is deferred. When such an identifier is lexed, it
is resolved via a callback that I added recently. That will either
provide the macro definition, or discover it there was an overriding
undef. Either way the identifier is no longer a deferred macro.
Notice it is now possible for NT_MACRO nodes to have a NULL macro
expansion.
libcpp/
* include/cpplib.h (struct cpp_hashnode): Add deferred field.
(cpp_set_deferred_macro): Define.
(cpp_get_deferred_macro): Declare.
(cpp_macro_definition): Reformat, add overload.
(cpp_macro_definition_location): Deal with deferred macro.
(cpp_alloc_token_string, cpp_compare_macro): Declare.
* internal.h (_cpp_notify_macro_use): Return bool
(_cpp_maybe_notify_macro_use): Likewise.
* directives.c (do_undef): Check macro is not undef before
warning.
(do_ifdef, do_ifndef): Deal with deferred macro.
* expr.c (parse_defined): Likewise.
* lex.c (cpp_allocate_token_string): Break out of ...
(create_literal): ... here. Call it.
(cpp_maybe_module_directive): Deal with deferred macro.
* macro.c (cpp_get_token_1): Deal with deferred macro.
(warn_of_redefinition): Deal with deferred macro.
(compare_macros): Rename to ...
(cpp_compare_macro): ... here. Make extern.
(cpp_get_deferred_macro): New.
(_cpp_notify_macro_use): Deal with deferred macro, return bool
indicating definedness.
(cpp_macro_definition): Deal with deferred macro.
This adds the capability to locate the main file on the user or system
include paths. That's extremely useful to users building header
units. Searching has to be requiested (plain header-unit compilation
will not search). Also, to make include_next work as expected when
building a header unit, we add a mechanism to retrofit a non-searched
source file as one on the include path.
libcpp/
* include/cpplib.h (enum cpp_main_search): New.
(struct cpp_options): Add main_search field.
(cpp_main_loc): Declare.
(cpp_retrofit_as_include): Declare.
* internal.h (struct cpp_reader): Add main_loc field.
(_cpp_in_main_source_file): Not main if main is a header.
* init.c (cpp_read_main_file): Use main_search option to locate
main file. Set main_loc
* files.c (cpp_retrofit_as_include): New.
C++20 modules introduces a new kind of preprocessor directive -- a
module directive. These are directives but without the leading '#'.
We have to detect them by sniffing the start of a logical line. When
detected we replace the initial identifiers with unspellable tokens
and pass them through to the language parser the same way deferred
pragmas are. There's a PRAGMA_EOL at the logical end of line too.
One additional complication is that we have to do header-name lexing
after the initial tokens, and that requires changes in the macro-aware
piece of the preprocessor. The above sniffer sets a counter in the
lexer state, and that triggers at the appropriate point. We then do
the same header-name lexing that occurs on a #include directive or
has_include pseudo-macro. Except that the header name ends up in the
token stream.
A couple of token emitters need to deal with the new token possibility.
gcc/c-family/
* c-lex.c (c_lex_with_flags): CPP_HEADER_NAMEs can now be seen.
libcpp/
* include/cpplib.h (struct cpp_options): Add module_directives
option.
(NODE_MODULE): New node flag.
(struct cpp_hashnode): Make rid-code a bitfield, increase bits in
flags and swap with type field.
* init.c (post_options): Create module-directive identifier nodes.
* internal.h (struct lexer_state): Add directive_file_token &
n_modules fields. Add module node enumerator.
* lex.c (cpp_maybe_module_directive): New.
(_cpp_lex_token): Call it.
(cpp_output_token): Add '"' around CPP_HEADER_NAME token.
(do_peek_ident, do_peek_module): New.
(cpp_directives_only): Detect module-directive lines.
* macro.c (cpp_get_token_1): Deal with directive_file_token
triggering.
This is slightly different to the original patch I posted. This adds
separate module target and dependency functions (rather than a single
bi-modal function).
libcpp/
* include/cpplib.h (struct cpp_options): Add modules to
dep-options.
* include/mkdeps.h (deps_add_module_target): Declare.
(deps_add_module_dep): Declare.
* mkdeps.c (class mkdeps): Add modules, module_name, cmi_name,
is_header_unit fields. Adjust cdtors.
(deps_add_module_target, deps_add_module_dep): New.
(make_write): Write module dependencies, if enabled.
These two callbacks are needed for C++ modules. The first is for
handling macros from header-units. These are resolved lazily. The
second is for include-translation -- whether a #include gets turned
into a header-unit import.
libcpp/
* include/cpplib.h (struct cpp_callbacks): Add
user_deferred_macro & translate_include.
C2x adds the __has_c_attribute preprocessor operator, similar to C++
__has_cpp_attribute.
GCC implements __has_cpp_attribute as exactly equivalent to
__has_attribute. (The documentation says they differ regarding the
values returned for standard attributes, but that's actually only a
matter of the particular nonzero value returned not being specified in
the documentation for __has_attribute; the implementation makes no
distinction between the two.)
I don't think having them exactly equivalent is actually correct,
either for __has_cpp_attribute or for __has_c_attribute.
Specifically, I think it is only correct for __has_cpp_attribute or
__has_c_attribute to return nonzero if the given attribute is
supported, with the particular pp-tokens passed to __has_cpp_attribute
or __has_c_attribute, with [[]] syntax, not if it's only accepted in
__attribute__ or with gnu:: added in [[]]. For example, they should
return nonzero for gnu::packed, but zero for plain packed, because
[[gnu::packed]] is accepted but [[packed]] is ignored as not a
standard attribute.
This patch implements that for __has_c_attribute, leaving any changes
to __has_cpp_attribute for the C++ maintainers. A new
BT_HAS_STD_ATTRIBUTE is added for __has_c_attribute (which I think,
based on the above, would actually be correct to use for
__has_cpp_attribute as well). The code in c_common_has_attribute that
deals with scopes has its C++ conditional removed; instead, whether
the language is C or C++ is used only to determine the numeric values
returned for standard attributes (and which standard attributes are
handled there at all). A new argument is passed to
c_common_has_attribute to distinguish BT_HAS_STD_ATTRIBUTE from
BT_HAS_ATTRIBUTE, and that argument is used to stop attributes with no
scope specified from being accepted with __has_c_attribute unless they
are one of the known standard attributes and so handled specially.
Although the standard specify constants ending with 'L' as the values
for the standard attributes, there is no correctness issue with the
lack of code in GCC to add that 'L' to the expansion:
__has_c_attribute and __has_cpp_attribute are expanded in #if after
other macro expansion has occurred, with no semantics being specified
if they occur outside #if, so there is no way for a conforming program
to inspect the exact text of the expansion of those macros, only to
use the resulting pp-number in a #if expression, where long and int
have the same set of values.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
gcc/
2020-11-12 Joseph Myers <joseph@codesourcery.com>
* doc/cpp.texi (__has_attribute): Document when scopes are allowed
for C.
(__has_c_attribute): New.
gcc/c-family/
2020-11-12 Joseph Myers <joseph@codesourcery.com>
* c-lex.c (c_common_has_attribute): Take argument std_syntax.
Allow scope for C. Handle standard attributes for C. Do not
accept unscoped attributes if std_syntax and not handled as
standard attributes.
* c-common.h (c_common_has_attribute): Update prototype.
gcc/testsuite/
2020-11-12 Joseph Myers <joseph@codesourcery.com>
* gcc.dg/c2x-has-c-attribute-1.c, gcc.dg/c2x-has-c-attribute-2.c,
gcc.dg/c2x-has-c-attribute-3.c, gcc.dg/c2x-has-c-attribute-4.c:
New tests.
libcpp/
2020-11-12 Joseph Myers <joseph@codesourcery.com>
* include/cpplib.h (struct cpp_callbacks): Add bool argument to
has_attribute.
(enum cpp_builtin_type): Add BT_HAS_STD_ATTRIBUTE.
* init.c (builtin_array): Add __has_c_attribute.
(cpp_init_special_builtins): Handle BT_HAS_STD_ATTRIBUTE.
* macro.c (_cpp_builtin_macro_text): Handle BT_HAS_STD_ATTRIBUTE.
Update call to has_attribute for BT_HAS_ATTRIBUTE.
* traditional.c (fun_like_macro): Handle BT_HAS_STD_ATTRIBUTE.
Joseph pointed me at cb_get_source_date_epoch, which allows repeatable
builds and solves a FIXME I had on the modules branch. Unfortunately
it's used exclusively to generate __DATE__ and __TIME__ values, which
fallback to using a time(2) call. It'd be nicer if the preprocessor
made whatever time value it determined available to the rest of the
compiler. So this patch adds a new cpp_get_date function, which
abstracts the call to the get_source_date_epoch hook, or uses time
directly. The value is cached. Thus the timestamp I end up putting
on CMI files matches __DATE__ and __TIME__ expansions. That seems
worthwhile.
libcpp/
* include/cpplib.h (enum class CPP_time_kind): New.
(cpp_get_date): Declare.
* internal.h (struct cpp_reader): Replace source_date_epoch with
time_stamp and time_stamp_kind.
* init.c (cpp_create_reader): Initialize them.
* macro.c (_cpp_builtin_macro_text): Use cpp_get_date.
(cpp_get_date): Broken out from _cpp_builtin_macro_text and
genericized.
Supports conversion of tabs to spaces when outputting diagnostics. Also
adds -fdiagnostics-column-unit and -fdiagnostics-column-origin options to
control how the column number is output, thereby resolving the two PRs.
gcc/c-family/ChangeLog:
PR other/86904
* c-indentation.c (should_warn_for_misleading_indentation): Get
global tabstop from the new source.
* c-opts.c (c_common_handle_option): Remove handling of -ftabstop, which
is now a common option.
* c.opt: Likewise.
gcc/ChangeLog:
PR preprocessor/49973
PR other/86904
* common.opt: Handle -ftabstop here instead of in c-family
options. Add -fdiagnostics-column-unit= and
-fdiagnostics-column-origin= options.
* opts.c (common_handle_option): Handle the new options.
* diagnostic-format-json.cc (json_from_expanded_location): Add
diagnostic_context argument. Use it to convert column numbers as per
the new options.
(json_from_location_range): Likewise.
(json_from_fixit_hint): Likewise.
(json_end_diagnostic): Pass the new context argument to helper
functions above. Add "column-origin" field to the output.
(test_unknown_location): Add the new context argument to calls to
helper functions.
(test_bad_endpoints): Likewise.
* diagnostic-show-locus.c
(exploc_with_display_col::exploc_with_display_col): Support
tabstop parameter.
(layout_point::layout_point): Make use of class
exploc_with_display_col.
(layout_range::layout_range): Likewise.
(struct line_bounds): Clarify that the units are now always
display columns. Rename members accordingly. Add constructor.
(layout::print_source_line): Add support for tab expansion.
(make_range): Adapt to class layout_range changes.
(layout::maybe_add_location_range): Likewise.
(layout::layout): Adapt to class exploc_with_display_col changes.
(layout::calculate_x_offset_display): Support tabstop parameter.
(layout::print_annotation_line): Adapt to struct line_bounds changes.
(layout::print_line): Likewise.
(line_label::line_label): Add diagnostic_context argument.
(get_affected_range): Likewise.
(get_printed_columns): Likewise.
(layout::print_any_labels): Adapt to struct line_label changes.
(class correction): Add m_tabstop member.
(correction::correction): Add tabstop argument.
(correction::compute_display_cols): Use m_tabstop.
(class line_corrections): Add m_context member.
(line_corrections::line_corrections): Add diagnostic_context argument.
(line_corrections::add_hint): Use m_context to handle tabstops.
(layout::print_trailing_fixits): Adapt to class line_corrections
changes.
(test_layout_x_offset_display_utf8): Support tabstop parameter.
(test_layout_x_offset_display_tab): New selftest.
(test_one_liner_colorized_utf8): Likewise.
(test_tab_expansion): Likewise.
(test_diagnostic_show_locus_one_liner_utf8): Call the new tests.
(diagnostic_show_locus_c_tests): Likewise.
(test_overlapped_fixit_printing): Adapt to helper class and
function changes.
(test_overlapped_fixit_printing_utf8): Likewise.
(test_overlapped_fixit_printing_2): Likewise.
* diagnostic.h (enum diagnostics_column_unit): New enum.
(struct diagnostic_context): Add members for the new options.
(diagnostic_converted_column): Declare.
(json_from_expanded_location): Add new context argument.
* diagnostic.c (diagnostic_initialize): Initialize new members.
(diagnostic_converted_column): New function.
(maybe_line_and_column): Be willing to output a column of 0.
(diagnostic_get_location_text): Convert column number as per the new
options.
(diagnostic_report_current_module): Likewise.
(assert_location_text): Add origin and column_unit arguments for
testing the new functionality.
(test_diagnostic_get_location_text): Test the new functionality.
* doc/invoke.texi: Document the new options and behavior.
* input.h (location_compute_display_column): Add tabstop argument.
* input.c (location_compute_display_column): Likewise.
(test_cpp_utf8): Add selftests for tab expansion.
* tree-diagnostic-path.cc (default_tree_make_json_for_path): Pass the
new context argument to json_from_expanded_location().
libcpp/ChangeLog:
PR preprocessor/49973
PR other/86904
* include/cpplib.h (struct cpp_options): Removed support for -ftabstop,
which is now handled by diagnostic_context.
(class cpp_display_width_computation): New class.
(cpp_byte_column_to_display_column): Add optional tabstop argument.
(cpp_display_width): Likewise.
(cpp_display_column_to_byte_column): Likewise.
* charset.c
(cpp_display_width_computation::cpp_display_width_computation): New
function.
(cpp_display_width_computation::advance_display_cols): Likewise.
(compute_next_display_width): Removed and implemented this
functionality in a new function...
(cpp_display_width_computation::process_next_codepoint): ...here.
(cpp_byte_column_to_display_column): Added tabstop argument.
Reimplemented in terms of class cpp_display_width_computation.
(cpp_display_column_to_byte_column): Likewise.
* init.c (cpp_create_reader): Remove handling of -ftabstop, which is now
handled by diagnostic_context.
gcc/testsuite/ChangeLog:
PR preprocessor/49973
PR other/86904
* c-c++-common/Wmisleading-indentation-3.c: Adjust expected output
for new defaults.
* c-c++-common/Wmisleading-indentation.c: Likewise.
* c-c++-common/diagnostic-format-json-1.c: Likewise.
* c-c++-common/diagnostic-format-json-2.c: Likewise.
* c-c++-common/diagnostic-format-json-3.c: Likewise.
* c-c++-common/diagnostic-format-json-4.c: Likewise.
* c-c++-common/diagnostic-format-json-5.c: Likewise.
* c-c++-common/missing-close-symbol.c: Likewise.
* g++.dg/diagnostic/bad-binary-ops.C: Likewise.
* g++.dg/parse/error4.C: Likewise.
* g++.old-deja/g++.brendan/crash11.C: Likewise.
* g++.old-deja/g++.pt/overload2.C: Likewise.
* g++.old-deja/g++.robertl/eb109.C: Likewise.
* gcc.dg/analyzer/malloc-paths-9.c: Likewise.
* gcc.dg/bad-binary-ops.c: Likewise.
* gcc.dg/format/branch-1.c: Likewise.
* gcc.dg/format/pr79210.c: Likewise.
* gcc.dg/plugin/diagnostic-test-expressions-1.c: Likewise.
* gcc.dg/plugin/diagnostic-test-string-literals-1.c: Likewise.
* gcc.dg/redecl-4.c: Likewise.
* gfortran.dg/diagnostic-format-json-1.F90: Likewise.
* gfortran.dg/diagnostic-format-json-2.F90: Likewise.
* gfortran.dg/diagnostic-format-json-3.F90: Likewise.
* go.dg/arrayclear.go: Add a comment explaining why adding a
comment was necessary to work around a dejagnu bug.
* c-c++-common/diagnostic-units-1.c: New test.
* c-c++-common/diagnostic-units-2.c: New test.
* c-c++-common/diagnostic-units-3.c: New test.
* c-c++-common/diagnostic-units-4.c: New test.
* c-c++-common/diagnostic-units-5.c: New test.
* c-c++-common/diagnostic-units-6.c: New test.
* c-c++-common/diagnostic-units-7.c: New test.
* c-c++-common/diagnostic-units-8.c: New test.
With C++ module header units it becomes important to distinguish
between macros defined in forced headers (& commandline & builtins)
from those defined in the header file being processed. We weren't
making that easy because we treated the builtins and command-line
locations somewhat file-like, with incrementing line numbers, and
showing them as included from line 1 of the main file. This patch does
3 things:
0) extend the idiom that 'line 0' of a file means 'the file as a whole'
1) builtins and command-line macros are shown as-if included from line zero.
2) when emitting preprocessed output we keep resetting the line number
so that re-reading that preprocessed output will get the same set of
locations for the command line etc.
For instance the new c-c++-common/cpp/line-2.c test, now emits
In file included from <command-line>:
./line-2.h:4:2: error: #error wrong
4 | #error wrong
| ^~~~~
line-2.c:3:11: error: macro "bill" passed 1 arguments, but takes just 0
3 | int bill(1);
| ^
In file included from <command-line>:
./line-2.h:3: note: macro "bill" defined here
3 | #define bill() 2
|
Before it told you about including from <command-line>:31.
the preprocessed output looks like:
...
(There's a new optimization in do_line_marker to stop each of these
line markers causing a new line map. We can simply rewind the
location, and keep using the same line map.)
libcpp/
* directives.c (do_linemarker): Optimize rewinding to line zero.
* files.c (_cpp_stack_file): Start on line zero when about to inject
headers.
(cpp_push_include, cpp_push_default_include): Use highest_line as
the location.
* include/cpplib.h (cpp_read_main_file): Add injecting parm.
* init.c (cpp_read_main_file): Likewise, inform _cpp_stack_file.
* internal.h (enum include_type): Add IT_MAIN_INJECT.
gcc/c-family/
* c-opts.c (c_common_post_options): Add 'injecting' arg to
cpp_read_main_file.
(c_finish_options): Add linemap_line_start calls for builtin and cmd
maps. Force token position to line_table's highest line.
* c-ppoutput.c (print_line_1): Refactor, print line zero.
(cb_define): Always increment source line.
gcc/testsuite/
* c-c++-common/cpp/line-2.c: New.
* c-c++-common/cpp/line-2.h: New.
* c-c++-common/cpp/line-3.c: New.
* c-c++-common/cpp/line-4.c: New.
* c-c++-common/cpp/line-4.h: New.
This fixes a bunch of poorly formatted decls, marks some getters as
PURE, deletes some C-relevant bool hackery, and finally uses a
passed-in location rather than deducing a closely-related but not
necessarily the same location.
* include/cpplib.h (cpp_get_otions, cpp_get_callbacks)
(cpp_get_deps): Mark as PURE.
* include/line-map.h (get_combined_adhoc_loc)
(get_location_from_adhoc_loc, get_pure_location): Reformat decls.
* internal.h (struct lexer_state): Clarify comment.
* system.h: Remove now-unneeded bool hackery.
* files.c (_cpp_find_file): Store LOC not highest_location.
C++20 isn't final quite yet, but all that remains is formalities, so let's
go ahead and change all the references.
I think for the next C++ standard we can just call it C++23 rather than
C++2b, since the committee has been consistent about time-based releases
rather than feature-based.
gcc/c-family/ChangeLog
2020-05-13 Jason Merrill <jason@redhat.com>
* c.opt (std=c++20): Make c++2a the alias.
(std=gnu++20): Likewise.
* c-common.h (cxx_dialect): Change cxx2a to cxx20.
* c-opts.c: Adjust.
* c-cppbuiltin.c: Adjust.
* c-ubsan.c: Adjust.
* c-warn.c: Adjust.
gcc/cp/ChangeLog
2020-05-13 Jason Merrill <jason@redhat.com>
* call.c, class.c, constexpr.c, constraint.cc, decl.c, init.c,
lambda.c, lex.c, method.c, name-lookup.c, parser.c, pt.c, tree.c,
typeck2.c: Change cxx2a to cxx20.
libcpp/ChangeLog
2020-05-13 Jason Merrill <jason@redhat.com>
* include/cpplib.h (enum c_lang): Change CXX2A to CXX20.
* init.c, lex.c: Adjust.
The existing directives-only code (a) punched a hole through the
libcpp interface and (b) didn't support raw string literals. This
reimplements this preprocessing mode. I added a proper callback
interface, and adjusted c-ppoutput to use it. Sadly I cannot get rid
of the libcpp/internal.h include for unrelated reasons.
The new scanner is in lex.x, and works doing some backwards scanning
when it finds a charater of interest. This reduces the number of
cases one has to deal with in forward scanning. It may have different
failure mode than forward scanning on bad tokenization.
Finally, Moved some cpp tests from the c-specific dg.gcc/cpp directory
to the c-c++-common/cpp shared directory,
libcpp/
* directives-only.c: Delete.
* Makefile.in (libcpp_a_OBJS, libcpp_a_SOURCES): Remove it.
* include/cpplib.h (enum CPP_DO_task): New enum.
(cpp_directive_only_preprocess): Declare.
* internal.h (_cpp_dir_only_callbacks): Delete.
(_cpp_preprocess_dir_only): Delete.
* lex.c (do_peek_backslask, do_peek_next, do_peek_prev): New.
(cpp_directives_only_process): New implementation.
gcc/c-family/
Reimplement directives only processing.
* c-ppoutput.c (token_streamer): Ne.
(directives_only_cb): New. Swallow ...
(print_lines_directives_only): ... this.
(scan_translation_unit_directives_only): Reimplment using the
published interface.
gcc/testsuite/
* gcc.dg/cpp/counter-[23].c: Move to c-c+_-common/cpp.
* gcc.dg/cpp/dir-only-*: Likewise.
* c-c++-common/cpp/dir-only-[78].c: New.