diagnostics: SARIF output: potentially add escaped renderings of source (§3.3.4)

This patch adds support to our SARIF output for cases where
rich_loc.escape_on_output_p () is true, such as for -Wbidi-chars.

In such cases, the pertinent SARIF "location" object gains a property
bag with property "gcc/escapeNonAscii": true, and the "artifactContent"
within the location's physical location's snippet gains a "rendered"
property (§3.3.4) that escapes non-ASCII text in the snippet, such as:

"rendered": {"text":

where "text" has a string value such as (for a "trojan source" attack):

  "9 |     /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */\n"
  "  |       ~~~~~~~~                                ~~~~~~~~                    ^\n"
  "  |       |                                       |                           |\n"
  "  |       |                                       |                           end of bidirectional context\n"
  "  |       U+202E (RIGHT-TO-LEFT OVERRIDE)         U+2066 (LEFT-TO-RIGHT ISOLATE)\n"

where the escaping is affected by -fdiagnostics-escape-format=; with
-fdiagnostics-escape-format=bytes, the rendered text of the above is:

  "9 |     /*<e2><80><ae> } <e2><81><a6>if (isAdmin)<e2><81><a9> <e2><81><a6> begin admins only */\n"
  "  |       ~~~~~~~~~~~~                                        ~~~~~~~~~~~~                    ^\n"
  "  |       |                                                   |                               |\n"
  "  |       U+202E (RIGHT-TO-LEFT OVERRIDE)                     U+2066 (LEFT-TO-RIGHT ISOLATE)  end of bidirectional context\n"
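
As a rough illustration of the two escape formats shown above, the
following self-contained sketch escapes non-ASCII UTF-8 text either as
"<U+XXXX>" code points or as "<xx>" bytes.  The names escape_format,
decode_utf8 and escape_non_ascii are hypothetical and this is not GCC's
implementation; the decoder is deliberately minimal and does not reject
overlong encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

/* Hypothetical names throughout; this is not GCC's implementation.  */
enum class escape_format { unicode, bytes };

/* Decode the UTF-8 sequence starting at S[I] into CP.
   Return the number of bytes used, or 0 if the sequence is malformed.
   Minimal: overlong encodings and surrogates are not rejected.  */
static size_t
decode_utf8 (const std::string &s, size_t i, uint32_t &cp)
{
  const unsigned char lead = s[i];
  size_t len;
  if (lead < 0x80) { cp = lead; return 1; }
  else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
  else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
  else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
  else return 0;
  if (i + len > s.size ())
    return 0;
  for (size_t k = 1; k < len; ++k)
    {
      const unsigned char cont = s[i + k];
      if ((cont & 0xC0) != 0x80)
        return 0;
      cp = (cp << 6) | (cont & 0x3F);
    }
  return len;
}

/* Return a copy of S with every non-ASCII character escaped.  */
std::string
escape_non_ascii (const std::string &s, escape_format fmt)
{
  std::string out;
  char buf[16];
  size_t i = 0;
  while (i < s.size ())
    {
      uint32_t cp = 0;
      const size_t len = decode_utf8 (s, i, cp);
      assert (len <= 4);
      if (len == 1)
        {
          /* Plain ASCII is copied through unchanged.  */
          out += s[i++];
          continue;
        }
      if (len == 0 || fmt == escape_format::bytes)
        {
          /* Malformed input falls back to escaping a single byte.  */
          const size_t n = len ? len : 1;
          for (size_t k = 0; k < n; ++k)
            {
              std::snprintf (buf, sizeof buf, "<%02x>",
                             (unsigned) (unsigned char) s[i + k]);
              out += buf;
            }
          i += n;
        }
      else
        {
          std::snprintf (buf, sizeof buf, "<U+%04X>", (unsigned) cp);
          out += buf;
          i += len;
        }
    }
  return out;
}
```

The byte form widens each escaped character more than the code point
form does, which is why the underlines and labels in the second
rendering above sit at different columns from those in the first.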

The patch also refactors/adds enough selftest machinery to be able to
test the snippet generation from within the selftest framework, rather
than just within DejaGnu (where the regex-based testing isn't
sophisticated enough to verify such properties as the above).
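
The kind of structural check this enables can be sketched as follows:
walk from a location object down to the snippet's "rendered" text and
compare it exactly.  This uses a hypothetical, much-simplified JSON
model in a namespace "sketch"; the types and helper names are
assumptions for illustration and are not the selftest-json.h API.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

/* Hypothetical, simplified JSON model; not GCC's json.h.  */
namespace sketch {

struct value
{
  virtual ~value () {}
};

struct string_value : value
{
  explicit string_value (std::string s) : m_str (std::move (s)) {}
  std::string m_str;
};

struct object : value
{
  /* Set property NAME to V, taking ownership; return a borrowed pointer.  */
  template <typename T>
  T *set (const std::string &name, std::unique_ptr<T> v)
  {
    T *borrowed = v.get ();
    m_props[name] = std::move (v);
    return borrowed;
  }

  /* Return the value of property NAME, or nullptr if absent.  */
  const value *get (const std::string &name) const
  {
    auto iter = m_props.find (name);
    return iter == m_props.end () ? nullptr : iter->second.get ();
  }

  std::map<std::string, std::unique_ptr<value>> m_props;
};

/* Return V's property NAME as an object, or nullptr if V is not an
   object, lacks the property, or the property is not an object.  */
const object *
get_object_property (const value *v, const char *name)
{
  auto obj = dynamic_cast<const object *> (v);
  if (!obj)
    return nullptr;
  return dynamic_cast<const object *> (obj->get (name));
}

/* Return true if V is an object whose property NAME is a string
   equal to EXPECTED.  */
bool
string_property_eq (const value *v, const char *name, const char *expected)
{
  auto obj = dynamic_cast<const object *> (v);
  if (!obj)
    return false;
  auto str = dynamic_cast<const string_value *> (obj->get (name));
  return str && str->m_str == expected;
}

/* Build location -> physicalLocation -> contextRegion -> snippet,
   where the snippet has "text" and "rendered" properties.  */
std::unique_ptr<object>
make_sample_location ()
{
  auto location = std::make_unique<object> ();
  object *phys
    = location->set ("physicalLocation", std::make_unique<object> ());
  object *ctxt = phys->set ("contextRegion", std::make_unique<object> ());
  object *snippet = ctxt->set ("snippet", std::make_unique<object> ());
  snippet->set ("text", std::make_unique<string_value> ("\xcf\x80;\n"));
  object *rendered = snippet->set ("rendered", std::make_unique<object> ());
  rendered->set ("text", std::make_unique<string_value> ("1 | <U+03C0>;\n"));
  return location;
}

} // namespace sketch
```

An exact string comparison on a nested property like this is awkward to
express as a regex over the serialized output, which is the limitation
of the DejaGnu tests noted above.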

gcc/ChangeLog:
	* Makefile.in (OBJS-libcommon): Add selftest-json.o.
	* diagnostic-format-sarif.cc: Include "selftest.h",
	"selftest-diagnostic.h", "selftest-diagnostic-show-locus.h",
	"selftest-json.h", and "text-range-label.h".
	(class content_renderer): New.
	(sarif_builder::m_rules_arr): Convert to std::unique_ptr.
	(sarif_builder::make_location_object): Add class
	escape_nonascii_renderer.  If rich_loc.escape_on_output_p (),
	pass a nonnull escape_nonascii_renderer to
	maybe_make_physical_location_object as its snippet_renderer, and
	add a property bag property "gcc/escapeNonAscii" to the SARIF
	location object.  For other overloads of make_location_object,
	pass nullptr for the snippet_renderer.
	(sarif_builder::maybe_make_region_object_for_context): Add
	"snippet_renderer" param and pass it to
	maybe_make_artifact_content_object.
	(sarif_builder::make_tool_object): Drop "const".
	(sarif_builder::make_driver_tool_component_object): Likewise.
	Use typesafe unique_ptr variant of object::set for setting "rules"
	property on driver_obj.
	(sarif_builder::maybe_make_artifact_content_object): Add param "r"
	and use it to potentially set the "rendered" property (§3.3.4).
	(selftest::test_make_location_object): New.
	(selftest::diagnostic_format_sarif_cc_tests): New.
	* diagnostic-show-locus.cc: Include "text-range-label.h" and
	"selftest-diagnostic-show-locus.h".
	(selftests::diagnostic_show_locus_fixture::diagnostic_show_locus_fixture):
	New.
	(selftests::test_layout_x_offset_display_utf8): Use
	diagnostic_show_locus_fixture to simplify and consolidate setup
	code.
	(selftests::test_diagnostic_show_locus_one_liner): Likewise.
	(selftests::test_one_liner_colorized_utf8): Likewise.
	(selftests::test_diagnostic_show_locus_one_liner_utf8): Likewise.
	* gcc-rich-location.h (class text_range_label): Move to new file
	text-range-label.h.
	* selftest-diagnostic-show-locus.h: New file, based on material in
	diagnostic-show-locus.cc.
	* selftest-json.cc: New file.
	* selftest-json.h: New file.
	* selftest-run-tests.cc (selftest::run_tests): Call
	selftest::diagnostic_format_sarif_cc_tests.
	* selftest.h (selftest::diagnostic_format_sarif_cc_tests): New decl.

gcc/testsuite/ChangeLog:
	* c-c++-common/diagnostic-format-sarif-file-Wbidi-chars.c: Verify
	that we have a property bag with property "gcc/escapeNonAscii": true.
	Verify that we have a "rendered" property for a snippet.
	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: Include
	"text-range-label.h".

gcc/ChangeLog:
	* text-range-label.h: New file, taking class text_range_label from
	gcc-rich-location.h.

libcpp/ChangeLog:
	* include/rich-location.h
	(semi_embedded_vec::semi_embedded_vec): Add copy ctor.
	(rich_location::rich_location): Remove "= delete" from decl of
	copy ctor.  Add deleted decl of move ctor.
	(rich_location::operator=): Remove "= delete" from decl of
	copy assignment.  Add deleted decl of move assignment.
	(fixit_hint::fixit_hint): Add copy ctor decl.  Add deleted decl of
	move.
	(fixit_hint::operator=): Add copy assignment decl.  Add deleted
	decl of move assignment.
	* line-map.cc (rich_location::rich_location): New copy ctor.
	(fixit_hint::fixit_hint): New copy ctor.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm 2024-07-24 18:07:54 -04:00
parent ae4f4f767e
commit 148066bd05
14 changed files with 669 additions and 90 deletions

gcc/Makefile.in

@@ -1832,6 +1832,7 @@ OBJS-libcommon = diagnostic-spec.o diagnostic.o diagnostic-color.o \
vec.o input.o hash-table.o ggc-none.o memory-block.o \
selftest.o selftest-diagnostic.o sort.o \
selftest-diagnostic-path.o \
+selftest-json.o \
selftest-logical-location.o \
text-art/box-drawing.o \
text-art/canvas.o \

gcc/diagnostic-format-sarif.cc

@@ -37,9 +37,16 @@ along with GCC; see the file COPYING3. If not see
#include "ordered-hash-map.h"
#include "sbitmap.h"
#include "make-unique.h"
+#include "selftest.h"
+#include "selftest-diagnostic.h"
+#include "selftest-diagnostic-show-locus.h"
+#include "selftest-json.h"
+#include "text-range-label.h"
/* Forward decls. */
class sarif_builder;
+class content_renderer;
+class escape_nonascii_renderer;
/* Subclasses of sarif_object.
Keep these in order of their descriptions in the specification. */
@@ -284,6 +291,20 @@ public:
sarif_builder &builder);
};
+/* Abstract base class for use when making an "artifactContent"
+object (SARIF v2.1.0 section 3.3): generate a value for the
+3.3.4 "rendered" property.
+Can return nullptr, for "no property". */
+class content_renderer
+{
+public:
+virtual ~content_renderer () {}
+virtual std::unique_ptr<sarif_multiformat_message_string>
+render (const sarif_builder &builder) const = 0;
+};
/* A class for managing SARIF output (for -fdiagnostics-format=sarif-stderr
and -fdiagnostics-format=sarif-file).
@@ -312,7 +333,6 @@ public:
property (SARIF v2.1.0 section 3.14.11), as invocation objects
(SARIF v2.1.0 section 3.20), but we'd want to capture the arguments to
toplev::main, and the response files.
-   - doesn't capture escape_on_output_p
- doesn't capture secondary locations within a rich_location
(perhaps we should use the "relatedLocations" property: SARIF v2.1.0
section 3.27.22)
@@ -379,7 +399,8 @@ private:
std::unique_ptr<sarif_physical_location>
maybe_make_physical_location_object (location_t loc,
enum diagnostic_artifact_role role,
-int column_override);
+int column_override,
+const content_renderer *snippet_renderer);
std::unique_ptr<sarif_artifact_location>
make_artifact_location_object (location_t loc);
std::unique_ptr<sarif_artifact_location>
@@ -390,7 +411,8 @@ private:
maybe_make_region_object (location_t loc,
int column_override) const;
std::unique_ptr<sarif_region>
-maybe_make_region_object_for_context (location_t loc) const;
+maybe_make_region_object_for_context (location_t loc,
+const content_renderer *snippet_renderer) const;
std::unique_ptr<sarif_region>
make_region_object_for_hint (const fixit_hint &hint) const;
std::unique_ptr<sarif_multiformat_message_string>
@@ -402,9 +424,9 @@ private:
make_run_object (std::unique_ptr<sarif_invocation> invocation_obj,
std::unique_ptr<json::array> results);
std::unique_ptr<sarif_tool>
-make_tool_object () const;
+make_tool_object ();
std::unique_ptr<sarif_tool_component>
-make_driver_tool_component_object () const;
+make_driver_tool_component_object ();
std::unique_ptr<json::array> maybe_make_taxonomies_array () const;
std::unique_ptr<sarif_tool_component>
maybe_make_cwe_taxonomy_object () const;
@@ -430,7 +452,8 @@ private:
std::unique_ptr<sarif_artifact_content>
maybe_make_artifact_content_object (const char *filename,
int start_line,
-int end_line) const;
+int end_line,
+const content_renderer *r) const;
std::unique_ptr<sarif_fix>
make_fix_object (const rich_location &rich_loc);
std::unique_ptr<sarif_artifact_change>
@@ -460,7 +483,7 @@ private:
bool m_seen_any_relative_paths;
hash_set <free_string_hash> m_rule_id_set;
-json::array *m_rules_arr;
+std::unique_ptr<json::array> m_rules_arr;
/* The set of all CWE IDs we've seen, if any. */
hash_set <int_hash <int, 0, 1> > m_cwe_id_set;
@@ -1086,21 +1109,74 @@ sarif_builder::make_location_object (const rich_location &rich_loc,
const logical_location *logical_loc,
enum diagnostic_artifact_role role)
{
+class escape_nonascii_renderer : public content_renderer
+{
+public:
+escape_nonascii_renderer (const rich_location &richloc,
+enum diagnostics_escape_format escape_format)
+: m_richloc (richloc),
+m_escape_format (escape_format)
+{}
+std::unique_ptr<sarif_multiformat_message_string>
+render (const sarif_builder &builder) const final override
+{
+diagnostic_context dc;
+diagnostic_initialize (&dc, 0);
+dc.m_source_printing.enabled = true;
+dc.m_source_printing.colorize_source_p = false;
+dc.m_source_printing.show_labels_p = true;
+dc.m_source_printing.show_line_numbers_p = true;
+rich_location my_rich_loc (m_richloc);
+my_rich_loc.set_escape_on_output (true);
+dc.set_escape_format (m_escape_format);
+diagnostic_show_locus (&dc, &my_rich_loc, DK_ERROR);
+std::unique_ptr<sarif_multiformat_message_string> result
+= builder.make_multiformat_message_string
+(pp_formatted_text (dc.printer));
+diagnostic_finish (&dc);
+return result;
+}
+private:
+const rich_location &m_richloc;
+enum diagnostics_escape_format m_escape_format;
+} the_renderer (rich_loc,
+m_context.get_escape_format ());
auto location_obj = ::make_unique<sarif_location> ();
/* Get primary loc from RICH_LOC. */
location_t loc = rich_loc.get_loc ();
/* "physicalLocation" property (SARIF v2.1.0 section 3.28.3). */
+const content_renderer *snippet_renderer
+= rich_loc.escape_on_output_p () ? &the_renderer : nullptr;
if (auto phs_loc_obj
= maybe_make_physical_location_object (loc, role,
-rich_loc.get_column_override ()))
+rich_loc.get_column_override (),
+snippet_renderer))
location_obj->set<sarif_physical_location> ("physicalLocation",
std::move (phs_loc_obj));
/* "logicalLocations" property (SARIF v2.1.0 section 3.28.4). */
set_any_logical_locs_arr (*location_obj, logical_loc);
+/* A flag for hinting that the diagnostic involves issues at the
+level of character encodings (such as homoglyphs, or misleading
+bidirectional control codes), and thus that it will be helpful
+to the user if we show some representation of
+how the characters in the pertinent source lines are encoded. */
+if (rich_loc.escape_on_output_p ())
+{
+sarif_property_bag &bag = location_obj->get_or_create_properties ();
+bag.set_bool ("gcc/escapeNonAscii", rich_loc.escape_on_output_p ());
+}
return location_obj;
}
@@ -1115,7 +1191,8 @@ sarif_builder::make_location_object (const diagnostic_event &event,
/* "physicalLocation" property (SARIF v2.1.0 section 3.28.3). */
location_t loc = event.get_location ();
-if (auto phs_loc_obj = maybe_make_physical_location_object (loc, role, 0))
+if (auto phs_loc_obj
+= maybe_make_physical_location_object (loc, role, 0, nullptr))
location_obj->set<sarif_physical_location> ("physicalLocation",
std::move (phs_loc_obj));
@@ -1144,7 +1221,8 @@ std::unique_ptr<sarif_physical_location>
sarif_builder::
maybe_make_physical_location_object (location_t loc,
enum diagnostic_artifact_role role,
-int column_override)
+int column_override,
+const content_renderer *snippet_renderer)
{
if (loc <= BUILTINS_LOCATION || LOCATION_FILE (loc) == nullptr)
return nullptr;
@@ -1161,7 +1239,8 @@ maybe_make_physical_location_object (location_t loc,
phys_loc_obj->set<sarif_region> ("region", std::move (region_obj));
/* "contextRegion" property (SARIF v2.1.0 section 3.29.5). */
-if (auto context_region_obj = maybe_make_region_object_for_context (loc))
+if (auto context_region_obj
+= maybe_make_region_object_for_context (loc, snippet_renderer))
phys_loc_obj->set<sarif_region> ("contextRegion",
std::move (context_region_obj));
@@ -1339,7 +1418,10 @@ sarif_builder::maybe_make_region_object (location_t loc,
the pertinent source. */
std::unique_ptr<sarif_region>
-sarif_builder::maybe_make_region_object_for_context (location_t loc) const
+sarif_builder::
+maybe_make_region_object_for_context (location_t loc,
+const content_renderer *snippet_renderer)
+const
{
location_t caret_loc = get_pure_location (loc);
@@ -1373,7 +1455,8 @@ sarif_builder::maybe_make_region_object_for_context (location_t loc) const
if (auto artifact_content_obj
= maybe_make_artifact_content_object (exploc_start.file,
exploc_start.line,
-exploc_finish.line))
+exploc_finish.line,
+snippet_renderer))
region_obj->set<sarif_artifact_content> ("snippet",
std::move (artifact_content_obj));
@@ -1716,7 +1799,7 @@ make_run_object (std::unique_ptr<sarif_invocation> invocation_obj,
/* Make a "tool" object (SARIF v2.1.0 section 3.18). */
std::unique_ptr<sarif_tool>
-sarif_builder::make_tool_object () const
+sarif_builder::make_tool_object ()
{
auto tool_obj = ::make_unique<sarif_tool> ();
@@ -1777,7 +1860,7 @@ sarif_builder::make_tool_object () const
calls the "driver" (see SARIF v2.1.0 section 3.18.1). */
std::unique_ptr<sarif_tool_component>
-sarif_builder::make_driver_tool_component_object () const
+sarif_builder::make_driver_tool_component_object ()
{
auto driver_obj = ::make_unique<sarif_tool_component> ();
@@ -1809,7 +1892,7 @@ sarif_builder::make_driver_tool_component_object () const
}
/* "rules" property (SARIF v2.1.0 section 3.19.23). */
-driver_obj->set ("rules", m_rules_arr);
+driver_obj->set<json::array> ("rules", std::move (m_rules_arr));
return driver_obj;
}
@@ -1971,12 +2054,16 @@ sarif_builder::get_source_lines (const char *filename,
}
/* Make an "artifactContent" object (SARIF v2.1.0 section 3.3) for the given
-run of lines within FILENAME (including the endpoints). */
+run of lines within FILENAME (including the endpoints).
+If R is non-NULL, use it to potentially set the "rendered"
+property (3.3.4). */
std::unique_ptr<sarif_artifact_content>
-sarif_builder::maybe_make_artifact_content_object (const char *filename,
-int start_line,
-int end_line) const
+sarif_builder::
+maybe_make_artifact_content_object (const char *filename,
+int start_line,
+int end_line,
+const content_renderer *r) const
{
char *text_utf8 = get_source_lines (filename, start_line, end_line);
@@ -1994,6 +2081,12 @@ sarif_builder::maybe_make_artifact_content_object (const char *filename,
artifact_content_obj->set_string ("text", text_utf8);
free (text_utf8);
+/* 3.3.4 "rendered" property. */
+if (r)
+if (std::unique_ptr<sarif_multiformat_message_string> rendered
+= r->render (*this))
+artifact_content_obj->set ("rendered", std::move (rendered));
return artifact_content_obj;
}
@@ -2260,3 +2353,99 @@ diagnostic_output_format_init_sarif_stream (diagnostic_context &context,
formatted,
stream));
}
+#if CHECKING_P
+namespace selftest {
+static void
+test_make_location_object (const line_table_case &case_)
+{
+diagnostic_show_locus_fixture_one_liner_utf8 f (case_);
+location_t line_end = linemap_position_for_column (line_table, 31);
+/* Don't attempt to run the tests if column data might be unavailable. */
+if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
+return;
+test_diagnostic_context dc;
+sarif_builder builder (dc, "MAIN_INPUT_FILENAME", true);
+const location_t foo
+= make_location (linemap_position_for_column (line_table, 1),
+linemap_position_for_column (line_table, 1),
+linemap_position_for_column (line_table, 8));
+const location_t bar
+= make_location (linemap_position_for_column (line_table, 12),
+linemap_position_for_column (line_table, 12),
+linemap_position_for_column (line_table, 17));
+const location_t field
+= make_location (linemap_position_for_column (line_table, 19),
+linemap_position_for_column (line_table, 19),
+linemap_position_for_column (line_table, 30));
+text_range_label label0 ("label0");
+text_range_label label1 ("label1");
+text_range_label label2 ("label2");
+rich_location richloc (line_table, foo, &label0, nullptr);
+richloc.add_range (bar, SHOW_RANGE_WITHOUT_CARET, &label1);
+richloc.add_range (field, SHOW_RANGE_WITHOUT_CARET, &label2);
+richloc.set_escape_on_output (true);
+std::unique_ptr<sarif_location> location_obj
+= builder.make_location_object
+(richloc, nullptr, diagnostic_artifact_role::analysis_target);
+ASSERT_NE (location_obj, nullptr);
+auto physical_location
+= EXPECT_JSON_OBJECT_WITH_OBJECT_PROPERTY (location_obj.get (),
+"physicalLocation");
+{
+auto region
+= EXPECT_JSON_OBJECT_WITH_OBJECT_PROPERTY (physical_location, "region");
+ASSERT_JSON_INT_PROPERTY_EQ (region, "startLine", 1);
+ASSERT_JSON_INT_PROPERTY_EQ (region, "startColumn", 1);
+ASSERT_JSON_INT_PROPERTY_EQ (region, "endColumn", 7);
+}
+{
+auto context_region
+= EXPECT_JSON_OBJECT_WITH_OBJECT_PROPERTY (physical_location,
+"contextRegion");
+ASSERT_JSON_INT_PROPERTY_EQ (context_region, "startLine", 1);
+{
+auto snippet
+= EXPECT_JSON_OBJECT_WITH_OBJECT_PROPERTY (context_region, "snippet");
+/* We expect the snippet's "text" to be a copy of the content. */
+ASSERT_JSON_STRING_PROPERTY_EQ (snippet, "text", f.m_content);
+/* We expect the snippet to have a "rendered" whose "text" has a
+pure ASCII escaped copy of the line (with labels, etc). */
+{
+auto rendered
+= EXPECT_JSON_OBJECT_WITH_OBJECT_PROPERTY (snippet, "rendered");
+ASSERT_JSON_STRING_PROPERTY_EQ
+(rendered, "text",
+"1 | <U+1F602>_foo = <U+03C0>_bar.<U+1F602>_field<U+03C0>;\n"
+" | ^~~~~~~~~~~~~ ~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~\n"
+" | | | |\n"
+" | label0 label1 label2\n");
+}
+}
+}
+}
+/* Run all of the selftests within this file. */
+void
+diagnostic_format_sarif_cc_tests ()
+{
+for_each_line_table_case (test_make_location_object);
+}
+} // namespace selftest
+#endif /* CHECKING_P */
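
The optional-renderer hook added in this file can be sketched in
isolation as follows: a caller passes either a renderer or nullptr, and
the "rendered" property is only set when a renderer is present and
returns a non-null result.  This is a simplified analogue under stated
assumptions, not the patch's code: the class and function names are
hypothetical, the "object" is a std::map rather than a SARIF JSON
object, and render takes the text directly rather than a sarif_builder.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

/* Hypothetical, simplified analogue of the renderer hook.  */
class text_renderer
{
public:
  virtual ~text_renderer () {}
  /* Return the rendered text, or nullptr for "no property".  */
  virtual std::unique_ptr<std::string>
  render (const std::string &text) const = 0;
};

/* Renderer that brackets the text, standing in for real escaping.  */
class bracket_renderer : public text_renderer
{
public:
  std::unique_ptr<std::string>
  render (const std::string &text) const final override
  {
    return std::make_unique<std::string> ("[" + text + "]");
  }
};

/* Renderer that always declines to produce a rendering.  */
class null_renderer : public text_renderer
{
public:
  std::unique_ptr<std::string>
  render (const std::string &) const final override
  {
    return nullptr;
  }
};

using snippet = std::map<std::string, std::string>;

/* Make a snippet for TEXT.  If R is non-null, use it to potentially
   set the "rendered" property.  */
snippet
make_snippet (const std::string &text, const text_renderer *r)
{
  snippet result;
  result["text"] = text;
  if (r)
    if (std::unique_ptr<std::string> rendered = r->render (text))
      result["rendered"] = *rendered;
  return result;
}
```

Passing a nullable pointer keeps the callers that never want a
rendering unchanged apart from an extra nullptr argument.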

gcc/diagnostic-show-locus.cc

@@ -29,8 +29,10 @@ along with GCC; see the file COPYING3. If not see
#include "diagnostic.h"
#include "diagnostic-color.h"
#include "gcc-rich-location.h"
+#include "text-range-label.h"
#include "selftest.h"
#include "selftest-diagnostic.h"
+#include "selftest-diagnostic-show-locus.h"
#include "cpplib.h"
#include "text-art/types.h"
#include "text-art/theme.h"
@@ -3291,6 +3293,18 @@ namespace selftest {
/* Selftests for diagnostic_show_locus. */
+diagnostic_show_locus_fixture::
+diagnostic_show_locus_fixture (const line_table_case &case_,
+const char *content)
+: m_content (content),
+m_tmp_source_file (SELFTEST_LOCATION, ".c", content),
+m_ltt (case_),
+m_fc ()
+{
+linemap_add (line_table, LC_ENTER, false,
+m_tmp_source_file.get_filename (), 1);
+}
/* Verify that cpp_display_width correctly handles escaping. */
static void
@@ -3395,11 +3409,9 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)
no multibyte characters earlier on the line. */
const int emoji_col = 102;
-temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
-file_cache fc;
-line_table_test ltt (case_);
+diagnostic_show_locus_fixture f (case_, content);
-linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+linemap_add (line_table, LC_ENTER, false, f.get_filename (), 1);
location_t line_end = linemap_position_for_column (line_table, line_bytes);
@@ -3407,16 +3419,16 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)
if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
return;
-ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
+ASSERT_STREQ (f.get_filename (), LOCATION_FILE (line_end));
ASSERT_EQ (1, LOCATION_LINE (line_end));
ASSERT_EQ (line_bytes, LOCATION_COLUMN (line_end));
-char_span lspan = fc.get_source_line (tmp.get_filename (), 1);
+char_span lspan = f.m_fc.get_source_line (f.get_filename (), 1);
ASSERT_EQ (line_display_cols,
cpp_display_width (lspan.get_buffer (), lspan.length (),
def_policy ()));
ASSERT_EQ (line_display_cols,
-location_compute_display_column (fc,
+location_compute_display_column (f.m_fc,
expand_location (line_end),
def_policy ()));
ASSERT_EQ (0, memcmp (lspan.get_buffer () + (emoji_col - 1),
@@ -4215,10 +4227,8 @@ test_diagnostic_show_locus_one_liner (const line_table_case &case_)
....................0000000001111111.
....................1234567890123456. */
const char *content = "foo = bar.field;\n";
-temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
-line_table_test ltt (case_);
-linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+diagnostic_show_locus_fixture f (case_, content);
location_t line_end = linemap_position_for_column (line_table, 16);
@@ -4226,7 +4236,7 @@ test_diagnostic_show_locus_one_liner (const line_table_case &case_)
if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
return;
-ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
+ASSERT_STREQ (f.get_filename (), LOCATION_FILE (line_end));
ASSERT_EQ (1, LOCATION_LINE (line_end));
ASSERT_EQ (16, LOCATION_COLUMN (line_end));
@@ -4246,27 +4256,17 @@ test_diagnostic_show_locus_one_liner (const line_table_case &case_)
test_one_liner_labels ();
}
-/* Version of all one-liner tests exercising multibyte awareness. For
-simplicity we stick to using two multibyte characters in the test, U+1F602
-== "\xf0\x9f\x98\x82", which uses 4 bytes and 2 display columns, and U+03C0
-== "\xcf\x80", which uses 2 bytes and 1 display column. Note: all of the
-below asserts would be easier to read if we used UTF-8 directly in the
-string constants, but it seems better not to demand the host compiler
-support this, when it isn't otherwise necessary. Instead, whenever an
-extended character appears in a string, we put a line break after it so that
-all succeeding characters can appear visually at the correct display column.
+/* Version of all one-liner tests exercising multibyte awareness.
+These are all called from test_diagnostic_show_locus_one_liner,
+which uses diagnostic_show_locus_fixture_one_liner_utf8 to create
+the test file; see the notes in diagnostic-show-locus-selftest.h.
-All of these work on the following 1-line source file:
-.0000000001111111111222222 display
-.1234567890123456789012345 columns
-"SS_foo = P_bar.SS_fieldP;\n"
-.0000000111111111222222223 byte
-.1356789012456789134567891 columns
-which is set up by test_diagnostic_show_locus_one_liner and calls
-them. Here SS represents the two display columns for the U+1F602 emoji and
-P represents the one display column for the U+03C0 pi symbol. */
+Note: all of the below asserts would be easier to read if we used UTF-8
+directly in the string constants, but it seems better not to demand the
+host compiler support this, when it isn't otherwise necessary. Instead,
+whenever an extended character appears in a string, we put a line break
+after it so that all succeeding characters can appear visually at the
+correct display column. */
/* Just a caret. */
@@ -4784,25 +4784,27 @@ test_one_liner_colorized_utf8 ()
ASSERT_STR_CONTAINS (first_pi + 2, "\xcf\x80");
}
+static const char * const one_liner_utf8_content
+/* Display columns.
+0000000000000000000000011111111111111111111111111111112222222222222
+1111111122222222345678900000000123456666666677777777890123444444445 */
+= "\xf0\x9f\x98\x82_foo = \xcf\x80_bar.\xf0\x9f\x98\x82_field\xcf\x80;\n";
+/* 0000000000000000000001111111111111111111222222222222222222222233333
+1111222233334444567890122223333456789999000011112222345678999900001
+Byte columns. */
+diagnostic_show_locus_fixture_one_liner_utf8::
+diagnostic_show_locus_fixture_one_liner_utf8 (const line_table_case &case_)
+: diagnostic_show_locus_fixture (case_, one_liner_utf8_content)
+{
+}
/* Run the various one-liner tests. */
static void
test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)
{
-/* Create a tempfile and write some text to it. */
-const char *content
-/* Display columns.
-0000000000000000000000011111111111111111111111111111112222222222222
-1111111122222222345678900000000123456666666677777777890123444444445 */
-= "\xf0\x9f\x98\x82_foo = \xcf\x80_bar.\xf0\x9f\x98\x82_field\xcf\x80;\n";
-/* 0000000000000000000001111111111111111111222222222222222222222233333
-1111222233334444567890122223333456789999000011112222345678999900001
-Byte columns. */
-temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
-file_cache fc;
-line_table_test ltt (case_);
-linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+diagnostic_show_locus_fixture_one_liner_utf8 f (case_);
location_t line_end = linemap_position_for_column (line_table, 31);
@@ -4810,14 +4812,14 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)
if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
return;
-ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
+ASSERT_STREQ (f.get_filename (), LOCATION_FILE (line_end));
ASSERT_EQ (1, LOCATION_LINE (line_end));
ASSERT_EQ (31, LOCATION_COLUMN (line_end));
-char_span lspan = fc.get_source_line (tmp.get_filename (), 1);
+char_span lspan = f.m_fc.get_source_line (f.get_filename (), 1);
ASSERT_EQ (25, cpp_display_width (lspan.get_buffer (), lspan.length (),
def_policy ()));
-ASSERT_EQ (25, location_compute_display_column (fc,
+ASSERT_EQ (25, location_compute_display_column (f.m_fc,
expand_location (line_end),
def_policy ()));

gcc/gcc-rich-location.h

@@ -121,21 +121,4 @@ class gcc_rich_location : public rich_location
location_t indent);
};
-/* Concrete subclass of libcpp's range_label.
-Simple implementation using a string literal. */
-class text_range_label : public range_label
-{
-public:
-text_range_label (const char *text) : m_text (text) {}
-label_text get_text (unsigned /*range_idx*/) const final override
-{
-return label_text::borrow (m_text);
-}
-private:
-const char *m_text;
-};
#endif /* GCC_RICH_LOCATION_H */

gcc/selftest-diagnostic-show-locus.h (new file)

@@ -0,0 +1,82 @@
/* Support for selftests involving diagnostic_show_locus.
Copyright (C) 1999-2024 Free Software Foundation, Inc.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
#ifndef GCC_SELFTEST_DIAGNOSTIC_SHOW_LOCUS_H
#define GCC_SELFTEST_DIAGNOSTIC_SHOW_LOCUS_H
#include "selftest.h"
/* The selftest code should entirely disappear in a production
configuration, hence we guard all of it with #if CHECKING_P. */
#if CHECKING_P
namespace selftest {
/* RAII class for use in selftests involving diagnostic_show_locus.
Manages creating and cleaning up the following:
- writing out a temporary .c file containing CONTENT
- temporarily override the global "line_table" (using CASE_) and
push a line_map starting at the first line of the temporary file
- provide a file_cache. */
struct diagnostic_show_locus_fixture
{
diagnostic_show_locus_fixture (const line_table_case &case_,
const char *content);
const char *get_filename () const
{
return m_tmp_source_file.get_filename ();
}
const char *m_content;
temp_source_file m_tmp_source_file;
line_table_test m_ltt;
file_cache m_fc;
};
/* Fixture for one-liner tests exercising multibyte awareness. For
simplicity we stick to using two multibyte characters in the test, U+1F602
== "\xf0\x9f\x98\x82", which uses 4 bytes and 2 display columns, and U+03C0
== "\xcf\x80", which uses 2 bytes and 1 display column.
This works with the following 1-line source file:
.0000000001111111111222222 display
.1234567890123456789012345 columns
"SS_foo = P_bar.SS_fieldP;\n"
.0000000111111111222222223 byte
.1356789012456789134567891 columns
Here SS represents the two display columns for the U+1F602 emoji and
P represents the one display column for the U+03C0 pi symbol. */
struct diagnostic_show_locus_fixture_one_liner_utf8
: public diagnostic_show_locus_fixture
{
diagnostic_show_locus_fixture_one_liner_utf8 (const line_table_case &case_);
};
} // namespace selftest
#endif /* #if CHECKING_P */
#endif /* GCC_SELFTEST_DIAGNOSTIC_SHOW_LOCUS_H */
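
The RAII idea behind the fixture above (temporarily override a global,
then restore it automatically when the test leaves scope) can be
sketched in isolation as follows.  The names scoped_override,
g_current_table and sketch_fixture are hypothetical stand-ins for
illustration only; the real fixture composes temp_source_file,
line_table_test and file_cache instead.

```cpp
#include <cassert>
#include <string>
#include <utility>

/* Hypothetical RAII helper: install NEW_VALUE in TARGET for the
   lifetime of this object, restoring the old value on destruction.  */
template <typename T>
class scoped_override
{
public:
  scoped_override (T &target, T new_value)
  : m_target (target), m_saved (target)
  {
    m_target = std::move (new_value);
  }
  ~scoped_override () { m_target = std::move (m_saved); }

  /* Copying would restore the saved value twice, so forbid it.  */
  scoped_override (const scoped_override &) = delete;
  scoped_override &operator= (const scoped_override &) = delete;

private:
  T &m_target;
  T m_saved;
};

/* Stand-in for a global such as "line_table".  */
std::string g_current_table = "default";

/* Fixture bundling test content with a temporary override, analogous
   in spirit to diagnostic_show_locus_fixture.  Members are initialized
   in declaration order and destroyed in reverse order.  */
struct sketch_fixture
{
  explicit sketch_fixture (const char *content)
  : m_content (content),
    m_override (g_current_table, std::string ("test"))
  {}

  const char *m_content;
  scoped_override<std::string> m_override;
};
```

Bundling the pieces into one struct means each test needs a single
declaration, and cleanup happens even on an early return such as the
LINE_MAP_MAX_LOCATION_WITH_COLS check used in these selftests.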

gcc/selftest-json.cc (new file)

@@ -0,0 +1,119 @@
/* Selftest support for JSON.
Copyright (C) 2024 Free Software Foundation, Inc.
Contributed by David Malcolm <dmalcolm@redhat.com>.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
#include "config.h"
#define INCLUDE_MEMORY
#include "system.h"
#include "coretypes.h"
#include "diagnostic.h"
#include "selftest.h"
#include "selftest-json.h"
/* The selftest code should entirely disappear in a production
configuration, hence we guard all of it with #if CHECKING_P. */
#if CHECKING_P
namespace selftest {
/* Assert that VALUE is a non-null json::object,
returning it as such, failing at LOC if this isn't the case. */
const json::object *
expect_json_object (const location &loc,
const json::value *value)
{
ASSERT_NE_AT (loc, value, nullptr);
ASSERT_EQ_AT (loc, value->get_kind (), json::JSON_OBJECT);
return static_cast<const json::object *> (value);
}
/* Assert that VALUE is a non-null json::object that has property
PROPERTY_NAME.
Return the value of the property.
Use LOC for any failures. */
const json::value *
expect_json_object_with_property (const location &loc,
const json::value *value,
const char *property_name)
{
const json::object *obj = expect_json_object (loc, value);
const json::value *property_value = obj->get (property_name);
ASSERT_NE_AT (loc, property_value, nullptr);
return property_value;
}
/* Assert that VALUE is a non-null json::object that has property
PROPERTY_NAME, and that the value of that property is a non-null
json::integer_number equalling EXPECTED_VALUE.
Use LOC for any failures. */
void
assert_json_int_property_eq (const location &loc,
const json::value *value,
const char *property_name,
long expected_value)
{
const json::value *property_value
= expect_json_object_with_property (loc, value, property_name);
ASSERT_EQ_AT (loc, property_value->get_kind (), json::JSON_INTEGER);
long actual_value
= static_cast<const json::integer_number *> (property_value)->get ();
ASSERT_EQ_AT (loc, expected_value, actual_value);
}
/* Assert that VALUE is a non-null json::object that has property
PROPERTY_NAME, and that the property value is a non-null JSON object.
Return the value of the property as a json::object.
Use LOC for any failures. */
const json::object *
expect_json_object_with_object_property (const location &loc,
const json::value *value,
const char *property_name)
{
const json::value *property_value
= expect_json_object_with_property (loc, value, property_name);
ASSERT_EQ_AT (loc, property_value->get_kind (), json::JSON_OBJECT);
return static_cast<const json::object *> (property_value);
}
/* Assert that VALUE is a non-null json::object that has property
PROPERTY_NAME, and that the value of that property is a non-null
JSON string equalling EXPECTED_VALUE.
Use LOC for any failures. */
void
assert_json_string_property_eq (const location &loc,
const json::value *value,
const char *property_name,
const char *expected_value)
{
const json::value *property_value
= expect_json_object_with_property (loc, value, property_name);
ASSERT_EQ_AT (loc, property_value->get_kind (), json::JSON_STRING);
const json::string *str = static_cast<const json::string *> (property_value);
ASSERT_STREQ_AT (loc, expected_value, str->get_string ());
}
} // namespace selftest
#endif /* #if CHECKING_P */

gcc/selftest-json.h

@ -0,0 +1,100 @@
/* Selftest support for JSON.
Copyright (C) 2024 Free Software Foundation, Inc.
Contributed by David Malcolm <dmalcolm@redhat.com>.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
#ifndef GCC_SELFTEST_JSON_H
#define GCC_SELFTEST_JSON_H
#include "json.h"
/* The selftest code should entirely disappear in a production
configuration, hence we guard all of it with #if CHECKING_P. */
#if CHECKING_P
namespace selftest {
/* Assert that VALUE is a non-null json::object,
returning it as such, failing at LOC if this isn't the case. */
const json::object *
expect_json_object (const location &loc,
const json::value *value);
/* Assert that VALUE is a non-null json::object that has property
PROPERTY_NAME.
Return the value of the property.
Use LOC for any failures. */
const json::value *
expect_json_object_with_property (const location &loc,
const json::value *value,
const char *property_name);
/* Assert that VALUE is a non-null json::object that has property
PROPERTY_NAME, and that the value of that property is a non-null
json::integer_number equalling EXPECTED_VALUE.
Use LOC for any failures. */
void
assert_json_int_property_eq (const location &loc,
const json::value *value,
const char *property_name,
long expected_value);
#define ASSERT_JSON_INT_PROPERTY_EQ(JSON_VALUE, PROPERTY_NAME, EXPECTED_VALUE) \
assert_json_int_property_eq ((SELFTEST_LOCATION), \
(JSON_VALUE), \
(PROPERTY_NAME), \
(EXPECTED_VALUE))
/* Assert that VALUE is a non-null json::object that has property
PROPERTY_NAME, and that the property value is a non-null JSON object.
Return the value of the property as a json::object.
Use LOC for any failures. */
const json::object *
expect_json_object_with_object_property (const location &loc,
const json::value *value,
const char *property_name);
#define EXPECT_JSON_OBJECT_WITH_OBJECT_PROPERTY(JSON_VALUE, PROPERTY_NAME) \
expect_json_object_with_object_property ((SELFTEST_LOCATION), \
(JSON_VALUE), \
(PROPERTY_NAME))
/* Assert that VALUE is a non-null json::object that has property
PROPERTY_NAME, and that the value of that property is a non-null
JSON string equalling EXPECTED_VALUE.
Use LOC for any failures. */
void
assert_json_string_property_eq (const location &loc,
const json::value *value,
const char *property_name,
const char *expected_value);
#define ASSERT_JSON_STRING_PROPERTY_EQ(JSON_VALUE, PROPERTY_NAME, EXPECTED_VALUE) \
assert_json_string_property_eq ((SELFTEST_LOCATION), \
(JSON_VALUE), \
(PROPERTY_NAME), \
(EXPECTED_VALUE))
} // namespace selftest
#endif /* #if CHECKING_P */
#endif /* GCC_SELFTEST_JSON_H */
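
A rough illustration of the pattern behind the ASSERT_JSON_*_PROPERTY_EQ macros declared above: the macro captures the call site and forwards it to a checking helper, so that a failure is reported against the test that made the call rather than against the helper. Everything in this sketch (the sketch namespace, its value/object/string_value types, fail_at, and the SKETCH_ macro) is a hypothetical stand-in, not GCC's json.h or selftest.h. It throws on failure purely so that the example is self-contained.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical stand-in for selftest::location: where the check was written.
struct location { const char *file; int line; };

// Hypothetical stand-ins for the json::value hierarchy.
struct value { virtual ~value () = default; };

struct string_value : value
{
  explicit string_value (std::string s) : m_str (std::move (s)) {}
  std::string m_str;
};

struct object : value
{
  // Return the named property, or nullptr if it is absent.
  const value *get (const std::string &name) const
  {
    auto it = m_props.find (name);
    return it == m_props.end () ? nullptr : it->second.get ();
  }
  std::map<std::string, std::unique_ptr<value>> m_props;
};

// Report a failure against LOC (in this sketch, by throwing).
[[noreturn]] inline void
fail_at (const location &loc, const std::string &msg)
{
  throw std::runtime_error (std::string (loc.file) + ":"
                            + std::to_string (loc.line) + ": " + msg);
}

// Check that V is an object whose property NAME is a string equal to EXPECTED.
inline void
assert_string_property_eq (const location &loc, const value *v,
                           const char *name, const char *expected)
{
  const object *obj = dynamic_cast<const object *> (v);
  if (!obj)
    fail_at (loc, "value is not an object");
  const string_value *str
    = dynamic_cast<const string_value *> (obj->get (name));
  if (!str)
    fail_at (loc, std::string ("no string property: ") + name);
  if (str->m_str != expected)
    fail_at (loc, std::string ("unexpected value for: ") + name);
}

} // namespace sketch

// The macro supplies the caller's file and line, in the role that
// SELFTEST_LOCATION plays in the declarations above.
#define SKETCH_ASSERT_STRING_PROPERTY_EQ(V, NAME, EXPECTED) \
  sketch::assert_string_property_eq (sketch::location {__FILE__, __LINE__}, \
                                     (V), (NAME), (EXPECTED))
```

Given an object whose "text" property holds "hello", SKETCH_ASSERT_STRING_PROPERTY_EQ (&obj, "text", "hello") returns normally, while asking for a missing property throws with the file and line of the macro invocation.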


@ -97,6 +97,7 @@ selftest::run_tests ()
diagnostic_color_cc_tests ();
diagnostic_show_locus_cc_tests ();
diagnostic_format_json_cc_tests ();
diagnostic_format_sarif_cc_tests ();
edit_context_cc_tests ();
fold_const_cc_tests ();
spellcheck_cc_tests ();


@ -222,6 +222,7 @@ extern void cgraph_cc_tests ();
extern void convert_cc_tests ();
extern void diagnostic_color_cc_tests ();
extern void diagnostic_format_json_cc_tests ();
extern void diagnostic_format_sarif_cc_tests ();
extern void diagnostic_path_cc_tests ();
extern void diagnostic_show_locus_cc_tests ();
extern void digraph_cc_tests ();


@ -20,4 +20,13 @@ int main() {
{ dg-final { scan-sarif-file {"text": "unpaired UTF-8 bidirectional control characters detected"} } }
{ dg-final { scan-sarif-file {"text": "unpaired UTF-8 bidirectional control characters detected"} } }
Verify that the expected property bag property is present.
{ dg-final { scan-sarif-file {"gcc/escapeNonAscii": true} } }
Verify that the snippets have a "rendered" property.
We check the contents of the property via a selftest.
{ dg-final { scan-sarif-file {"rendered": } } }
*/


@ -61,6 +61,7 @@
#include "context.h"
#include "print-tree.h"
#include "gcc-rich-location.h"
#include "text-range-label.h"
int plugin_is_GPL_compatible;

gcc/text-range-label.h

@ -0,0 +1,42 @@
/* Simple implementation of range_label.
Copyright (C) 2014-2024 Free Software Foundation, Inc.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
#ifndef GCC_TEXT_RANGE_LABEL_H
#define GCC_TEXT_RANGE_LABEL_H
#include "rich-location.h"
/* Concrete subclass of libcpp's range_label.
Simple implementation using a string literal. */
class text_range_label : public range_label
{
public:
text_range_label (const char *text) : m_text (text) {}
label_text get_text (unsigned /*range_idx*/) const final override
{
return label_text::borrow (m_text);
}
private:
const char *m_text;
};
#endif /* GCC_TEXT_RANGE_LABEL_H */
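
A minimal sketch of what "borrowing" means for a label like the one above: the label hands out a non-owning view of its text, so the text (typically a string literal) must outlive both the label and anything it returns. The names text_handle, literal_label and is_owned are hypothetical stand-ins introduced for illustration. They are not libcpp's label_text or range_label API.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a text handle that may or may not own its buffer.
class text_handle
{
public:
  // Wrap S without taking ownership; S must outlive the handle.
  static text_handle borrow (const char *s)
  {
    return text_handle (const_cast<char *> (s), false);
  }
  // Wrap a malloc-allocated S and take ownership of it.
  static text_handle take (char *s) { return text_handle (s, true); }

  text_handle (text_handle &&other) noexcept
  : m_buffer (other.m_buffer), m_owned (other.m_owned)
  {
    other.m_buffer = nullptr;
    other.m_owned = false;
  }
  text_handle (const text_handle &) = delete;
  text_handle &operator= (const text_handle &) = delete;

  // Only an owning handle frees the buffer.
  ~text_handle () { if (m_owned) std::free (m_buffer); }

  const char *get () const { return m_buffer; }
  bool is_owned () const { return m_owned; }

private:
  text_handle (char *buffer, bool owned)
  : m_buffer (buffer), m_owned (owned) {}

  char *m_buffer;
  bool m_owned;
};

// Hypothetical stand-in for a label backed by a string literal.
class literal_label
{
public:
  explicit literal_label (const char *text) : m_text (text) {}
  // Each call returns a borrowed view; nothing is copied or freed.
  text_handle get_text () const { return text_handle::borrow (m_text); }

private:
  const char *m_text;
};
```

Because the handle is borrowed, calling get_text repeatedly is cheap and safe for string literals, but the same class would dangle if constructed from a buffer that is freed while the label is still in use.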


@ -91,6 +91,7 @@ class semi_embedded_vec
public:
semi_embedded_vec ();
~semi_embedded_vec ();
semi_embedded_vec (const semi_embedded_vec &other);
unsigned int count () const { return m_num; }
T& operator[] (int idx);
@ -115,6 +116,21 @@ semi_embedded_vec<T, NUM_EMBEDDED>::semi_embedded_vec ()
{
}
/* Copy constructor for semi_embedded_vec. */
template <typename T, int NUM_EMBEDDED>
semi_embedded_vec<T, NUM_EMBEDDED>::semi_embedded_vec (const semi_embedded_vec &other)
: m_num (0),
m_alloc (other.m_alloc),
m_extra (nullptr)
{
if (other.m_extra)
m_extra = XNEWVEC (T, m_alloc);
for (int i = 0; i < other.m_num; i++)
push (other[i]);
}
/* semi_embedded_vec's dtor. Release any dynamically-allocated memory. */
template <typename T, int NUM_EMBEDDED>
@ -387,11 +403,10 @@ class rich_location
/* Destructor. */
~rich_location ();
-/* The class manages the memory pointed to by the elements of
-the M_FIXIT_HINTS vector and is not meant to be copied or
-assigned. */
-rich_location (const rich_location &) = delete;
-void operator= (const rich_location &) = delete;
+rich_location (const rich_location &);
+rich_location (rich_location &&) = delete;
+rich_location &operator= (const rich_location &) = delete;
+rich_location &operator= (rich_location &&) = delete;
/* Accessors. */
location_t get_loc () const { return get_loc (0); }
@ -547,6 +562,8 @@ protected:
mutable expanded_location m_expanded_location;
/* The class manages the memory pointed to by the elements of
the m_fixit_hints vector. */
static const int MAX_STATIC_FIXIT_HINTS = 2;
semi_embedded_vec <fixit_hint *, MAX_STATIC_FIXIT_HINTS> m_fixit_hints;
@ -605,7 +622,11 @@ class fixit_hint
fixit_hint (location_t start,
location_t next_loc,
const char *new_content);
fixit_hint (const fixit_hint &other);
fixit_hint (fixit_hint &&other) = delete;
~fixit_hint () { free (m_bytes); }
fixit_hint &operator= (const fixit_hint &) = delete;
fixit_hint &operator= (fixit_hint &&) = delete;
bool affects_line_p (const line_maps *set,
const char *file,
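
The copy semantics declared in this header (copy-constructible, but neither movable nor assignable, with the fix-it hints owned by the containing object) can be sketched in isolation as follows. This is an illustration under assumptions, not libcpp's code: hint and owner are hypothetical stand-ins for fixit_hint and rich_location, and std::vector replaces semi_embedded_vec.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical stand-in for fixit_hint: owns a malloc-allocated buffer.
class hint
{
public:
  explicit hint (const char *text)
  : m_bytes (dup (text)), m_len (std::strlen (text)) {}
  // Copying duplicates the buffer so each hint frees its own memory.
  hint (const hint &other)
  : m_bytes (dup (other.m_bytes)), m_len (other.m_len) {}
  hint (hint &&) = delete;
  ~hint () { std::free (m_bytes); }
  hint &operator= (const hint &) = delete;
  hint &operator= (hint &&) = delete;

  const char *get_string () const { return m_bytes; }

private:
  static char *dup (const char *s)
  {
    std::size_t n = std::strlen (s) + 1;
    char *copy = static_cast<char *> (std::malloc (n));
    if (!copy)
      std::abort ();
    std::memcpy (copy, s, n);
    return copy;
  }

  char *m_bytes;
  std::size_t m_len;
};

// Hypothetical stand-in for rich_location: owns the hints it points to.
class owner
{
public:
  owner () = default;
  // Deep copy: allocate a new hint for each hint in OTHER.
  owner (const owner &other)
  {
    for (const hint *h : other.m_hints)
      m_hints.push_back (new hint (*h));
  }
  owner (owner &&) = delete;
  ~owner () { for (hint *h : m_hints) delete h; }
  owner &operator= (const owner &) = delete;
  owner &operator= (owner &&) = delete;

  void add (const char *text) { m_hints.push_back (new hint (text)); }
  std::size_t count () const { return m_hints.size (); }
  const hint *get (std::size_t i) const { return m_hints[i]; }

private:
  std::vector<hint *> m_hints;
};
```

Copying the pointers alone would leave two objects deleting the same hint in their destructors, which is why the copy has to allocate fresh hints. A production version would also need to consider what happens if an allocation fails partway through the copy, which this sketch ignores.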


@ -2175,6 +2175,26 @@ rich_location::rich_location (line_maps *set, location_t loc,
add_range (loc, SHOW_RANGE_WITH_CARET, label, label_highlight_color);
}
/* Copy ctor for rich_location.
Take a deep copy of the fixit hints, which are owned;
everything else is borrowed. */
rich_location::rich_location (const rich_location &other)
: m_line_table (other.m_line_table),
m_ranges (other.m_ranges),
m_column_override (other.m_column_override),
m_have_expanded_location (other.m_have_expanded_location),
m_seen_impossible_fixit (other.m_seen_impossible_fixit),
m_fixits_cannot_be_auto_applied (other.m_fixits_cannot_be_auto_applied),
m_escape_on_output (other.m_escape_on_output),
m_expanded_location (other.m_expanded_location),
m_fixit_hints (),
m_path (other.m_path)
{
for (unsigned i = 0; i < other.m_fixit_hints.count (); i++)
m_fixit_hints.push (new fixit_hint (*other.m_fixit_hints[i]));
}
/* The destructor for class rich_location. */
rich_location::~rich_location ()
@ -2595,6 +2615,14 @@ fixit_hint::fixit_hint (location_t start,
{
}
fixit_hint::fixit_hint (const fixit_hint &other)
: m_start (other.m_start),
m_next_loc (other.m_next_loc),
m_bytes (xstrdup (other.m_bytes)),
m_len (other.m_len)
{
}
/* Does this fix-it hint affect the given line? */
bool