On-demand locations within string-literals

gcc/c-family/ChangeLog:
	* c-common.c: Include "substring-locations.h".
	(get_cpp_ttype_from_string_type): New function.
	(g_string_concat_db): New global.
	(substring_loc::get_range): New method.
	* c-common.h (g_string_concat_db): New declaration.
	(class substring_loc): New class.
	* c-lex.c (lex_string): When concatenating strings, capture the
	locations of all tokens using a new obstack, and record the
	concatenation locations within g_string_concat_db.
	* c-opts.c (c_common_init_options): Construct g_string_concat_db
	on the ggc-heap.

gcc/ChangeLog:
	* input.c (string_concat::string_concat): New constructor.
	(string_concat_db::string_concat_db): New constructor.
	(string_concat_db::record_string_concatenation): New method.
	(string_concat_db::get_string_concatenation): New method.
	(string_concat_db::get_key_loc): New method.
	(class auto_cpp_string_vec): New class.
	(get_substring_ranges_for_loc): New function.
	(get_source_range_for_substring): New function.
	(get_num_source_ranges_for_substring): New function.
	(class selftest::lexer_test_options): New class.
	(struct selftest::lexer_test): New struct.
	(class selftest::ebcdic_execution_charset): New class.
	(selftest::ebcdic_execution_charset::s_singleton): New variable.
	(selftest::lexer_test::lexer_test): New constructor.
	(selftest::lexer_test::~lexer_test): New destructor.
	(selftest::lexer_test::get_token): New method.
	(selftest::assert_char_at_range): New function.
	(ASSERT_CHAR_AT_RANGE): New macro.
	(selftest::assert_num_substring_ranges): New function.
	(ASSERT_NUM_SUBSTRING_RANGES): New macro.
	(selftest::assert_has_no_substring_ranges): New function.
	(ASSERT_HAS_NO_SUBSTRING_RANGES): New macro.
	(selftest::test_lexer_string_locations_simple): New function.
	(selftest::test_lexer_string_locations_ebcdic): New function.
	(selftest::test_lexer_string_locations_hex): New function.
	(selftest::test_lexer_string_locations_oct): New function.
	(selftest::test_lexer_string_locations_letter_escape_1): New function.
	(selftest::test_lexer_string_locations_letter_escape_2): New function.
	(selftest::test_lexer_string_locations_ucn4): New function.
	(selftest::test_lexer_string_locations_ucn8): New function.
	(selftest::uint32_from_big_endian): New function.
	(selftest::test_lexer_string_locations_wide_string): New function.
	(selftest::uint16_from_big_endian): New function.
	(selftest::test_lexer_string_locations_string16): New function.
	(selftest::test_lexer_string_locations_string32): New function.
	(selftest::test_lexer_string_locations_u8): New function.
	(selftest::test_lexer_string_locations_utf8_source): New function.
	(selftest::test_lexer_string_locations_concatenation_1): New
	function.
	(selftest::test_lexer_string_locations_concatenation_2): New
	function.
	(selftest::test_lexer_string_locations_concatenation_3): New
	function.
	(selftest::test_lexer_string_locations_macro): New function.
	(selftest::test_lexer_string_locations_stringified_macro_argument):
	New function.
	(selftest::test_lexer_string_locations_non_string): New function.
	(selftest::test_lexer_string_locations_long_line): New function.
	(selftest::test_lexer_char_constants): New function.
	(selftest::input_c_tests): Call the new test functions once per
	case within the line_table test matrix.
	* input.h (struct string_concat): New struct.
	(struct location_hash): New struct.
	(class string_concat_db): New class.
	* substring-locations.h: New header.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic-test-string-literals-1.c: New file.
	* gcc.dg/plugin/diagnostic-test-string-literals-2.c: New file.
	* gcc.dg/plugin/diagnostic_plugin_test_string_literals.c: New file.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above new files.

libcpp/ChangeLog:
	* charset.c (cpp_substring_ranges::cpp_substring_ranges): New
	constructor.
	(cpp_substring_ranges::~cpp_substring_ranges): New destructor.
	(cpp_substring_ranges::add_range): New method.
	(cpp_substring_ranges::add_n_ranges): New method.
	(_cpp_valid_ucn): Add "char_range" and "loc_reader" params; if
	they are non-NULL, read position information from *loc_reader
	and update char_range->m_finish accordingly.
	(convert_ucn): Add "char_range", "loc_reader", and "ranges"
	params.  If loc_reader is non-NULL, read location information from
	it, and update *ranges accordingly, using char_range.
	Conditionalize the conversion into tbuf on tbuf being non-NULL.
	(convert_hex): Likewise, conditionalizing the call to
	emit_numeric_escape on tbuf.
	(convert_oct): Likewise.
	(convert_escape): Add params "loc_reader" and "ranges".  If
	loc_reader is non-NULL, read location information from it, and
	update *ranges accordingly.  Conditionalize the conversion into
	tbuf on tbuf being non-NULL.
	(cpp_interpret_string): Rename to...
	(cpp_interpret_string_1): ...this, adding params "loc_readers" and
	"out".  Use "to" to conditionalize the initialization and usage of
	"tbuf", such as running the converter.  If "loc_readers" is
	non-NULL, use the instances within it, reading location
	information from them, and passing them to convert_escape; likewise
	write to "out" if loc_readers is non-NULL.  Check for leading
	quote and issue an error if it is not present.  Update boundary
	check from "== limit" to ">= limit" to protect against erroneous
	location values to calls that are not parsing string literals.
	(cpp_interpret_string): Reimplement in terms of
	cpp_interpret_string_1.
	(noop_error_cb): New function.
	(cpp_interpret_string_ranges): New function.
	(cpp_string_location_reader::cpp_string_location_reader): New
	constructor.
	(cpp_string_location_reader::get_next): New method.
	* include/cpplib.h (class cpp_string_location_reader): New class.
	(class cpp_substring_ranges): New class.
	(cpp_interpret_string_ranges): New prototype.
	* internal.h (_cpp_valid_ucn): Add params "char_range" and
	"loc_reader".
	* lex.c (forms_identifier_p): Pass NULL for new params to
	_cpp_valid_ucn.

From-SVN: r239175
David Malcolm 2016-08-05 18:08:33 +00:00 committed by David Malcolm
parent 1addb9e62b
commit 88fa5555a3
19 changed files with 2767 additions and 56 deletions


@@ -743,6 +743,51 @@ struct GTY(()) cpp_hashnode {
  union _cpp_hashnode_value GTY ((desc ("CPP_HASHNODE_VALUE_IDX (%1)"))) value;
};

/* A class for iterating through the source locations within a
   string token (before escapes are interpreted, and before
   concatenation). */

class cpp_string_location_reader {
 public:
  cpp_string_location_reader (source_location src_loc,
                              line_maps *line_table);

  source_range get_next ();

 private:
  source_location m_loc;
  int m_offset_per_column;
  line_maps *m_line_table;
};

/* A class for storing the source ranges of all of the characters within
   a string literal, after escapes are interpreted, and after
   concatenation.

   This is not GTY-marked, as instances are intended to be temporary. */

class cpp_substring_ranges
{
 public:
  cpp_substring_ranges ();
  ~cpp_substring_ranges ();

  int get_num_ranges () const { return m_num_ranges; }
  source_range get_range (int idx) const
  {
    linemap_assert (idx < m_num_ranges);
    return m_ranges[idx];
  }

  void add_range (source_range range);
  void add_n_ranges (int num, cpp_string_location_reader &loc_reader);

 private:
  source_range *m_ranges;
  int m_num_ranges;
  int m_alloc_ranges;
};

/* Call this first to get a handle to pass to other functions.
If you want cpplib to manage its own hashtable, pass in a NULL
@@ -829,6 +874,12 @@ extern cppchar_t cpp_interpret_charconst (cpp_reader *, const cpp_token *,
extern bool cpp_interpret_string (cpp_reader *,
                                  const cpp_string *, size_t,
                                  cpp_string *, enum cpp_ttype);
extern const char *cpp_interpret_string_ranges (cpp_reader *pfile,
                                                const cpp_string *from,
                                                cpp_string_location_reader *,
                                                size_t count,
                                                cpp_substring_ranges *out,
                                                enum cpp_ttype type);
extern bool cpp_interpret_string_notranslate (cpp_reader *,
                                              const cpp_string *, size_t,
                                              cpp_string *, enum cpp_ttype);
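
The two classes above split the work: a location reader yields one source range per character of the uninterpreted token, and the substring-ranges container ends up with one range per interpreted character, so an escape sequence maps to a single range covering several source columns. The following is a minimal standalone sketch of that idea only. It is not the libcpp implementation: col_range, column_reader and ranges_for_literal are hypothetical names, plain column numbers stand in for source_location, and only single-letter escapes and \x escapes are modelled.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for source_range: columns are plain ints here.
struct col_range { int start; int finish; };

// Hypothetical analogue of a location reader: hands out one
// single-column range per source character, in order.
class column_reader {
 public:
  explicit column_reader (int first_col) : m_col (first_col) {}
  col_range get_next ()
  {
    col_range r = { m_col, m_col };
    ++m_col;
    return r;
  }
 private:
  int m_col;
};

// Return one range per interpreted character of a quoted literal such
// as "a\nb", where FIRST_COL is the column of the opening quote.
// Assumption: only single-letter escapes and \x<hex digits> are handled.
std::vector<col_range>
ranges_for_literal (const std::string &tok, int first_col)
{
  assert (first_col > 0);  // columns are 1-based in this sketch
  std::vector<col_range> out;
  if (tok.size () < 2 || tok.front () != '"' || tok.back () != '"')
    return out;  // not a string literal: report no ranges

  column_reader reader (first_col);
  reader.get_next ();  // consume the opening quote
  const std::size_t limit = tok.size () - 1;  // index of closing quote
  std::size_t i = 1;
  while (i < limit)
    {
      col_range r = reader.get_next ();
      if (tok[i] == '\\' && i + 1 < limit)
        {
          // Extend the range over the character after the backslash.
          r.finish = reader.get_next ().finish;
          const bool is_hex = tok[i + 1] == 'x';
          i += 2;
          // For \x, keep extending over the hex digits that follow.
          while (is_hex && i < limit
                 && std::isxdigit (static_cast<unsigned char> (tok[i])))
            {
              r.finish = reader.get_next ().finish;
              ++i;
            }
        }
      else
        ++i;
      out.push_back (r);
    }
  return out;
}
```

Under these assumptions, the literal "a\nb" with its opening quote at column 10 yields three ranges: 11-11 for 'a', 12-13 for the newline escape, and 14-14 for 'b'.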