Change treesit-parser-list from variable to function

This effectively makes the list internal.  Emacs users can no longer
shoot themselves in the foot by removing a parser from the list,
making changes to the buffer, and adding that parser back to the list.

* doc/lispref/parsing.texi (Language Definitions, Using Parser)
(Retrieving Node, Multiple Languages): Change variable to function.
* lisp/treesit.el (treesit-language-at, treesit-node-on)
(treesit-buffer-root-node, treesit-indent, treesit-check-indent)
(treesit-search-forward, treesit-search-beginning)
(treesit-end-of-defun, treesit-inspect-mode): Change variable to
function.
* src/buffer.c (bset_ts_parser_list, reset_buffer, init_buffer_once):
Add ts_parser_list.
* src/buffer.h (struct buffer): Add ts_parser_list.
* src/treesit.c (ts_record_change, Ftreesit_parser_create): Use the
buffer field instead of the old buffer local variable.
(Ftreesit_parser_delete, Ftreesit_parser_list): New functions.
(syms_of_treesit): Remove treesit-parser-list.
* test/src/treesit-tests.el (treesit-basic-parsing): Use the new
function.
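
The new treesit-parser-list function returns a fresh list, so that
callers who modify the result cannot disturb the list stored in the
buffer.  The following is only a minimal, standalone sketch of that
copy-on-return idea in plain C.  It does not use the Emacs internals:
struct cell, push, copy_list, list_length and free_list are
hypothetical names invented for this illustration, and an int stands
in for a parser object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cons cell; ITEM plays the role of a
   parser.  None of these names exist in Emacs.  */
struct cell
{
  int item;
  struct cell *next;
};

/* Push ITEM onto the front of LIST and return the new head.  */
struct cell *
push (int item, struct cell *list)
{
  struct cell *c = malloc (sizeof *c);
  if (c == NULL)
    abort ();
  c->item = item;
  c->next = list;
  return c;
}

/* Return a fresh copy of LIST with the elements in the same order.
   The copy is first built in reverse by pushing onto the front, then
   reversed in place, mirroring the cons-then-reverse shape of the
   real function.  */
struct cell *
copy_list (const struct cell *list)
{
  struct cell *reversed = NULL;
  for (const struct cell *tail = list; tail != NULL; tail = tail->next)
    reversed = push (tail->item, reversed);

  struct cell *copy = NULL;
  while (reversed != NULL)
    {
      struct cell *next = reversed->next;
      reversed->next = copy;
      copy = reversed;
      reversed = next;
    }
  return copy;
}

/* Return the number of cells in LIST.  */
size_t
list_length (const struct cell *list)
{
  size_t n = 0;
  for (; list != NULL; list = list->next)
    n++;
  return n;
}

/* Free every cell in LIST.  */
void
free_list (struct cell *list)
{
  while (list != NULL)
    {
      struct cell *next = list->next;
      free (list);
      list = next;
    }
}
```

In this sketch a caller may overwrite or drop cells of the list
returned by copy_list, and the original list keeps its length and
contents, which is the property the commit relies on.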
Yuan Fu 2022-06-16 17:55:07 -07:00
parent 33f7e10a29
commit 246dbb540a
6 changed files with 107 additions and 51 deletions


@ -184,7 +184,7 @@ of a node, then the mode-line only displays the smallest node that
spans point, and its immediate parent.
This minor mode doesn't create parsers on its own. It simply uses the
first parser in @var{treesit-parser-list} (@pxref{Using Parser}).
first parser in @code{(treesit-parser-list)} (@pxref{Using Parser}).
@end deffn
@heading Reading the grammar definition
@ -407,15 +407,19 @@ Tree-sitter can only handle buffer no larger than about 4GB. If the
size exceeds that, Emacs signals @var{treesit-buffer-too-large}
with signal data being the buffer size.
@vindex treesit-parser-list
Once a parser is created, Emacs automatically adds it to the
buffer-local variable @var{treesit-parser-list}. Every time a
change is made to the buffer, Emacs updates parsers in this list so
they can update their syntax tree incrementally. Therefore, one must
not remove parsers from this list and put the parser back in: if any
change is made when that parser is absent, the parser will be
permanently out-of-sync with the buffer content, and shouldn't be used
anymore.
buffer-local parser list. Every time a change is made to the buffer,
Emacs updates parsers in this list so they can update their syntax
tree incrementally.
@defun treesit-parser-list &optional buffer
This function returns the parser list of @var{buffer}. And
@var{buffer} defaults to the current buffer.
@end defun
@defun treesit-parser-delete parser
This function deletes @var{parser}.
@end defun
@cindex tree-sitter narrowing
@anchor{tree-sitter narrowing} Normally, a parser ``sees'' the whole
@ -477,10 +481,10 @@ the @var{point}. In other words, the start of the node is equal or
greater than @var{point}.
When @var{parser-or-lang} is nil, this function uses the first parser
in @var{treesit-parser-list} in the current buffer. If
in @code{(treesit-parser-list)} in the current buffer. If
@var{parser-or-lang} is a parser object, it use that parser; if
@var{parser-or-lang} is a language, it finds the first parser using
that language in @var{treesit-parser-list} and use that.
that language in @code{(treesit-parser-list)} and use that.
If @var{named} is non-nil, this function looks for a named node
instead (@pxref{tree-sitter named node, named node}).
@ -507,10 +511,10 @@ smallest node that covers that empty line. You probably want to use
@code{treesit-node-at} instead.
When @var{parser-or-lang} is nil, this function uses the first parser
in @var{treesit-parser-list} in the current buffer. If
in @code{(treesit-parser-list)} in the current buffer. If
@var{parser-or-lang} is a parser object, it use that parser; if
@var{parser-or-lang} is a language, it finds the first parser using
that language in @var{treesit-parser-list} and use that.
that language in @code{(treesit-parser-list)} and use that.
If @var{named} is non-nil, this function looks for a named node
instead (@pxref{tree-sitter named node, named node}).
@ -523,7 +527,7 @@ This function returns the root node of the syntax tree generated by
@defun treesit-buffer-root-node &optional language
This function finds the first parser that uses @var{language} in
@var{treesit-parser-list} in the current buffer, and returns the
@code{(treesit-parser-list)} in the current buffer, and returns the
root node of that buffer. If it cannot find an appropriate parser, it
returns nil.
@end defun
@ -1267,7 +1271,7 @@ Like @code{treesit-parser-set-included-ranges}, this function sets
the ranges of @var{parser-or-lang} to @var{ranges}. Conveniently,
@var{parser-or-lang} could be either a parser or a language. If it is
a language, this function looks for the first parser in
@var{treesit-parser-list} for that language in the current buffer,
@code{(treesit-parser-list)} for that language in the current buffer,
and set range for it.
@end defun
@ -1301,7 +1305,7 @@ Like other query functions, this function raises an
@defun treesit-language-at point
This function tries to figure out which language is responsible for
the text at @var{point}. It goes over each parser in
@var{treesit-parser-list} and see if that parser's range covers
@code{(treesit-parser-list)} and see if that parser's range covers
@var{point}.
@end defun


@ -75,7 +75,7 @@ Return the root node of the syntax tree."
(defun treesit-language-at (point)
"Return the language used at POINT."
(cl-loop for parser in treesit-parser-list
(cl-loop for parser in (treesit-parser-list)
if (treesit-node-on point point parser)
return (treesit-parser-language parser)))
@ -122,7 +122,7 @@ greater or larger than POINT. Return nil if none find. If NAMED
non-nil, only look for named node.
If PARSER-OR-LANG is nil, use the first parser in
`treesit-parser-list'; if PARSER-OR-LANG is a parser, use
(`treesit-parser-list'); if PARSER-OR-LANG is a parser, use
that parser; if PARSER-OR-LANG is a language, find a parser using
that language in the current buffer, and use that."
(let ((node (if (treesit-parser-p parser-or-lang)
@ -150,7 +150,7 @@ Return nil if none find. If NAMED non-nil, only look for named
node.
If PARSER-OR-LANG is nil, use the first parser in
`treesit-parser-list'; if PARSER-OR-LANG is a parser, use
(`treesit-parser-list'); if PARSER-OR-LANG is a parser, use
that parser; if PARSER-OR-LANG is a language, find a parser using
that language in the current buffer, and use that."
(let ((root (if (treesit-parser-p parser-or-lang)
@ -160,13 +160,13 @@ that language in the current buffer, and use that."
(defun treesit-buffer-root-node (&optional language)
"Return the root node of the current buffer.
Use the first parser in `treesit-parser-list', if LANGUAGE is
Use the first parser in (`treesit-parser-list'), if LANGUAGE is
non-nil, use the first parser for LANGUAGE."
(if-let ((parser
(or (if language
(or (treesit-parser-create language)
(error "Cannot find a parser for %s" language))
(or (car treesit-parser-list)
(or (car (treesit-parser-list))
(error "Buffer has no parser"))))))
(treesit-parser-root-node parser)))
@ -770,7 +770,7 @@ of the current line.")
(skip-chars-forward " \t")
(point)))
(smallest-node
(cl-loop for parser in treesit-parser-list
(cl-loop for parser in (treesit-parser-list)
for node = (treesit-node-at bol parser)
if node return node))
(node (treesit-parent-while
@ -856,7 +856,7 @@ This is a more primitive function, you might want to use
QUERY has to capture the node to match. LANG specifies the
language in which we search for nodes. If LANG is nil, use the
first parser in `treesit-parser-list'.
first parser in (`treesit-parser-list').
Move forward/backward ARG times, positive ARG means go forward,
negative ARG means go backward.
@ -875,7 +875,7 @@ the tree."
(cl-loop for idx from 1 to (abs arg)
for parser = (if lang
(treesit-parser-create lang)
(car treesit-parser-list))
(car (treesit-parser-list)))
for node =
(if-let ((starting-point (point))
(node (treesit-node-at (point) parser t)))
@ -914,7 +914,7 @@ Stops at the beginning of matched node.
QUERY has to capture the node to match. LANG specifies the
language in which we search for nodes. If LANG is nil, use the
first parser in `treesit-parser-list'.
first parser in (`treesit-parser-list').
Move forward/backward ARG times, positive ARG means go forward,
negative ARG means go backward.
@ -937,7 +937,7 @@ Stops at the end of matched node.
QUERY has to capture the node to match. LANG specifies the
language in which we search for nodes. If LANG is nil, use the
first parser in `treesit-parser-list'.
first parser in (`treesit-parser-list').
Move forward/backward ARG times, positive ARG means go forward,
negative ARG means go backward.
@ -993,7 +993,7 @@ ARGth preceding end of defun. Defun is defined according to
If called interactively, show in echo area, otherwise set
`treesit--inspect-name' (which will appear in the mode-line
if `treesit-inspect-mode' is enabled). Uses the first parser
in `treesit-parser-list'."
in (`treesit-parser-list')."
(interactive "p")
;; NODE-LIST contains all the node that starts at point.
(let* ((node-list
@ -1053,7 +1053,7 @@ node, then we just display the smallest node that spans point and
its immediate parent.
This minor mode doesn't create parsers on its own. It simply
uses the first parser in `treesit-parser-list'."
uses the first parser in (`treesit-parser-list')."
:lighter nil
(if treesit-inspect-mode
(progn


@ -231,6 +231,13 @@ bset_extra_line_spacing (struct buffer *b, Lisp_Object val)
{
b->extra_line_spacing_ = val;
}
#ifdef HAVE_TREE_SITTER
static void
bset_ts_parser_list (struct buffer *b, Lisp_Object val)
{
b->ts_parser_list_ = val;
}
#endif
static void
bset_file_format (struct buffer *b, Lisp_Object val)
{
@ -1004,6 +1011,9 @@ reset_buffer (register struct buffer *b)
(b, BVAR (&buffer_defaults, enable_multibyte_characters));
bset_cursor_type (b, BVAR (&buffer_defaults, cursor_type));
bset_extra_line_spacing (b, BVAR (&buffer_defaults, extra_line_spacing));
#ifdef HAVE_TREE_SITTER
bset_ts_parser_list (b, Qnil);
#endif
b->display_error_modiff = 0;
}
@ -5273,6 +5283,9 @@ init_buffer_once (void)
XSETFASTINT (BVAR (&buffer_local_flags, tab_line_format), idx); ++idx;
XSETFASTINT (BVAR (&buffer_local_flags, cursor_type), idx); ++idx;
XSETFASTINT (BVAR (&buffer_local_flags, extra_line_spacing), idx); ++idx;
#ifdef HAVE_TREE_SITTER
XSETFASTINT (BVAR (&buffer_local_flags, ts_parser_list), idx); ++idx;
#endif
XSETFASTINT (BVAR (&buffer_local_flags, cursor_in_non_selected_windows), idx); ++idx;
/* buffer_local_flags contains no pointers, so it's safe to treat it
@ -5343,6 +5356,9 @@ init_buffer_once (void)
bset_bidi_paragraph_separate_re (&buffer_defaults, Qnil);
bset_cursor_type (&buffer_defaults, Qt);
bset_extra_line_spacing (&buffer_defaults, Qnil);
#ifdef HAVE_TREE_SITTER
bset_ts_parser_list (&buffer_defaults, Qnil);
#endif
bset_cursor_in_non_selected_windows (&buffer_defaults, Qt);
bset_enable_multibyte_characters (&buffer_defaults, Qt);


@ -561,6 +561,10 @@ struct buffer
in the display of this buffer. */
Lisp_Object extra_line_spacing_;
#ifdef HAVE_TREE_SITTER
/* A list of tree-sitter parsers for this buffer. */
Lisp_Object ts_parser_list_;
#endif
/* Cursor type to display in non-selected windows.
t means to use hollow box cursor.
See `cursor-type' for other values. */


@ -342,8 +342,8 @@ void
ts_record_change (ptrdiff_t start_byte, ptrdiff_t old_end_byte,
ptrdiff_t new_end_byte)
{
for (Lisp_Object parser_list =
Fsymbol_value (Qtreesit_parser_list);
for (Lisp_Object parser_list
= BVAR (current_buffer, ts_parser_list);
!NILP (parser_list);
parser_list = XCDR (parser_list))
{
@ -704,23 +704,24 @@ parser. If NO-REUSE is non-nil, always create a new parser. */)
ts_initialize ();
CHECK_SYMBOL (language);
struct buffer *old_buffer = current_buffer;
if (!NILP (buffer))
struct buffer *buf;
if (NILP (buffer))
buf = current_buffer;
else
{
CHECK_BUFFER (buffer);
set_buffer_internal (XBUFFER (buffer));
buf = XBUFFER (buffer);
}
ts_check_buffer_size (current_buffer);
ts_check_buffer_size (buf);
/* See if we can reuse a parser. */
for (Lisp_Object tail = Fsymbol_value (Qtreesit_parser_list);
for (Lisp_Object tail = BVAR (buf, ts_parser_list);
NILP (no_reuse) && !NILP (tail);
tail = XCDR (tail))
{
struct Lisp_TS_Parser *parser = XTS_PARSER (XCAR (tail));
if (EQ (parser->language_symbol, language))
{
set_buffer_internal (old_buffer);
return XCAR (tail);
}
}
@ -734,13 +735,53 @@ parser. If NO-REUSE is non-nil, always create a new parser. */)
Lisp_Object lisp_parser
= make_ts_parser (Fcurrent_buffer (), parser, NULL, language);
Fset (Qtreesit_parser_list,
Fcons (lisp_parser, Fsymbol_value (Qtreesit_parser_list)));
BVAR (buf, ts_parser_list)
= Fcons (lisp_parser, BVAR (buf, ts_parser_list));
set_buffer_internal (old_buffer);
return lisp_parser;
}
DEFUN ("treesit-parser-delete",
Ftreesit_parser_delete, Streesit_parser_delete,
1, 1, 0,
doc: /* Delete PARSER from its buffer. */)
(Lisp_Object parser)
{
CHECK_TS_PARSER (parser);
Lisp_Object buffer = XTS_PARSER (parser)->buffer;
struct buffer *buf = XBUFFER (buffer);
BVAR (buf, ts_parser_list)
= Fdelete (parser, BVAR (buf, ts_parser_list));
return Qnil;
}
DEFUN ("treesit-parser-list",
Ftreesit_parser_list, Streesit_parser_list,
0, 1, 0,
doc: /* Return BUFFER's parser list.
BUFFER defaults to the current buffer. */)
(Lisp_Object buffer)
{
struct buffer *buf;
if (NILP (buffer))
buf = current_buffer;
else
{
CHECK_BUFFER (buffer);
buf = XBUFFER (buffer);
}
/* Return a fresh list so messing with that list doesn't affect our
internal data. */
Lisp_Object return_list = Qnil;
for (Lisp_Object tail = BVAR (buf, ts_parser_list);
!NILP (tail);
tail = XCDR (tail))
{
return_list = Fcons (XCAR (tail), return_list);
}
return Freverse (return_list);
}
DEFUN ("treesit-parser-buffer",
Ftreesit_parser_buffer, Streesit_parser_buffer,
1, 1, 0,
@ -1799,17 +1840,6 @@ syms_of_treesit (void)
"This node is outdated, please retrieve a new one",
Qtreesit_error);
DEFSYM (Qtreesit_parser_list, "treesit-parser-list");
DEFVAR_LISP ("treesit-parser-list", Vtreesit_parser_list,
doc: /* A list of tree-sitter parsers.
If you removed a parser from this list, do not put it back in. Emacs
keeps the parser in this list updated with any change in the buffer.
If removed and put back in, there is no guarantee that the parser is in
sync with the buffer's content. */);
Vtreesit_parser_list = Qnil;
Fmake_variable_buffer_local (Qtreesit_parser_list);
DEFVAR_LISP ("treesit-load-name-override-list",
Vtreesit_load_name_override_list,
doc:
@ -1848,6 +1878,8 @@ dynamic libraries, in that order. */);
defsubr (&Streesit_node_parser);
defsubr (&Streesit_parser_create);
defsubr (&Streesit_parser_delete);
defsubr (&Streesit_parser_list);
defsubr (&Streesit_parser_buffer);
defsubr (&Streesit_parser_language);


@ -27,7 +27,7 @@
(with-temp-buffer
(let ((parser (treesit-parser-create 'json)))
(should
(eq parser (car treesit-parser-list)))
(eq parser (car (treesit-parser-list))))
(should
(equal (treesit-node-string
(treesit-parser-root-node parser))