mirror of
https://github.com/masscollaborationlabs/emacs.git
synced 2025-07-04 11:23:24 +00:00
Improve documentation of treesit "thing"
* src/treesit.c (syms_of_treesit):
* lisp/treesit.el (treesit-cycle-sexp-type):
(treesit-thing-at, treesit-thing-at-point): Doc fixes.
* doc/lispref/parsing.texi (User-defined Things): Improve
documentation of treesit "thing" and related functions; add
cross-references and indexing.
This commit is contained in:
parent
1903b0062b
commit
bcf005fa77
3 changed files with 80 additions and 52 deletions
|
@ -1619,14 +1619,16 @@ documentation about pattern-matching. The documentation can be found at
|
||||||
|
|
||||||
It's often useful to be able to identify and find certain @dfn{things} in
|
It's often useful to be able to identify and find certain @dfn{things} in
|
||||||
a buffer, like function and class definitions, statements, code blocks,
|
a buffer, like function and class definitions, statements, code blocks,
|
||||||
strings, comments, etc. Emacs allows users to define what kind of
|
strings, comments, etc., in terms of node types defined by the
|
||||||
tree-sitter node corresponds to a ``thing''. This enables handy
|
tree-sitter grammar used in the buffer. Emacs allows Lisp programs to
|
||||||
features like jumping to the next function, marking the code block at
|
define what kinds of tree-sitter nodes correspond to each ``thing''.
|
||||||
point, or transposing two function arguments.
|
This enables handy features like jumping to the next function, marking
|
||||||
|
the code block at point, transposing two function arguments, etc.
|
||||||
|
|
||||||
The ``things'' feature in Emacs is independent of the pattern matching
|
The ``things'' feature in Emacs is independent of the pattern matching
|
||||||
feature of tree-sitter, and comparatively less powerful, but more
|
feature of tree-sitter (@pxref{Pattern Matching}), and comparatively
|
||||||
suitable for navigation and traversing the parse tree.
|
less powerful, but more suitable for navigation and traversing the
|
||||||
|
buffer text in terms of the tree-sitter parse tree.
|
||||||
|
|
||||||
@findex treesit-thing-definition
|
@findex treesit-thing-definition
|
||||||
@findex treesit-thing-defined-p
|
@findex treesit-thing-defined-p
|
||||||
|
@ -1635,12 +1637,22 @@ predicate of a defined thing with @code{treesit-thing-definition}, and
|
||||||
test if a thing is defined with @code{treesit-thing-defined-p}.
|
test if a thing is defined with @code{treesit-thing-defined-p}.
|
||||||
|
|
||||||
@defvar treesit-thing-settings
|
@defvar treesit-thing-settings
|
||||||
This is an alist of thing definitions for each language. The key of
|
This is an alist of thing definitions for each language supported by the
|
||||||
each entry is a language symbol, and the value is a list of thing
|
grammar used in a buffer; it should be defined by the buffer's major
|
||||||
definitions of the form @w{@code{(@var{thing} @var{pred})}}, where
|
mode (the default value is @code{nil}). The key of each entry is a
|
||||||
@var{thing} is a symbol representing the thing, like @code{defun},
|
language symbol (e.g., @code{c} for C, @code{cpp} for C@t{++}, etc.),
|
||||||
@code{sexp}, or @code{sentence}; and @var{pred} specifies what kind of
|
and the value is a list of thing definitions of the form
|
||||||
tree-sitter node is this @var{thing}.
|
@w{@code{(@var{thing} @var{pred})}}, where @var{thing} is a symbol
|
||||||
|
representing the thing, and @var{pred} specifies what kinds of
|
||||||
|
tree-sitter nodes are considered as this @var{thing}.
|
||||||
|
|
||||||
|
@cindex @code{sexp}, treesit-defined thing
|
||||||
|
@cindex @code{list}, treesit-defined thing
|
||||||
|
The symbol used to define the @var{thing} can be anything meaningful for
|
||||||
|
the major mode: @code{defun}, @code{defclass}, @code{sentence},
|
||||||
|
@code{comment}, @code{string}, etc. To support tree-sitter based
|
||||||
|
navigation commands (@pxref{List Motion}), the mode should define two
|
||||||
|
things: @code{list} and @code{sexp}.
|
||||||
|
|
||||||
@var{pred} can be a regexp string that matches the type of the node; it
|
@var{pred} can be a regexp string that matches the type of the node; it
|
||||||
can be a function that takes a node as the argument and returns a
|
can be a function that takes a node as the argument and returns a
|
||||||
|
@ -1660,13 +1672,16 @@ meaning that not satisfying @var{pred} qualifies the node.
|
||||||
Finally, @var{pred} can refer to other @var{thing}s defined in this
|
Finally, @var{pred} can refer to other @var{thing}s defined in this
|
||||||
list. For example, @w{@code{(or sexp sentence)}} defines something
|
list. For example, @w{@code{(or sexp sentence)}} defines something
|
||||||
that's either a @code{sexp} thing or a @code{sentence} thing, as defined
|
that's either a @code{sexp} thing or a @code{sentence} thing, as defined
|
||||||
by some other rule in the alist.
|
by some other rules in the alist.
|
||||||
|
|
||||||
|
@cindex @code{named}, treesit-defined thing
|
||||||
|
@cindex @code{anonymous}, treesit-defined thing
|
||||||
There are two pre-defined predicates: @code{named} and @code{anonymous},
|
There are two pre-defined predicates: @code{named} and @code{anonymous},
|
||||||
which qualify, respectively, named and anonymous nodes. They can be
|
which qualify, respectively, named and anonymous nodes of the
|
||||||
combined with @code{and} to narrow down the match.
|
tree-sitter grammar. They can be combined with @code{and} to narrow
|
||||||
|
down the match.
|
||||||
|
|
||||||
Here's an example @code{treesit-thing-settings} for C and C++:
|
Here's an example @code{treesit-thing-settings} for C and C@t{++}:
|
||||||
|
|
||||||
@example
|
@example
|
||||||
@group
|
@group
|
||||||
|
@ -1676,6 +1691,8 @@ Here's an example @code{treesit-thing-settings} for C and C++:
|
||||||
(comment "comment")
|
(comment "comment")
|
||||||
(string "raw_string_literal")
|
(string "raw_string_literal")
|
||||||
(text (or comment string)))
|
(text (or comment string)))
|
||||||
|
@end group
|
||||||
|
@group
|
||||||
(cpp
|
(cpp
|
||||||
(defun ("function_definition" . cpp-ts-mode-defun-valid-p))
|
(defun ("function_definition" . cpp-ts-mode-defun-valid-p))
|
||||||
(defclass "class_specifier")
|
(defclass "class_specifier")
|
||||||
|
@ -1685,12 +1702,12 @@ Here's an example @code{treesit-thing-settings} for C and C++:
|
||||||
|
|
||||||
@noindent
|
@noindent
|
||||||
Note that this example is modified for didactic purposes, and isn't
|
Note that this example is modified for didactic purposes, and isn't
|
||||||
exactly how C and C@t{++} modes define things.
|
exactly how tree-sitter based C and C@t{++} modes define things.
|
||||||
@end defvar
|
@end defvar
|
||||||
|
|
||||||
Emacs builtin functions already make use some thing definitions.
|
Emacs builtin functions already make use of some thing definitions.
|
||||||
Command @code{treesit-forward-sexp} uses the @code{sexp} definition if
|
Command @code{treesit-forward-sexp} uses the @code{sexp} definition if
|
||||||
major mode defines it; @code{treesit-forward-list},
|
major mode defines it (@pxref{List Motion}); @code{treesit-forward-list},
|
||||||
@code{treesit-down-list}, @code{treesit-up-list},
|
@code{treesit-down-list}, @code{treesit-up-list},
|
||||||
@code{treesit-show-paren-data} use the @code{list} definition (its
|
@code{treesit-show-paren-data} use the @code{list} definition (its
|
||||||
symbol @code{list} has the symbol property @code{treesit-thing-symbol}
|
symbol @code{list} has the symbol property @code{treesit-thing-symbol}
|
||||||
|
@ -1699,8 +1716,8 @@ to avoid ambiguity with the function that has the same name);
|
||||||
Defun movement functions like @code{treesit-end-of-defun} use the
|
Defun movement functions like @code{treesit-end-of-defun} use the
|
||||||
@code{defun} definition (@code{defun} definition is overridden by
|
@code{defun} definition (@code{defun} definition is overridden by
|
||||||
@var{treesit-defun-type-regexp} for backward compatibility). Major
|
@var{treesit-defun-type-regexp} for backward compatibility). Major
|
||||||
modes can also define @code{comment}, @code{string}, @code{text}
|
modes can also define @code{comment}, @code{string}, and @code{text}
|
||||||
(generally comments and strings).
|
things (to match comments and strings).
|
||||||
|
|
||||||
The rest of this section lists a few functions that take advantage of
|
The rest of this section lists a few functions that take advantage of
|
||||||
the thing definitions. Besides the functions below, some other
|
the thing definitions. Besides the functions below, some other
|
||||||
|
@ -1709,10 +1726,10 @@ tree-traversing functions like @code{treesit-search-forward},
|
||||||
@code{treesit-induce-sparse-tree}, etc. @xref{Retrieving Nodes}.
|
@code{treesit-induce-sparse-tree}, etc. @xref{Retrieving Nodes}.
|
||||||
|
|
||||||
@defun treesit-node-match-p node thing &optional ignore-missing
|
@defun treesit-node-match-p node thing &optional ignore-missing
|
||||||
This function checks whether @var{node} is a @var{thing}.
|
This function checks whether @var{node} represents a @var{thing}.
|
||||||
|
|
||||||
If @var{node} is a @var{thing}, return non-@code{nil}, otherwise return
|
If @var{node} represents @var{thing}, return non-@code{nil}, otherwise
|
||||||
@code{nil}. For convenience, if @code{node} is @code{nil}, this
|
return @code{nil}. For convenience, if @code{node} is @code{nil}, this
|
||||||
function just returns @code{nil}.
|
function just returns @code{nil}.
|
||||||
|
|
||||||
The @var{thing} can be either a thing symbol like @code{defun}, or
|
The @var{thing} can be either a thing symbol like @code{defun}, or
|
||||||
|
@ -1727,8 +1744,9 @@ undefined and just returns @code{nil}; but it still signals the error if
|
||||||
@end defun
|
@end defun
|
||||||
|
|
||||||
@defun treesit-thing-prev position thing
|
@defun treesit-thing-prev position thing
|
||||||
This function returns the first node before @var{position} that is the
|
This function returns the first node before @var{position} in the
|
||||||
specified @var{thing}. If no such node exists, it returns @code{nil}.
|
current buffer that is the specified @var{thing}. If no such node
|
||||||
|
exists, it returns @code{nil}.
|
||||||
It's guaranteed that, if a node is returned, the node's end position is
|
It's guaranteed that, if a node is returned, the node's end position is
|
||||||
less or equal to @var{position}. In other words, this function never
|
less or equal to @var{position}. In other words, this function never
|
||||||
returns a node that encloses @var{position}.
|
returns a node that encloses @var{position}.
|
||||||
|
@ -1753,8 +1771,9 @@ function doesn't move point.
|
||||||
|
|
||||||
A positive @var{arg} means moving forward that many instances of
|
A positive @var{arg} means moving forward that many instances of
|
||||||
@var{thing}; negative @var{arg} means moving backward. If @var{side} is
|
@var{thing}; negative @var{arg} means moving backward. If @var{side} is
|
||||||
@code{beg}, this function stops at the beginning of @var{thing}; if
|
@code{beg}, this function returns the position of the beginning of
|
||||||
@code{end}, stop at the end of @var{thing}.
|
@var{thing}; if it's @code{end}, it returns the position at the end of
|
||||||
|
@var{thing}.
|
||||||
|
|
||||||
Like in @code{treesit-thing-prev}, @var{thing} can be a thing symbol
|
Like in @code{treesit-thing-prev}, @var{thing} can be a thing symbol
|
||||||
defined in @code{treesit-thing-settings}, or a predicate.
|
defined in @code{treesit-thing-settings}, or a predicate.
|
||||||
|
@ -1780,8 +1799,8 @@ less or equal to @var{position}, and it's end position is greater or equal to
|
||||||
@var{position}.
|
@var{position}.
|
||||||
|
|
||||||
If @var{strict} is non-@code{nil}, this function uses strict comparison,
|
If @var{strict} is non-@code{nil}, this function uses strict comparison,
|
||||||
i.e., start position must be strictly greater than @var{position}, and end
|
i.e., start position must be strictly smaller than @var{position}, and end
|
||||||
position must be strictly less than @var{position}.
|
position must be strictly greater than @var{position}.
|
||||||
|
|
||||||
@var{thing} can be either a thing symbol defined in
|
@var{thing} can be either a thing symbol defined in
|
||||||
@code{treesit-thing-settings}, or a predicate.
|
@code{treesit-thing-settings}, or a predicate.
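
As a rough illustration of the documentation changed above, the following sketch shows how a major mode might define things and then query them. It is not taken from the commit. The function name `my-c-things-setup' is hypothetical, the node type names are assumptions about the C grammar, and the queries assume an Emacs build with tree-sitter support, an installed C grammar, and a C parser in the current buffer.

```elisp
;; Hypothetical sketch; this is not how c-ts-mode defines its things.
(defun my-c-things-setup ()
  "Install illustrative thing definitions for C in the current buffer."
  (setq-local treesit-thing-settings
              '((c
                 ;; Each PRED here is a regexp matched against the node type.
                 (defun "function_definition")
                 (comment "comment")
                 (string "string_literal")
                 ;; This PRED refers to the two things defined just above.
                 (text (or comment string))))))

;; After calling `my-c-things-setup' in a buffer that has a C parser:
;;
;; Smallest node enclosing point that is a `text' thing, or nil:
;;   (treesit-thing-at (point) 'text)
;;
;; Whether the leaf node at point counts as a `defun' thing:
;;   (treesit-node-match-p (treesit-node-at (point)) 'defun)
```

The queries are left as comments because their results depend on the buffer contents and on the grammar actually installed.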
|
||||||
|
|
|
@ -3237,11 +3237,14 @@ The type can be `list' (the default) or `sexp'.
|
||||||
|
|
||||||
The `list' type uses the `list' thing defined in `treesit-thing-settings'.
|
The `list' type uses the `list' thing defined in `treesit-thing-settings'.
|
||||||
See `treesit-thing-at-point'. With this type commands use syntax tables to
|
See `treesit-thing-at-point'. With this type commands use syntax tables to
|
||||||
navigate symbols and treesit definition to navigate lists.
|
navigate symbols and treesit definitions to navigate lists.
|
||||||
|
|
||||||
The `sexp' type uses the `sexp' thing defined in `treesit-thing-settings'.
|
The `sexp' type uses the `sexp' thing defined in `treesit-thing-settings'.
|
||||||
With this type commands use only the treesit definition of parser nodes,
|
With this type commands use only the treesit definitions of parser nodes,
|
||||||
without distinction between symbols and lists."
|
without distinction between symbols and lists. Since tree-sitter grammars
|
||||||
|
could group node types in arbitrary ways, navigation by `sexp' might not
|
||||||
|
match your expectations, and might produce different results in different
|
||||||
|
treesit-based modes."
|
||||||
(interactive "p")
|
(interactive "p")
|
||||||
(if (not (treesit-thing-defined-p 'list (treesit-language-at (point))))
|
(if (not (treesit-thing-defined-p 'list (treesit-language-at (point))))
|
||||||
(user-error "No `list' thing is defined in `treesit-thing-settings'")
|
(user-error "No `list' thing is defined in `treesit-thing-settings'")
|
||||||
|
@ -3630,14 +3633,15 @@ predicate as described in `treesit-thing-settings'."
|
||||||
(treesit--thing-sibling pos thing nil))
|
(treesit--thing-sibling pos thing nil))
|
||||||
|
|
||||||
(defun treesit-thing-at (pos thing &optional strict)
|
(defun treesit-thing-at (pos thing &optional strict)
|
||||||
"Return the smallest THING enclosing POS.
|
"Return the smallest node enclosing POS for THING.
|
||||||
|
|
||||||
The returned node, if non-nil, must enclose POS, i.e., its start
|
The returned node, if non-nil, must enclose POS, i.e., its
|
||||||
<= POS, its end > POS. If STRICT is non-nil, the returned node's
|
start <= POS, its end > POS. If STRICT is non-nil, the returned
|
||||||
start must < POS rather than <= POS.
|
node's start must be < POS rather than <= POS.
|
||||||
|
|
||||||
THING should be a thing defined in `treesit-thing-settings', or
|
THING should be a thing defined in `treesit-thing-settings' for
|
||||||
it can be a predicate described in `treesit-thing-settings'."
|
the current buffer's major mode, or it can be a predicate
|
||||||
|
described in `treesit-thing-settings'."
|
||||||
(let* ((cursor (treesit-node-at pos))
|
(let* ((cursor (treesit-node-at pos))
|
||||||
(iter-pred (lambda (node)
|
(iter-pred (lambda (node)
|
||||||
(and (treesit-node-match-p node thing t)
|
(and (treesit-node-match-p node thing t)
|
||||||
|
@ -3789,13 +3793,14 @@ function is called recursively."
|
||||||
(if (eq counter 0) pos nil)))
|
(if (eq counter 0) pos nil)))
|
||||||
|
|
||||||
(defun treesit-thing-at-point (thing tactic)
|
(defun treesit-thing-at-point (thing tactic)
|
||||||
"Return the THING at point, or nil if none is found.
|
"Return the node for THING at point, or nil if no THING is found at point.
|
||||||
|
|
||||||
THING can be a symbol, a regexp, a predicate function, and more;
|
THING can be a symbol, a regexp, a predicate function, and more;
|
||||||
see `treesit-thing-settings' for details.
|
for details, see `treesit-thing-settings' as defined by the
|
||||||
|
current buffer's major mode.
|
||||||
|
|
||||||
Return the top-level THING if TACTIC is `top-level'; return the
|
Return the top-level node for THING if TACTIC is `top-level'; return
|
||||||
smallest enclosing THING as POS if TACTIC is `nested'."
|
the smallest node enclosing THING at point if TACTIC is `nested'."
|
||||||
|
|
||||||
(let ((node (treesit-thing-at (point) thing)))
|
(let ((node (treesit-thing-at (point) thing)))
|
||||||
(if (eq tactic 'top-level)
|
(if (eq tactic 'top-level)
|
||||||
|
|
|
@ -5193,13 +5193,16 @@ then in the system default locations for dynamic libraries, in that order. */);
|
||||||
doc:
|
doc:
|
||||||
/* A list defining things.
|
/* A list defining things.
|
||||||
|
|
||||||
The value should be an alist of (LANGUAGE . DEFINITIONS), where
|
The value should be defined by the major mode, and should be an alist
|
||||||
LANGUAGE is a language symbol, and DEFINITIONS is a list of
|
of the form (LANGUAGE . DEFINITIONS), where LANGUAGE is a language
|
||||||
|
symbol and DEFINITIONS is a list whose elements are of the form
|
||||||
|
|
||||||
(THING PRED)
|
(THING PRED)
|
||||||
|
|
||||||
THING is a symbol representing the thing, like `defun', `sexp', or
|
THING is a symbol representing the thing, like `defun', `defclass',
|
||||||
`sentence'; PRED defines what kind of node can be qualified as THING.
|
`sexp', `sentence', `comment', or any other symbol that is meaningful
|
||||||
|
for the major mode; PRED defines what kind of node can be qualified
|
||||||
|
as THING.
|
||||||
|
|
||||||
PRED can be a regexp string that matches the type of the node; it can
|
PRED can be a regexp string that matches the type of the node; it can
|
||||||
be a predicate function that takes the node as the sole argument and
|
be a predicate function that takes the node as the sole argument and
|
||||||
|
@ -5207,12 +5210,13 @@ returns t if the node is the thing, and nil otherwise; it can be a
|
||||||
cons (REGEXP . FN), which is a combination of a regexp and a predicate
|
cons (REGEXP . FN), which is a combination of a regexp and a predicate
|
||||||
function, and the node has to match both to qualify as the thing.
|
function, and the node has to match both to qualify as the thing.
|
||||||
|
|
||||||
PRED can also be recursively defined. It can be (or PRED...), meaning
|
PRED can also be recursively defined. It can be:
|
||||||
satisfying anyone of the inner PREDs qualifies the node; or (and
|
|
||||||
PRED...) meaning satisfying all of the inner PREDs qualifies the node;
|
|
||||||
or (not PRED), meaning not satisfying the inner PRED qualifies the node.
|
|
||||||
|
|
||||||
There are two pre-defined predicates, `named' and `anonymous`. They
|
(or PRED...), meaning satisfying any of the inner PREDs qualifies the node;
|
||||||
|
(and PRED...) meaning satisfying all of the inner PREDs qualifies the node;
|
||||||
|
(not PRED), meaning not satisfying the inner PRED qualifies the node.
|
||||||
|
|
||||||
|
There are two pre-defined predicates, `named' and `anonymous'. They
|
||||||
match named nodes and anonymous nodes, respectively.
|
match named nodes and anonymous nodes, respectively.
|
||||||
|
|
||||||
Finally, PRED can refer to other THINGs defined in this list by using
|
Finally, PRED can refer to other THINGs defined in this list by using
|
||||||
|
|