Improve tree-sitter docs

* doc/lispref/positions.texi (List Motion): Incorporate more
accurate description of treesit-defun-type-regexp from
'(elisp) Tree-sitter Major Modes', replacing that duplicate
entry (bug#64018).

* doc/lispref/parsing.texi (Parsing Program Source)
(Language Grammar, Using Parser, Retrieving Nodes)
(Accessing Node Information, Pattern Matching, Multiple Languages):
(Tree-sitter Major Modes):
* doc/lispref/modes.texi (Parser-based Font Lock): Improve wording,
grammar, punctuation, and markup.  Fix typos.
(Parser-based Indentation): Ditto.  Document indent rule presets
field-is, catch-all, nth-sibling, grand-parent, and
great-grand-parent.

* lisp/treesit.el (treesit-simple-indent-presets): Mention field-is,
catch-all, nth-sibling, grand-parent, great-grand-parent in
docstring.
(treesit-major-mode-setup, treesit-explore-mode): Improve
docstring/commentary grammar.
Basil L. Contovounesios 2023-06-11 15:19:28 +01:00
parent 0e9307eb2b
commit 2847857496
4 changed files with 263 additions and 219 deletions

@ -4069,7 +4069,7 @@ Source}) for this purpose.
Parser-based font lock and other font lock mechanisms are not mutually Parser-based font lock and other font lock mechanisms are not mutually
exclusive. By default, if enabled, parser-based font lock runs first, exclusive. By default, if enabled, parser-based font lock runs first,
replacing syntactic font lock, then the regexp-based font lock. replacing syntactic font lock, followed by regexp-based font lock.
Although parser-based font lock doesn't share the same customization Although parser-based font lock doesn't share the same customization
variables with regexp-based font lock, it uses similar customization variables with regexp-based font lock, it uses similar customization
@ -4102,7 +4102,7 @@ would be highlighted in @code{font-lock-keyword} face.
For more information about queries, patterns, and capture names, see For more information about queries, patterns, and capture names, see
@ref{Pattern Matching}. @ref{Pattern Matching}.
To setup tree-sitter fontification, a major mode should first set To set up tree-sitter fontification, a major mode should first set
@code{treesit-font-lock-settings} with the output of @code{treesit-font-lock-settings} with the output of
@code{treesit-font-lock-rules}, then call @code{treesit-font-lock-rules}, then call
@code{treesit-major-mode-setup}. @code{treesit-major-mode-setup}.
@ -4129,15 +4129,15 @@ example:
This function takes a series of @var{query-spec}s, where each This function takes a series of @var{query-spec}s, where each
@var{query-spec} is a @var{query} preceded by one or more @var{query-spec} is a @var{query} preceded by one or more
@var{:keyword}/@var{value} pairs. Each @var{query} is a @var{keyword}/@var{value} pairs. Each @var{query} is a tree-sitter
tree-sitter query in either the string, s-expression or compiled form. query in either the string, s-expression, or compiled form.
@c FIXME: Cross-ref treesit-font-lock-level to user manual. @c FIXME: Cross-ref treesit-font-lock-level to user manual.
For each @var{query}, the @var{:keyword}/@var{value} pairs that For each @var{query}, the @var{keyword}/@var{value} pairs that precede
precede it add meta information to it. The @code{:language} keyword it add meta information to it. The @code{:language} keyword declares
declares @var{query}'s language. The @code{:feature} keyword sets the @var{query}'s language. The @code{:feature} keyword sets the feature
feature name of @var{query}. Users can control which features are name of @var{query}. Users can control which features are enabled
enabled with @code{treesit-font-lock-level} and with @code{treesit-font-lock-level} and
@code{treesit-font-lock-feature-list} (described below). These two @code{treesit-font-lock-feature-list} (described below). These two
keywords are mandatory. keywords are mandatory.
@ -4161,11 +4161,11 @@ fontification, capture names in @var{query} should be face names like
with that face. with that face.
@findex treesit-fontify-with-override @findex treesit-fontify-with-override
Capture names can also be function names, in which case the function A capture name can also be a function name, in which case the function
is called with 4 arguments: @var{node} and @var{override}, @var{start} is called with 4 arguments: @var{node} and @var{override}, @var{start}
and @var{end}, where @var{node} is the node itself, @var{override} is and @var{end}, where @var{node} is the node itself, @var{override} is
the override property of the rule which captured this node, and the @code{:override} property of the rule which captured this node,
@var{start} and @var{end} limits the region in which this function and @var{start} and @var{end} limit the region which this function
should fontify. (If this function wants to respect the @var{override} should fontify. (If this function wants to respect the @var{override}
argument, it can use @code{treesit-fontify-with-override}.) argument, it can use @code{treesit-fontify-with-override}.)
@ -4201,9 +4201,9 @@ Some of these features warrant some explanation: @code{definition}
highlights whatever is being defined, e.g., the function name in a highlights whatever is being defined, e.g., the function name in a
function definition, the struct name in a struct definition, the function definition, the struct name in a struct definition, the
variable name in a variable definition; @code{assignment} highlights variable name in a variable definition; @code{assignment} highlights
the whatever is being assigned to, e.g., the variable or field in an whatever is being assigned to, e.g., the variable or field in an
assignment statement; @code{key} highlights keys in key-value pairs, assignment statement; @code{key} highlights keys in key-value pairs,
e.g., keys in a JSON object, or a Python dictionary; @code{doc} e.g., keys in a JSON object or Python dictionary; @code{doc}
highlights docstrings or doc-comments. highlights docstrings or doc-comments.
For example, the value of this variable could be: For example, the value of this variable could be:
@ -4977,7 +4977,7 @@ source indentation commands. For maximum flexibility, it is possible
to write a custom indentation function that queries the syntax tree to write a custom indentation function that queries the syntax tree
and indents accordingly for each language, but that is a lot of work. and indents accordingly for each language, but that is a lot of work.
It is more convenient to use the simple indentation engine described It is more convenient to use the simple indentation engine described
below: then the major mode needs only to write some indentation rules below: then the major mode needs only write some indentation rules,
and the engine takes care of the rest. and the engine takes care of the rest.
To enable the parser-based indentation engine, either set To enable the parser-based indentation engine, either set
@ -4996,10 +4996,11 @@ more complex indentation engines.
@cindex indentation rules, for parser-based indentation @cindex indentation rules, for parser-based indentation
@defvar treesit-simple-indent-rules @defvar treesit-simple-indent-rules
This local variable stores indentation rules for every language. It is This local variable stores indentation rules for every language. It
a list of the form: @w{@code{(@var{language} . @var{rules})}}, where is an alist with elements of the form @w{@code{(@var{language}
@var{language} is a language symbol, and @var{rules} is a list of the . @var{rules})}}, where @var{language} is a language symbol, and
form @w{@code{(@var{matcher} @var{anchor} @var{offset})}}. @var{rules} is a list with elements of the form
@w{@code{(@var{matcher} @var{anchor} @var{offset})}}.
First, Emacs passes the smallest tree-sitter node at the beginning of First, Emacs passes the smallest tree-sitter node at the beginning of
the current line to @var{matcher}; if it returns non-@code{nil}, this the current line to @var{matcher}; if it returns non-@code{nil}, this
@ -5033,14 +5034,14 @@ anchors.
@defvar treesit-simple-indent-presets @defvar treesit-simple-indent-presets
This is a list of defaults for @var{matcher}s and @var{anchor}s in This is a list of defaults for @var{matcher}s and @var{anchor}s in
@code{treesit-simple-indent-rules}. Each of them represents a function @code{treesit-simple-indent-rules}. Each of them represents a
that takes 3 arguments: @var{node}, @var{parent} and @var{bol}. The function that takes 3 arguments: @var{node}, @var{parent}, and
available default functions are: @var{bol}. The available default functions are:
@ftable @code @ftable @code
@item no-node @item no-node
This matcher is a function that is called with 3 arguments: This matcher is a function that is called with 3 arguments:
@var{node}, @var{parent}, and @var{bol}, and returns non-@code{nil}, @var{node}, @var{parent}, and @var{bol}. It returns non-@code{nil},
indicating a match, if @var{node} is @code{nil}, i.e., there is no indicating a match, if @var{node} is @code{nil}, i.e., there is no
node that starts at @var{bol}. This is the case when @var{bol} is on node that starts at @var{bol}. This is the case when @var{bol} is on
an empty line or inside a multi-line string, etc. an empty line or inside a multi-line string, etc.
@ -5057,6 +5058,12 @@ function that is called with 3 arguments: @var{node}, @var{parent},
and @var{bol}, and returns non-@code{nil} if @var{node}'s type matches and @var{bol}, and returns non-@code{nil} if @var{node}'s type matches
regexp @var{type}. regexp @var{type}.
@item field-is
This matcher is a function of one argument, @var{name}; it returns a
function that is called with 3 arguments: @var{node}, @var{parent},
and @var{bol}, and returns non-@code{nil} if @var{node}'s field name
in @var{parent} matches regexp @var{name}.
@item query @item query
This matcher is a function of one argument, @var{query}; it returns a This matcher is a function of one argument, @var{query}; it returns a
function that is called with 3 arguments: @var{node}, @var{parent}, function that is called with 3 arguments: @var{node}, @var{parent},
@ -5097,30 +5104,53 @@ of @var{node-type}, @var{parent-type}, and @var{grandparent-type} is
@item comment-end @item comment-end
This matcher is a function that is called with 3 arguments: This matcher is a function that is called with 3 arguments:
@var{node}, @var{parent}, and @var{bol}, and returns non-@code{nil} if @var{node}, @var{parent}, and @var{bol}, and returns non-@code{nil} if
point is before a comment ending token. Comment ending tokens are point is before a comment-ending token. Comment-ending tokens are
defined by regular expression @code{comment-end-skip} defined by regexp @code{comment-end-skip}.
@item catch-all
This matcher is a function that is called with 3 arguments:
@var{node}, @var{parent}, and @var{bol}. It always returns
non-@code{nil}, indicating a match.
@item first-sibling @item first-sibling
This anchor is a function that is called with 3 arguments: @var{node}, This anchor is a function that is called with 3 arguments: @var{node},
@var{parent}, and @var{bol}, and returns the start of the first child @var{parent}, and @var{bol}, and returns the start of the first child
of @var{parent}. of @var{parent}.
@item nth-sibling
This anchor is a function of two arguments: @var{n}, and an optional
argument @var{named}. It returns a function that is called with 3
arguments: @var{node}, @var{parent}, and @var{bol}, and returns the
start of the @var{n}th child of @var{parent}. If @var{named} is
non-@code{nil}, only named children are counted (@pxref{tree-sitter
named node, named node}).
@item parent @item parent
This anchor is a function that is called with 3 arguments: @var{node}, This anchor is a function that is called with 3 arguments: @var{node},
@var{parent}, and @var{bol}, and returns the start of @var{parent}. @var{parent}, and @var{bol}, and returns the start of @var{parent}.
@item grand-parent
This anchor is a function that is called with 3 arguments: @var{node},
@var{parent}, and @var{bol}, and returns the start of @var{parent}'s
parent.
@item great-grand-parent
This anchor is a function that is called with 3 arguments: @var{node},
@var{parent}, and @var{bol}, and returns the start of @var{parent}'s
parent's parent.
@item parent-bol @item parent-bol
This anchor is a function that is called with 3 arguments: @var{node}, This anchor is a function that is called with 3 arguments: @var{node},
@var{parent}, and @var{bol}, and returns the first non-space character @var{parent}, and @var{bol}, and returns the first non-space character
on the line which @var{parent}'s start is on. on the line which @var{parent}'s start is on.
@item parent-bol @item standalone-parent
This anchor is a function that is called with 3 arguments: @var{node}, This anchor is a function that is called with 3 arguments: @var{node},
@var{parent}, and @var{bol}. It finds the first ancestor node @var{parent}, and @var{bol}. It finds the first ancestor node
(parent, grandparent, etc) of @var{node} that starts on its own line, (parent, grandparent, etc.@:) of @var{node} that starts on its own
and return the start of that node. ``Starting on its own line'' means line, and return the start of that node. ``Starting on its own line''
there is only whitespace character before the node on the line which means there is only whitespace character before the node on the line
the node's start is on. which the node's start is on.
@item prev-sibling @item prev-sibling
This anchor is a function that is called with 3 arguments: @var{node}, This anchor is a function that is called with 3 arguments: @var{node},
@ -5150,14 +5180,14 @@ expression @code{comment-start-skip}. This function assumes
@item prev-adaptive-prefix @item prev-adaptive-prefix
This anchor is a function that is called with 3 arguments: @var{node}, This anchor is a function that is called with 3 arguments: @var{node},
@var{parent}, and @var{bol}. It tries to go to the beginning of the @var{parent}, and @var{bol}. It tries to match
previous non-empty line, and matches @code{adaptive-fill-regexp}. If @code{adaptive-fill-regexp} to the text at the beginning of the
there is a match, this function returns the end of the match, previous non-empty line. If there is a match, this function returns
otherwise it returns @code{nil}. However, if the current line begins the end of the match, otherwise it returns @code{nil}. However, if
with a prefix (e.g., ``-''), return the beginning of the prefix of the the current line begins with a prefix (e.g., @samp{-}), return the
previous line instead, so that the two prefixes align. This anchor is beginning of the prefix of the previous line instead, so that the two
useful for an @code{indent-relative}-like indent behavior for block prefixes align. This anchor is useful for an
comments. @code{indent-relative}-like indent behavior for block comments.
@end ftable @end ftable
@end defvar @end defvar
@ -5168,14 +5198,14 @@ comments.
Here are some utility functions that can help writing parser-based Here are some utility functions that can help writing parser-based
indentation rules. indentation rules.
@defun treesit-check-indent mode @deffn Command treesit-check-indent mode
This function checks the current buffer's indentation against major This command checks the current buffer's indentation against major
mode @var{mode}. It indents the current buffer according to mode @var{mode}. It indents the current buffer according to
@var{mode} and compares the results with the current indentation. @var{mode} and compares the results with the current indentation.
Then it pops up a buffer showing the differences. Correct Then it pops up a buffer showing the differences. Correct
indentation (target) is shown in green color, current indentation is indentation (target) is shown in green color, current indentation is
shown in red color. @c Are colors customizable? faces? shown in red color. @c Are colors customizable? faces?
@end defun @end deffn
It is also helpful to use @code{treesit-inspect-mode} (@pxref{Language It is also helpful to use @code{treesit-inspect-mode} (@pxref{Language
Grammar}) when writing indentation rules. Grammar}) when writing indentation rules.

@ -9,7 +9,7 @@
Emacs provides various ways to parse program source text and produce a Emacs provides various ways to parse program source text and produce a
@dfn{syntax tree}. In a syntax tree, text is no longer considered a @dfn{syntax tree}. In a syntax tree, text is no longer considered a
one-dimensional stream of characters, but a structured tree of nodes, one-dimensional stream of characters, but a structured tree of nodes,
where each node representing a piece of text. Thus, a syntax tree can where each node represents a piece of text. Thus, a syntax tree can
enable interesting features like precise fontification, indentation, enable interesting features like precise fontification, indentation,
navigation, structured editing, etc. navigation, structured editing, etc.
@ -19,8 +19,8 @@ generic navigation and indentation (@pxref{SMIE}).
In addition to those, Emacs also provides integration with In addition to those, Emacs also provides integration with
@uref{https://tree-sitter.github.io/tree-sitter, the tree-sitter @uref{https://tree-sitter.github.io/tree-sitter, the tree-sitter
library}) if support for it was compiled in. The tree-sitter library library} if support for it was compiled in. The tree-sitter library
implements an incremental parser and has support from a wide range of implements an incremental parser and has support for a wide range of
programming languages. programming languages.
@defun treesit-available-p @defun treesit-available-p
@ -65,10 +65,10 @@ For example, the C language grammar is represented as the symbol
@vindex treesit-extra-load-path @vindex treesit-extra-load-path
@vindex treesit-load-language-error @vindex treesit-load-language-error
Tree-sitter language grammar are distributed as dynamic libraries. Tree-sitter language grammars are distributed as dynamic libraries.
In order to use a language grammar in Emacs, you need to make sure In order to use a language grammar in Emacs, you need to make sure
that the dynamic library is installed on the system. Emacs looks for that the dynamic library is installed on the system. Emacs looks for
language grammar in several places, in the following order: language grammars in several places, in the following order:
@itemize @bullet @itemize @bullet
@item @item
@ -95,8 +95,8 @@ This means that Emacs could not find the language grammar library.
This means that Emacs could not find in the library the expected function This means that Emacs could not find in the library the expected function
that every language grammar library should export. that every language grammar library should export.
@item (version-mismatch @var{error-msg}) @item (version-mismatch @var{error-msg})
This means that the version of language grammar library is incompatible This means that the version of the language grammar library is
with that of the tree-sitter library. incompatible with that of the tree-sitter library.
@end table @end table
@noindent @noindent
@ -105,7 +105,7 @@ details about the failure.
@defun treesit-language-available-p language &optional detail @defun treesit-language-available-p language &optional detail
This function returns non-@code{nil} if the language grammar for This function returns non-@code{nil} if the language grammar for
@var{language} exist and can be loaded. @var{language} exists and can be loaded.
If @var{detail} is non-@code{nil}, return @code{(t . nil)} when If @var{detail} is non-@code{nil}, return @code{(t . nil)} when
@var{language} is available, and @code{(nil . @var{data})} when it's @var{language} is available, and @code{(nil . @var{data})} when it's
@ -126,7 +126,7 @@ doesn't follow this convention, you should add an entry
@end example @end example
to the list in the variable @code{treesit-load-name-override-list}, where to the list in the variable @code{treesit-load-name-override-list}, where
@var{library-base-name} is the basename of the dynamic library's file name, @var{library-base-name} is the basename of the dynamic library's file name
(usually, @file{libtree-sitter-@var{language}}), and (usually, @file{libtree-sitter-@var{language}}), and
@var{function-name} is the function provided by the library @var{function-name} is the function provided by the library
(usually, @code{tree_sitter_@var{language}}). For example, (usually, @code{tree_sitter_@var{language}}). For example,
@ -146,7 +146,7 @@ Application Binary Interface (@acronym{ABI}) supported by the
tree-sitter library. By default, it returns the latest ABI version tree-sitter library. By default, it returns the latest ABI version
supported by the library, but if @var{min-compatible} is supported by the library, but if @var{min-compatible} is
non-@code{nil}, it returns the oldest ABI version which the library non-@code{nil}, it returns the oldest ABI version which the library
still can support. language grammar libraries must be built for still can support. Language grammar libraries must be built for
ABI versions between the oldest and the latest versions supported by ABI versions between the oldest and the latest versions supported by
the tree-sitter library, otherwise the library will be unable to load the tree-sitter library, otherwise the library will be unable to load
them. them.
@ -232,11 +232,11 @@ assign @dfn{field names} to child nodes. For example, a
@cindex explore tree-sitter syntax tree @cindex explore tree-sitter syntax tree
@cindex inspection of tree-sitter parse tree nodes @cindex inspection of tree-sitter parse tree nodes
To aid in understanding the syntax of a language and in debugging of To aid in understanding the syntax of a language and in debugging Lisp
Lisp program that use the syntax tree, Emacs provides an ``explore'' programs that use the syntax tree, Emacs provides an ``explore'' mode,
mode, which displays the syntax tree of the source in the current which displays the syntax tree of the source in the current buffer in
buffer in real time. Emacs also comes with an ``inspect mode'', which real time. Emacs also comes with an ``inspect mode'', which displays
displays information of the nodes at point in the mode-line. information of the nodes at point in the mode-line.
@deffn Command treesit-explore-mode @deffn Command treesit-explore-mode
This mode pops up a window displaying the syntax tree of the source in This mode pops up a window displaying the syntax tree of the source in
@ -271,7 +271,7 @@ parser in @code{(treesit-parser-list)} (@pxref{Using Parser}).
@heading Reading the grammar definition @heading Reading the grammar definition
@cindex reading grammar definition, tree-sitter @cindex reading grammar definition, tree-sitter
Authors of language grammar define the @dfn{grammar} of a Authors of language grammars define the @dfn{grammar} of a
programming language, which determines how a parser constructs a programming language, which determines how a parser constructs a
concrete syntax tree out of the program text. In order to use the concrete syntax tree out of the program text. In order to use the
syntax tree effectively, you need to consult the @dfn{grammar file}. syntax tree effectively, you need to consult the @dfn{grammar file}.
@ -283,7 +283,7 @@ home page can be found on
homepage}. homepage}.
The grammar definition is written in JavaScript. For example, the The grammar definition is written in JavaScript. For example, the
rule matching a @code{function_definition} node looks like rule matching a @code{function_definition} node may look like
@example @example
@group @group
@ -331,13 +331,13 @@ matches each rule one after another.
@item choice(@var{rule1}, @var{rule2}, @dots{}) @item choice(@var{rule1}, @var{rule2}, @dots{})
matches one of the rules in its arguments. matches one of the rules in its arguments.
@item repeat(@var{rule}) @item repeat(@var{rule})
matches @var{rule} for @emph{zero or more} times. matches @var{rule} @emph{zero or more} times.
This is like the @samp{*} operator in regular expressions. This is like the @samp{*} operator in regular expressions.
@item repeat1(@var{rule}) @item repeat1(@var{rule})
matches @var{rule} for @emph{one or more} times. matches @var{rule} @emph{one or more} times.
This is like the @samp{+} operator in regular expressions. This is like the @samp{+} operator in regular expressions.
@item optional(@var{rule}) @item optional(@var{rule})
matches @var{rule} for @emph{zero or one} time. matches @var{rule} @emph{zero or one} times.
This is like the @samp{?} operator in regular expressions. This is like the @samp{?} operator in regular expressions.
@item field(@var{name}, @var{rule}) @item field(@var{name}, @var{rule})
assigns field name @var{name} to the child node matched by @var{rule}. assigns field name @var{name} to the child node matched by @var{rule}.
@ -366,7 +366,7 @@ Nodes}.
@item token.immediate(@var{rule}) @item token.immediate(@var{rule})
Normally, grammar rules ignore preceding whitespace; this Normally, grammar rules ignore preceding whitespace; this
changes @var{rule} to match only when there is no preceding changes @var{rule} to match only when there is no preceding
whitespaces. whitespace.
@item prec(@var{n}, @var{rule}) @item prec(@var{n}, @var{rule})
gives @var{rule} the level-@var{n} precedence. gives @var{rule} the level-@var{n} precedence.
@item prec.left([@var{n},] @var{rule}) @item prec.left([@var{n},] @var{rule})
@ -412,7 +412,7 @@ non-@code{nil}, this function always creates a new parser.
If that buffer is an indirect buffer, its base buffer is used instead. If that buffer is an indirect buffer, its base buffer is used instead.
That is, indirect buffers use their base buffer's parsers. If the That is, indirect buffers use their base buffer's parsers. If the
base buffer is narrowed, an indirect buffer might not be able to base buffer is narrowed, an indirect buffer might not be able to
retrieve information of the portion of the buffer text that are retrieve information of the portion of the buffer text that is
invisible in the base buffer. Lisp programs should widen as necessary invisible in the base buffer. Lisp programs should widen as necessary
should they want to use a parser in an indirect buffer. should they want to use a parser in an indirect buffer.
@end defun @end defun
@ -441,7 +441,7 @@ change is made in the buffer, a parser doesn't re-parse immediately.
@vindex treesit-buffer-too-large @vindex treesit-buffer-too-large
When a parser does parse, it checks for the size of the buffer. When a parser does parse, it checks for the size of the buffer.
Tree-sitter can only handle buffer no larger than about 4GB. If the Tree-sitter can only handle buffers no larger than about 4GB@. If the
size exceeds that, Emacs signals the @code{treesit-buffer-too-large} size exceeds that, Emacs signals the @code{treesit-buffer-too-large}
error with signal data being the buffer size. error with signal data being the buffer size.
@ -500,13 +500,12 @@ converts text before that token into a comment. Even
though the text is not directly edited, it is deemed to be ``changed'' though the text is not directly edited, it is deemed to be ``changed''
nevertheless. nevertheless.
Emacs lets a Lisp program to register callback functions Emacs lets a Lisp program register callback functions (a.k.a.@:
(a.k.a.@: @dfn{notifiers}) for this kind of changes. A notifier @dfn{notifiers}) for these kinds of changes. A notifier function
function takes two arguments: @var{ranges} and @var{parser}. takes two arguments: @var{ranges} and @var{parser}. @var{ranges} is a
@var{ranges} is a list of cons cells of the form @w{@code{(@var{start} list of cons cells of the form @w{@code{(@var{start} . @var{end})}},
. @var{end})}}, where @var{start} and @var{end} mark the start and the where @var{start} and @var{end} mark the start and the end positions
end positions of a range. @var{parser} is the parser issuing the of a range. @var{parser} is the parser issuing the notification.
notification.
Every time a parser reparses a buffer, it compares the old and new Every time a parser reparses a buffer, it compares the old and new
parse-tree, computes the ranges in which nodes have changed, and parse-tree, computes the ranges in which nodes have changed, and
@ -537,7 +536,7 @@ This function returns the list of @var{parser}'s notifier functions.
@cindex get node, tree-sitter @cindex get node, tree-sitter
@cindex terminology, for tree-sitter functions @cindex terminology, for tree-sitter functions
Here's some terminology and conventions we use when documenting Here are some terms and conventions we use when documenting
tree-sitter functions. tree-sitter functions.
A node in a syntax tree spans some portion of the program text in the A node in a syntax tree spans some portion of the program text in the
@ -571,8 +570,8 @@ This function returns a @dfn{leaf} node at buffer position @var{pos}.
A leaf node is a node that doesn't have any child nodes. A leaf node is a node that doesn't have any child nodes.
This function tries to return a node whose span covers @var{pos}: the This function tries to return a node whose span covers @var{pos}: the
node's beginning position is less or equal to @var{pos}, and the node's beginning position is less than or equal to @var{pos}, and the
node's end position is greater or equal to @var{pos}. node's end position is greater than or equal to @var{pos}.
If no leaf node's span covers @var{pos} (e.g., @var{pos} is in the If no leaf node's span covers @var{pos} (e.g., @var{pos} is in the
whitespace between two leaf nodes), this function returns the first whitespace between two leaf nodes), this function returns the first
@ -612,7 +611,7 @@ start of the node is before or at @var{beg}, and the end of the node
is at or after @var{end}. is at or after @var{end}.
@emph{Beware:} calling this function on an empty line that is not @emph{Beware:} calling this function on an empty line that is not
inside any top-level construct (function definition, etc.) most inside any top-level construct (function definition, etc.@:) most
probably will give you the root node, because the root node is the probably will give you the root node, because the root node is the
smallest node that covers that empty line. Most of the time, you want smallest node that covers that empty line. Most of the time, you want
to use @code{treesit-node-at} instead. to use @code{treesit-node-at} instead.
@ -672,7 +671,7 @@ first child is the opening quote @code{"}, and the first named child
is the string text. is the string text.
This function returns @code{nil} if there is no @var{n}'th child. This function returns @code{nil} if there is no @var{n}'th child.
@var{n} could be negative, e.g., @code{-1} represents the last child. @var{n} could be negative, e.g., @minus{}1 represents the last child.
@end defun @end defun
@defun treesit-node-children node &optional named @defun treesit-node-children node &optional named
@ -694,7 +693,7 @@ This function finds the previous sibling of @var{node}. If
@cindex nodes, by field name @cindex nodes, by field name
@cindex syntax tree nodes, by field name @cindex syntax tree nodes, by field name
To make the syntax tree easier to analyze, many language grammar To make the syntax tree easier to analyze, many language grammars
assign @dfn{field names} to child nodes (@pxref{tree-sitter node field assign @dfn{field names} to child nodes (@pxref{tree-sitter node field
name, field name}). For example, a @code{function_definition} node name, field name}). For example, a @code{function_definition} node
could have a @code{declarator} node and a @code{body} node. could have a @code{declarator} node and a @code{body} node.
@ -729,7 +728,7 @@ first named child (@pxref{tree-sitter named node, named node}).
This function finds the @emph{smallest} descendant node of @var{node} This function finds the @emph{smallest} descendant node of @var{node}
that spans the region of text between positions @var{beg} and that spans the region of text between positions @var{beg} and
@var{end}. It is similar to @code{treesit-node-at}. If @var{named} @var{end}. It is similar to @code{treesit-node-at}. If @var{named}
is non-@code{nil}, it looks for smallest named child. is non-@code{nil}, it looks for the smallest named child.
@end defun @end defun
@heading Searching for node @heading Searching for node
@ -755,8 +754,8 @@ defaults to 1000.
Like @code{treesit-search-subtree}, this function also traverses the Like @code{treesit-search-subtree}, this function also traverses the
parse tree and matches each node with @var{predicate} (except for parse tree and matches each node with @var{predicate} (except for
@var{start}), where @var{predicate} can be a regexp or a function. @var{start}), where @var{predicate} can be a regexp or a function.
For a tree like the below where @var{start} is marked S, this function For a tree like the one below where @var{start} is marked @samp{S},
traverses as numbered from 1 to 12: this function traverses as numbered from 1 to 12:
@example @example
@group @group
@ -773,7 +772,7 @@ o o +-+-+ +--+--+
@end example @end example
Note that this function doesn't traverse the subtree of @var{start}, Note that this function doesn't traverse the subtree of @var{start},
and it always traverse leaf nodes first, then upwards. and it always traverses leaf nodes first, before moving upwards.
Like @code{treesit-search-subtree}, this function only searches for Like @code{treesit-search-subtree}, this function only searches for
named nodes by default, but if @var{all} is non-@code{nil}, it named nodes by default, but if @var{all} is non-@code{nil}, it
@ -786,10 +785,10 @@ that comes after it in the buffer position order, i.e., nodes with
start positions greater than the end position of @var{start}. start positions greater than the end position of @var{start}.
In the tree shown above, @code{treesit-search-subtree} traverses node In the tree shown above, @code{treesit-search-subtree} traverses node
S (@var{start}) and nodes marked with @code{o}, where this function @samp{S} (@var{start}) and nodes marked with @code{o}, where this
traverses the nodes marked with numbers. This function is useful for function traverses the nodes marked with numbers. This function is
answering questions like ``what is the first node after @var{start} in useful for answering questions like ``what is the first node after
the buffer that satisfies some condition?'' @var{start} in the buffer that satisfies some condition?''
@end defun @end defun
@defun treesit-search-forward-goto node predicate &optional start backward all @defun treesit-search-forward-goto node predicate &optional start backward all
@ -801,7 +800,7 @@ This function guarantees that the matched node it returns makes
progress in terms of buffer position: the start/end position of the progress in terms of buffer position: the start/end position of the
returned node is always greater than that of @var{node}. returned node is always greater than that of @var{node}.
Arguments @var{predicate}, @var{backward} and @var{all} are the same Arguments @var{predicate}, @var{backward}, and @var{all} are the same
as in @code{treesit-search-forward}. as in @code{treesit-search-forward}.
@end defun @end defun
@ -811,12 +810,12 @@ This function creates a sparse tree from @var{root}'s subtree.
It takes the subtree under @var{root}, and combs it so only the nodes It takes the subtree under @var{root}, and combs it so only the nodes
that match @var{predicate} are left. Like previous functions, the that match @var{predicate} are left. Like previous functions, the
@var{predicate} can be a regexp string that matches against each @var{predicate} can be a regexp string that matches against each
node's type, or a function that takes a node and return non-@code{nil} node's type, or a function that takes a node and returns
if it matches. non-@code{nil} if it matches.
For example, for a subtree on the left that consist of both numbers For example, given the subtree on the left that consists of both
and letters, if @var{predicate} is ``letter only'', the returned tree numbers and letters, if @var{predicate} is ``letter only'', the
is the one on the right. returned tree is the one on the right.
@example @example
@group @group
@ -836,9 +835,9 @@ b 1 2 b | | b c d
If @var{process-fn} is non-@code{nil}, instead of returning the If @var{process-fn} is non-@code{nil}, instead of returning the
matched nodes, this function passes each node to @var{process-fn} and matched nodes, this function passes each node to @var{process-fn} and
uses the returned value instead. If non-@code{nil}, @var{depth} is uses the returned value instead. If non-@code{nil}, @var{depth}
the number of levels to go down from @var{root}. If @var{depth} is limits the number of levels to go down from @var{root}. If
@code{nil}, it defaults to 1000. @var{depth} is @code{nil}, it defaults to 1000.
Each node in the returned tree looks like Each node in the returned tree looks like
@w{@code{(@var{tree-sitter-node} . (@var{child} @dots{}))}}. The @w{@code{(@var{tree-sitter-node} . (@var{child} @dots{}))}}. The
@ -853,17 +852,17 @@ Each node in the returned tree looks like
This function finds immediate children of @var{node} that satisfy This function finds immediate children of @var{node} that satisfy
@var{predicate}. @var{predicate}.
The @var{predicate} function takes a node as the argument and should The @var{predicate} function takes a node as argument and should
return non-@code{nil} to indicate that the node should be kept. If return non-@code{nil} to indicate that the node should be kept. If
@var{named} is non-@code{nil}, this function only examines the named @var{named} is non-@code{nil}, this function only examines named
nodes. nodes.
@end defun @end defun
@defun treesit-parent-until node predicate &optional include-node @defun treesit-parent-until node predicate &optional include-node
This function repeatedly finds the parents of @var{node}, and returns This function repeatedly finds the parents of @var{node}, and returns
the parent that satisfies @var{pred}, a function that takes a node as the parent that satisfies @var{pred}, a function that takes a node as
the argument and returns a boolean that indicates a match. If no argument and returns a boolean that indicates a match. If no parent
parent satisfies @var{pred}, this function returns @code{nil}. satisfies @var{pred}, this function returns @code{nil}.
Normally this function only looks at the parents of @var{node} but not Normally this function only looks at the parents of @var{node} but not
@var{node} itself. But if @var{include-node} is non-@code{nil}, this @var{node} itself. But if @var{include-node} is non-@code{nil}, this
@ -873,10 +872,10 @@ function returns @var{node} if @var{node} satisfies @var{pred}.
@defun treesit-parent-while node pred @defun treesit-parent-while node pred
This function goes up the tree starting from @var{node}, and keeps This function goes up the tree starting from @var{node}, and keeps
doing so as long as the nodes satisfy @var{pred}, a function that doing so as long as the nodes satisfy @var{pred}, a function that
takes a node as the argument. That is, this function returns the takes a node as argument. That is, this function returns the highest
highest parent of @var{node} that still satisfies @var{pred}. Note parent of @var{node} that still satisfies @var{pred}. Note that if
that if @var{node} satisfies @var{pred} but its immediate parent @var{node} satisfies @var{pred} but its immediate parent doesn't,
doesn't, @var{node} itself is returned. @var{node} itself is returned.
@end defun @end defun
@defun treesit-node-top-level node &optional type @defun treesit-node-top-level node &optional type
@ -979,7 +978,7 @@ has an error.
@cindex tree-sitter, live parsing node @cindex tree-sitter, live parsing node
@cindex live node, tree-sitter @cindex live node, tree-sitter
A node is considered @dfn{live} if its parser is not deleted, and the A node is considered @dfn{live} if its parser is not deleted, and the
buffer to which it belongs to is a live buffer (@pxref{Killing Buffers}). buffer to which it belongs is a live buffer (@pxref{Killing Buffers}).
@defun treesit-node-check node property @defun treesit-node-check node property
This function returns non-@code{nil} if @var{node} has the specified This function returns non-@code{nil} if @var{node} has the specified
@ -1016,12 +1015,12 @@ This function returns the field name of the @var{n}'th child of
@var{node}. It returns @code{nil} if there is no @var{n}'th child, or @var{node}. It returns @code{nil} if there is no @var{n}'th child, or
the @var{n}'th child doesn't have a field name. the @var{n}'th child doesn't have a field name.
Note that @var{n} counts both named and anonymous child. And @var{n} Note that @var{n} counts both named and anonymous children, and
could be negative, e.g., @code{-1} represents the last child. @var{n} can be negative, e.g., @minus{}1 represents the last child.
@end defun @end defun
@defun treesit-node-child-count node &optional named @defun treesit-node-child-count node &optional named
This function finds the number of children of @var{node}. If This function returns the number of children of @var{node}. If
@var{named} is non-@code{nil}, it only counts named children @var{named} is non-@code{nil}, it only counts named children
(@pxref{tree-sitter named node, named node}). (@pxref{tree-sitter named node, named node}).
@end defun @end defun
@ -1048,7 +1047,7 @@ finally the more advanced pattern syntax.
@cindex query, tree-sitter @cindex query, tree-sitter
A @dfn{query} consists of multiple @dfn{patterns}. Each pattern is an A @dfn{query} consists of multiple @dfn{patterns}. Each pattern is an
s-expression that matches a certain node in the syntax node. A s-expression that matches a certain node in the syntax node. A
pattern has the form @w{@code{(@var{type} (@var{child}@dots{}))}} pattern has the form @w{@code{(@var{type} (@var{child}@dots{}))}}.
For example, a pattern that matches a @code{binary_expression} node that For example, a pattern that matches a @code{binary_expression} node that
contains @code{number_literal} child nodes would look like contains @code{number_literal} child nodes would look like
@ -1084,25 +1083,26 @@ example, the capture name @code{biexp}:
Now we can introduce the @dfn{query functions}. Now we can introduce the @dfn{query functions}.
@defun treesit-query-capture node query &optional beg end node-only @defun treesit-query-capture node query &optional beg end node-only
This function matches patterns in @var{query} within @var{node}. This function matches patterns in @var{query} within @var{node}. The
The argument @var{query} can be either a string, a s-expression, or a argument @var{query} can be either a string, an s-expression, or a
compiled query object. For now, we focus on the string syntax; compiled query object. For now, we focus on the string syntax;
s-expression syntax and compiled query are described at the end of the s-expression syntax and compiled queries are described at the end of
section. the section.
The argument @var{node} can also be a parser or a language symbol. A The argument @var{node} can also be a parser or a language symbol. A
parser means using its root node, a language symbol means find or parser means use its root node, a language symbol means find or create
create a parser for that language in the current buffer, and use the a parser for that language in the current buffer, and use the root
root node. node.
The function returns all the captured nodes in a list of the form The function returns all the captured nodes in an alist with elements
@w{@code{(@var{capture_name} . @var{node})}}. If @var{node-only} is of the form @w{@code{(@var{capture_name} . @var{node})}}. If
non-@code{nil}, it returns the list of nodes instead. By default the @var{node-only} is non-@code{nil}, it returns the list of @var{node}s
entire text of @var{node} is searched, but if @var{beg} and @var{end} instead. By default the entire text of @var{node} is searched, but if
are both non-@code{nil}, they specify the region of buffer text where @var{beg} and @var{end} are both non-@code{nil}, they specify the
this function should match nodes. Any matching node whose span region of buffer text where this function should match nodes. Any
overlaps with the region between @var{beg} and @var{end} are captured, matching node whose span overlaps with the region between @var{beg}
it doesn't have to be completely in the region. and @var{end} is captured; it doesn't have to be completely contained
in the region.
@vindex treesit-query-error @vindex treesit-query-error
@findex treesit-query-validate @findex treesit-query-validate
@ -1146,13 +1146,13 @@ For example, it could have two top-level patterns:
@end example @end example
@defun treesit-query-string string query language @defun treesit-query-string string query language
This function parses @var{string} with @var{language}, matches its This function parses @var{string} as @var{language}, matches its root
root node with @var{query}, and returns the result. node with @var{query}, and returns the result.
@end defun @end defun
@heading More query syntax @heading More query syntax
Besides node type and capture, tree-sitter's pattern syntax can Besides node type and capture name, tree-sitter's pattern syntax can
express anonymous node, field name, wildcard, quantification, express anonymous node, field name, wildcard, quantification,
grouping, alternation, anchor, and predicate. grouping, alternation, anchor, and predicate.
@ -1168,11 +1168,11 @@ pattern matching (and capturing) keyword @code{return} would be
@subheading Wild card @subheading Wild card
In a pattern, @samp{(_)} matches any named node, and @samp{_} matches In a pattern, @samp{(_)} matches any named node, and @samp{_} matches
any named and anonymous node. For example, to capture any named child any named or anonymous node. For example, to capture any named child
of a @code{binary_expression} node, the pattern would be of a @code{binary_expression} node, the pattern would be
@example @example
(binary_expression (_) @@in_biexp) (binary_expression (_) @@in-biexp)
@end example @end example
@subheading Field name @subheading Field name
@ -1190,7 +1190,7 @@ names, indicated by the colon following them.
@end example @end example
It is also possible to capture a node that doesn't have a certain It is also possible to capture a node that doesn't have a certain
field, say, a @code{function_definition} without a @code{body} field. field, say, a @code{function_definition} without a @code{body} field:
@example @example
(function_definition !body) @@func-no-body (function_definition !body) @@func-no-body
@ -1199,20 +1199,20 @@ field, say, a @code{function_definition} without a @code{body} field.
@subheading Quantify node @subheading Quantify node
@cindex quantify node, tree-sitter @cindex quantify node, tree-sitter
Tree-sitter recognizes quantification operators @samp{*}, @samp{+} and Tree-sitter recognizes quantification operators @samp{*}, @samp{+},
@samp{?}. Their meanings are the same as in regular expressions: and @samp{?}. Their meanings are the same as in regular expressions:
@samp{*} matches the preceding pattern zero or more times, @samp{+} @samp{*} matches the preceding pattern zero or more times, @samp{+}
matches one or more times, and @samp{?} matches zero or one time. matches one or more times, and @samp{?} matches zero or one times.
For example, the following pattern matches @code{type_declaration} For example, the following pattern matches @code{type_declaration}
nodes that has @emph{zero or more} @code{long} keyword. nodes that have @emph{zero or more} @code{long} keywords.
@example @example
(type_declaration "long"*) @@long-type (type_declaration "long"*) @@long-type
@end example @end example
The following pattern matches a type declaration that has zero or one The following pattern matches a type declaration that may or may not
@code{long} keyword: have a @code{long} keyword:
@example @example
(type_declaration "long"?) @@long-type (type_declaration "long"?) @@long-type
@ -1220,9 +1220,9 @@ The following pattern matches a type declaration that has zero or one
@subheading Grouping @subheading Grouping
Similar to groups in regular expression, we can bundle patterns into Similar to groups in regular expressions, we can bundle patterns into
groups and apply quantification operators to them. For example, to groups and apply quantification operators to them. For example, to
express a comma separated list of identifiers, one could write express a comma-separated list of identifiers, one could write
@example @example
(identifier) ("," (identifier))* (identifier) ("," (identifier))*
@ -1230,10 +1230,10 @@ express a comma separated list of identifiers, one could write
@subheading Alternation @subheading Alternation
Again, similar to regular expressions, we can express ``match anyone Again, similar to regular expressions, we can express ``match any one
from this group of patterns'' in a pattern. The syntax is a list of of these patterns'' in a pattern. The syntax is a list of patterns
patterns enclosed in square brackets. For example, to capture some enclosed in square brackets. For example, to capture some keywords in
keywords in C, the pattern would be C, the pattern would be
@example @example
@group @group
@ -1292,14 +1292,14 @@ example, with the following pattern:
@end example @end example
@noindent @noindent
tree-sitter only matches arrays where the first element equals to the tree-sitter only matches arrays where the first element is equal to
last element. To attach a predicate to a pattern, we need to group the last element. To attach a predicate to a pattern, we need to
them together. A predicate always starts with a @samp{#}. Currently group them together. A predicate always starts with a @samp{#}.
there are three predicates, @code{#equal}, @code{#match}, and Currently there are three predicates: @code{#equal}, @code{#match},
@code{#pred}. and @code{#pred}.
@deffn Predicate equal arg1 arg2 @deffn Predicate equal arg1 arg2
Matches if @var{arg1} equals to @var{arg2}. Arguments can be either Matches if @var{arg1} is equal to @var{arg2}. Arguments can be either
strings or capture names. Capture names represent the text that the strings or capture names. Capture names represent the text that the
captured node spans in the buffer. captured node spans in the buffer.
@end deffn @end deffn
@ -1322,7 +1322,7 @@ names in other patterns.
@cindex tree-sitter patterns as sexps @cindex tree-sitter patterns as sexps
@cindex patterns, tree-sitter, in sexp form @cindex patterns, tree-sitter, in sexp form
Besides strings, Emacs provides a s-expression based syntax for Besides strings, Emacs provides an s-expression based syntax for
tree-sitter patterns. It largely resembles the string-based syntax. tree-sitter patterns. It largely resembles the string-based syntax.
For example, the following query For example, the following query
@ -1354,7 +1354,7 @@ is equivalent to
@end example @end example
Most patterns can be written directly as strange but nevertheless Most patterns can be written directly as strange but nevertheless
valid s-expressions. Only a few of them needs modification: valid s-expressions. Only a few of them need modification:
@itemize @itemize
@item @item
@ -1382,7 +1382,7 @@ For example,
@end example @end example
@noindent @noindent
is written in s-expression as is written in s-expression syntax as
@example @example
@group @group
@ -1440,8 +1440,8 @@ example. In that case, text segments written in different languages
need to be assigned different parsers. Traditionally, this is need to be assigned different parsers. Traditionally, this is
achieved by using narrowing. While tree-sitter works with narrowing achieved by using narrowing. While tree-sitter works with narrowing
(@pxref{tree-sitter narrowing, narrowing}), the recommended way is (@pxref{tree-sitter narrowing, narrowing}), the recommended way is
instead to set regions of buffer text (i.e., ranges) in which a parser instead to specify regions of buffer text (i.e., ranges) in which a
will operate. This section describes functions for setting and parser will operate. This section describes functions for setting and
getting ranges for a parser. getting ranges for a parser.
Lisp programs should call @code{treesit-update-ranges} to make sure Lisp programs should call @code{treesit-update-ranges} to make sure
@ -1459,7 +1459,7 @@ end of the section.
@defun treesit-parser-set-included-ranges parser ranges @defun treesit-parser-set-included-ranges parser ranges
This function sets up @var{parser} to operate on @var{ranges}. The This function sets up @var{parser} to operate on @var{ranges}. The
@var{parser} will only read the text of the specified ranges. Each @var{parser} will only read the text of the specified ranges. Each
range in @var{ranges} is a list of the form @w{@code{(@var{beg} range in @var{ranges} is a pair of the form @w{@code{(@var{beg}
. @var{end})}}. . @var{end})}}.
The ranges in @var{ranges} must come in order and must not overlap. The ranges in @var{ranges} must come in order and must not overlap.
@ -1533,7 +1533,7 @@ Like other query functions, this function raises the
@heading Supporting multiple languages in Lisp programs @heading Supporting multiple languages in Lisp programs
It should suffice for general Lisp programs to call the following two It should suffice for general Lisp programs to call the following two
functions in order to support program sources that mixes multiple functions in order to support program sources that mix multiple
languages. languages.
@defun treesit-update-ranges &optional beg end @defun treesit-update-ranges &optional beg end
@ -1569,13 +1569,13 @@ language's parser, retrieves some information, sets ranges for the
embedded languages with that information, and then parses the embedded embedded languages with that information, and then parses the embedded
languages. languages.
Take a buffer containing @acronym{HTML}, @acronym{CSS} and JavaScript Take a buffer containing @acronym{HTML}, @acronym{CSS}, and JavaScript
as an example. A Lisp program will first parse the whole buffer with as an example. A Lisp program will first parse the whole buffer with
an @acronym{HTML} parser, then query the parser for an @acronym{HTML} parser, then query the parser for
@code{style_element} and @code{script_element} nodes, which @code{style_element} and @code{script_element} nodes, which correspond
correspond to @acronym{CSS} and JavaScript text, respectively. Then to @acronym{CSS} and JavaScript text, respectively. Then it sets the
it sets the range of the @acronym{CSS} and JavaScript parser to the range of the @acronym{CSS} and JavaScript parsers to the range which
ranges in which their corresponding nodes span. their corresponding nodes span.
Given a simple @acronym{HTML} document: Given a simple @acronym{HTML} document:
@ -1629,17 +1629,17 @@ directly translate into operations shown above.
@example @example
@group @group
(setq-local treesit-range-settings (setq treesit-range-settings
(treesit-range-rules (treesit-range-rules
:embed 'javascript :embed 'javascript
:host 'html :host 'html
'((script_element (raw_text) @@capture)) '((script_element (raw_text) @@capture))
@end group @end group
@group @group
:embed 'css :embed 'css
:host 'html :host 'html
'((style_element (raw_text) @@capture)))) '((style_element (raw_text) @@capture))))
@end group @end group
@end example @end example
@ -1650,21 +1650,21 @@ value that @code{treesit-range-settings} can have.
It takes a series of @var{query-spec}s, where each @var{query-spec} is It takes a series of @var{query-spec}s, where each @var{query-spec} is
a @var{query} preceded by zero or more @var{keyword}/@var{value} a @var{query} preceded by zero or more @var{keyword}/@var{value}
pairs. Each @var{query} is a tree-sitter query in either the pairs. Each @var{query} is a tree-sitter query in either the string,
string, s-expression or compiled form, or a function. s-expression, or compiled form, or a function.
If @var{query} is a tree-sitter query, it should be preceded by two If @var{query} is a tree-sitter query, it should be preceded by two
@var{:keyword}/@var{value} pairs, where the @code{:embed} keyword @var{keyword}/@var{value} pairs, where the @code{:embed} keyword
specifies the embedded language, and the @code{:host} keyword specifies the embedded language, and the @code{:host} keyword
specified the host language. specifies the host language.
@code{treesit-update-ranges} uses @var{query} to figure out how to set @code{treesit-update-ranges} uses @var{query} to figure out how to set
the ranges for parsers for the embedded language. It queries the ranges for parsers for the embedded language. It queries
@var{query} in a host language parser, computes the ranges in which @var{query} in a host language parser, computes the ranges which the
the captured nodes span, and applies these ranges to embedded captured nodes span, and applies these ranges to embedded language
language parsers. parsers.
If @var{query} is a function, it doesn't need any @var{:keyword} and If @var{query} is a function, it doesn't need any @var{keyword} and
@var{value} pair. It should be a function that takes 2 arguments, @var{value} pair. It should be a function that takes 2 arguments,
@var{start} and @var{end}, and sets the ranges for parsers in the @var{start} and @var{end}, and sets the ranges for parsers in the
current buffer in the region between @var{start} and @var{end}. It is current buffer in the region between @var{start} and @var{end}. It is
@ -1717,8 +1717,8 @@ this pattern:
@code{treesit-ready-p} automatically emits a warning if conditions for @code{treesit-ready-p} automatically emits a warning if conditions for
enabling tree-sitter aren't met. enabling tree-sitter aren't met.
If a tree-sitter major mode shares setup with their ``native'' If a tree-sitter major mode shares setup with its ``native''
counterpart, they can create a ``base mode'' that contains the common counterpart, one can create a ``base mode'' that contains the common
setup, like this: setup, like this:
@example @example
@ -1749,9 +1749,9 @@ setup, like this:
@defun treesit-ready-p language &optional quiet @defun treesit-ready-p language &optional quiet
This function checks for conditions for activating tree-sitter. It This function checks for conditions for activating tree-sitter. It
checks whether Emacs was built with tree-sitter, whether the buffer's checks whether Emacs was built with tree-sitter, whether the buffer's
size is not too large for tree-sitter to handle it, and whether the size is not too large for tree-sitter to handle, and whether the
language grammar for @var{language} is available on the system grammar for @var{language} is available on the system (@pxref{Language
(@pxref{Language Grammar}). Grammar}).
This function emits a warning if tree-sitter cannot be activated. If This function emits a warning if tree-sitter cannot be activated. If
@var{quiet} is @code{message}, the warning is turned into a message; @var{quiet} is @code{message}, the warning is turned into a message;
@ -1789,7 +1789,7 @@ non-@code{nil}, it sets up Imenu.
@end itemize @end itemize
@end defun @end defun
For more information of these built-in tree-sitter features, For more information on these built-in tree-sitter features,
@pxref{Parser-based Font Lock}, @pxref{Parser-based Indentation}, and @pxref{Parser-based Font Lock}, @pxref{Parser-based Indentation}, and
@pxref{List Motion}. @pxref{List Motion}.
@ -1828,28 +1828,17 @@ always returns @code{nil}.
@defvar treesit-defun-name-function @defvar treesit-defun-name-function
If non-@code{nil}, this variable's value should be a function that is If non-@code{nil}, this variable's value should be a function that is
called with a node as its argument, and returns the defun name of the called with a node as its argument, and returns the defun name of the
node. The function should have the same semantic as node. The function should have the same semantics as
@code{treesit-defun-name}: if the node is not a defun node, or the @code{treesit-defun-name}: if the node is not a defun node, or the
node is a defun node but doesn't have a name, or the node is node is a defun node but doesn't have a name, or the node is
@code{nil}, it should return @code{nil}. @code{nil}, it should return @code{nil}.
@end defvar @end defvar
@defvar treesit-defun-type-regexp
This variable determines which nodes are considered defuns by Emacs.
It can be a regexp that matches the type of defun nodes.
Sometimes not all nodes matched by the regexp are valid defuns.
Therefore, this variable can also be a cons cell of the form
@w{(@var{regexp} . @var{pred})}, where @var{pred} should be a function
that takes a node as its argument, and returns @code{t} if the node is
valid defun, or @code{nil} if it is not valid.
@end defvar
@node Tree-sitter C API @node Tree-sitter C API
@section Tree-sitter C API Correspondence @section Tree-sitter C API Correspondence
Emacs' tree-sitter integration doesn't expose every feature Emacs' tree-sitter integration doesn't expose every feature
provided by tree-sitter's C API. Missing features include: provided by tree-sitter's C API@. Missing features include:
@itemize @itemize
@item @item

@ -844,18 +844,25 @@ the mode can get navigation-by-defun functionality for free, by using
@code{treesit-beginning-of-defun} and @code{treesit-end-of-defun}. @code{treesit-beginning-of-defun} and @code{treesit-end-of-defun}.
@defvar treesit-defun-type-regexp @defvar treesit-defun-type-regexp
The value of this variable is a regexp matching the node type of defun This variable determines which nodes are considered defuns by Emacs.
nodes. (For ``node'' and ``node type'', @pxref{Parsing Program Source}.) It can be a regexp that matches the type of defun nodes. (For
``node'' and ``node type'', @pxref{Parsing Program Source}.)
For example, @code{python-mode} sets this variable to a regexp that For example, @code{python-mode} sets this variable to a regexp that
matches either @code{"function_definition"} or @code{"class_definition"}. matches either @samp{function_definition} or @samp{class_definition}.
Sometimes not all nodes matched by the regexp are valid defuns.
Therefore, this variable can also be a cons cell of the form
@w{(@var{regexp} . @var{pred})}, where @var{pred} should be a function
that takes a node as its argument, and returns non-@code{nil} if the
node is a valid defun, or @code{nil} if it is not valid.
@end defvar @end defvar
@defvar treesit-defun-tactic @defvar treesit-defun-tactic
This variable determines how Emacs treats nested defuns. If the This variable determines how Emacs treats nested defuns. If the value
value is @code{top-level}, navigation functions only move across is @code{top-level}, navigation functions only move across top-level
top-level defuns, if the value is @code{nested}, navigation functions defuns. If the value is @code{nested}, navigation functions recognize
recognize nested defuns. nested defuns.
@end defvar @end defvar
@node Skipping Characters @node Skipping Characters

@ -1168,7 +1168,6 @@ See `treesit-simple-indent-presets'.")
(save-excursion (save-excursion
(goto-char bol) (goto-char bol)
(looking-at-p comment-end-skip)))) (looking-at-p comment-end-skip))))
;; TODO: Document.
(cons 'catch-all (lambda (&rest _) t)) (cons 'catch-all (lambda (&rest _) t))
(cons 'query (lambda (pattern) (cons 'query (lambda (pattern)
@ -1182,7 +1181,6 @@ See `treesit-simple-indent-presets'.")
(cons 'first-sibling (lambda (_n parent &rest _) (cons 'first-sibling (lambda (_n parent &rest _)
(treesit-node-start (treesit-node-start
(treesit-node-child parent 0)))) (treesit-node-child parent 0))))
;; TODO: Document.
(cons 'nth-sibling (lambda (n &optional named) (cons 'nth-sibling (lambda (n &optional named)
(lambda (_n parent &rest _) (lambda (_n parent &rest _)
(treesit-node-start (treesit-node-start
@ -1224,7 +1222,6 @@ See `treesit-simple-indent-presets'.")
(or (and this-line-has-prefix (or (and this-line-has-prefix
(match-beginning 1)) (match-beginning 1))
(match-end 0))))))) (match-end 0)))))))
;; TODO: Document.
(cons 'grand-parent (cons 'grand-parent
(lambda (_n parent &rest _) (lambda (_n parent &rest _)
(treesit-node-start (treesit-node-parent parent)))) (treesit-node-start (treesit-node-parent parent))))
@ -1295,10 +1292,10 @@ See `treesit-simple-indent-presets'.")
(mapcar (lambda (fn) (mapcar (lambda (fn)
(funcall fn node parent bol)) (funcall fn node parent bol))
fns))))) fns)))))
"A list of presets. "A list of indent rule presets.
These presets that can be used as MATHER and ANCHOR in These presets can be used as MATCHER and ANCHOR values in
`treesit-simple-indent-rules'. MACHTERs and ANCHORs are `treesit-simple-indent-rules'. MATCHERs and ANCHORs are
functions that take 3 arguments: NODE, PARENT and BOL. functions that take 3 arguments: NODE, PARENT, and BOL.
MATCHER: MATCHER:
@ -1329,6 +1326,10 @@ no-node
Checks that NODE's type matches regexp TYPE. Checks that NODE's type matches regexp TYPE.
\(field-is NAME)
Checks that NODE's field name in PARENT matches regexp NAME.
\(n-p-gp NODE-TYPE PARENT-TYPE GRANDPARENT-TYPE) \(n-p-gp NODE-TYPE PARENT-TYPE GRANDPARENT-TYPE)
Checks for NODE's, its parent's, and its grandparent's type. Checks for NODE's, its parent's, and its grandparent's type.
@ -1342,16 +1343,33 @@ comment-end
Matches if text after point matches `treesit-comment-end'. Matches if text after point matches `treesit-comment-end'.
catch-all
Always matches.
ANCHOR: ANCHOR:
first-sibling first-sibling
Returns the start of the first child of PARENT. Returns the start of the first child of PARENT.
\(nth-sibling N &optional NAMED)
Returns the start of the Nth child of PARENT.
NAMED non-nil means count only named nodes.
parent parent
Returns the start of PARENT. Returns the start of PARENT.
grand-parent
Returns the start of PARENT's parent.
great-grand-parent
Returns the start of PARENT's parent's parent.
parent-bol parent-bol
Returns the beginning of non-space characters on the line where Returns the beginning of non-space characters on the line where
@ -1359,8 +1377,8 @@ parent-bol
standalone-parent standalone-parent
Finds the first ancestor node (parent, grandparent, etc) that Finds the first ancestor node (parent, grandparent, etc.) that
starts on its own line, and return the start of that node. starts on its own line, and returns the start of that node.
prev-sibling prev-sibling
@ -1391,7 +1409,7 @@ prev-adaptive-prefix
end of the match, otherwise return nil. However, if the end of the match, otherwise return nil. However, if the
current line begins with a prefix, return the beginning of current line begins with a prefix, return the beginning of
the prefix of the previous line instead, so that the two the prefix of the previous line instead, so that the two
prefixes align. This is useful for a `indent-relative'-like prefixes align. This is useful for an `indent-relative'-like
indent behavior for block comments.") indent behavior for block comments.")
(defun treesit--simple-indent-eval (exp) (defun treesit--simple-indent-eval (exp)
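
The newly documented presets could be combined in `treesit-simple-indent-rules` roughly as shown here. This is a sketch under stated assumptions: the language symbol `mylang`, the node type "argument", the field name "body", and the function name `my-setup-indent` are hypothetical, and the offsets are arbitrary.

```elisp
(require 'treesit)

(defun my-setup-indent ()
  "Install example indent rules for the current buffer (sketch)."
  (setq-local
   treesit-simple-indent-rules
   ;; Each entry is (LANGUAGE RULE...), and each RULE is
   ;; (MATCHER ANCHOR OFFSET).  Rules are tried in order.
   '((mylang
      ;; NODE sits in its parent's "body" field: indent 4 columns
      ;; from the start of PARENT's parent.
      ((field-is "body") grand-parent 4)
      ;; NODE is an "argument": align with the named child at index 1
      ;; of PARENT (`first-sibling' corresponds to child 0).
      ((node-is "argument") (nth-sibling 1 t) 0)
      ;; Fallback that always matches, so it must come last.
      (catch-all parent-bol 0)))))
```

Putting `catch-all` last works as a default because the first matching rule wins; any rule placed after it would never be reached.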
@ -2332,24 +2350,24 @@ instead of emitting a warning."
(defun treesit-major-mode-setup () (defun treesit-major-mode-setup ()
"Activate tree-sitter to power major-mode features. "Activate tree-sitter to power major-mode features.
If `treesit-font-lock-settings' is non-nil, setup fontification and If `treesit-font-lock-settings' is non-nil, set up fontification
enable `font-lock-mode'. and enable `font-lock-mode'.
If `treesit-simple-indent-rules' is non-nil, setup indentation. If `treesit-simple-indent-rules' is non-nil, set up indentation.
If `treesit-defun-type-regexp' is non-nil, setup If `treesit-defun-type-regexp' is non-nil, set up
`beginning/end-of-defun' functions. `beginning-of-defun-function' and `end-of-defun-function'.
If `treesit-defun-name-function' is non-nil, setup If `treesit-defun-name-function' is non-nil, set up
`add-log-current-defun'. `add-log-current-defun'.
If `treesit-simple-imenu-settings' is non-nil, setup Imenu. If `treesit-simple-imenu-settings' is non-nil, set up Imenu.
Make sure necessary parsers are created for the current buffer Make sure necessary parsers are created for the current buffer
before calling this function." before calling this function."
;; Font-lock. ;; Font-lock.
(when treesit-font-lock-settings (when treesit-font-lock-settings
;; `font-lock-mode' wouldn't setup properly if ;; `font-lock-mode' wouldn't set up properly if
;; `font-lock-defaults' is nil, see `font-lock-specified-p'. ;; `font-lock-defaults' is nil, see `font-lock-specified-p'.
(setq-local font-lock-defaults (setq-local font-lock-defaults
'( nil nil nil nil '( nil nil nil nil
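
The order of operations described in the docstring above (create the parser, set the relevant variables, then call `treesit-major-mode-setup`) might look like this in a major mode. This is a minimal sketch: the mode name `my-ts-mode`, the language symbol `mylang`, and the node type are hypothetical.

```elisp
(require 'treesit)

(define-derived-mode my-ts-mode prog-mode "My"
  "Hypothetical tree-sitter major mode (sketch)."
  (when (treesit-ready-p 'mylang)
    ;; The parser must exist before `treesit-major-mode-setup' runs.
    (treesit-parser-create 'mylang)
    ;; Indentation is set up because this variable is non-nil.
    (setq-local treesit-simple-indent-rules
                '((mylang (catch-all parent-bol 0))))
    ;; Defun navigation is set up because this variable is non-nil.
    (setq-local treesit-defun-type-regexp "function_definition")
    ;; `treesit-font-lock-settings' is left nil here, so this call
    ;; does not set up fontification.
    (treesit-major-mode-setup)))
```

Each feature is opt-in through its own variable, so a mode can adopt tree-sitter indentation or navigation without committing to parser-based font lock.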
@ -2803,7 +2821,7 @@ window."
(display-buffer treesit--explorer-buffer (display-buffer treesit--explorer-buffer
(cons nil '((inhibit-same-window . t)))) (cons nil '((inhibit-same-window . t))))
(treesit--explorer-refresh) (treesit--explorer-refresh)
;; Setup variables and hooks. ;; Set up variables and hooks.
(add-hook 'post-command-hook (add-hook 'post-command-hook
#'treesit--explorer-post-command 0 t) #'treesit--explorer-post-command 0 t)
(add-hook 'kill-buffer-hook (add-hook 'kill-buffer-hook