2022-11-21 15:39:43 +01:00
|
|
|
|
STARTER GUIDE ON WRITING MAJOR MODE WITH TREE-SITTER -*- org -*-
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
|
|
|
|
This document guides you on adding tree-sitter support to a major
|
|
|
|
|
mode.
|
|
|
|
|
|
|
|
|
|
TOC:
|
|
|
|
|
|
|
|
|
|
- Building Emacs with tree-sitter
|
|
|
|
|
- Install language definitions
|
|
|
|
|
- Setup
|
2022-10-07 18:06:04 -07:00
|
|
|
|
- Naming convention
|
2022-10-05 14:11:33 -07:00
|
|
|
|
- Font-lock
|
|
|
|
|
- Indent
|
|
|
|
|
- Imenu
|
|
|
|
|
- Navigation
|
|
|
|
|
- Which-func
|
|
|
|
|
- More features?
|
|
|
|
|
- Common tasks (code snippets)
|
|
|
|
|
- Manual
|
2023-03-18 14:13:31 -07:00
|
|
|
|
- Appendix 1
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
|
|
|
|
* Building Emacs with tree-sitter
|
|
|
|
|
|
|
|
|
|
You can either install tree-sitter by your package manager, or from
|
|
|
|
|
source:
|
|
|
|
|
|
|
|
|
|
git clone https://github.com/tree-sitter/tree-sitter.git
|
|
|
|
|
cd tree-sitter
|
|
|
|
|
make
|
|
|
|
|
make install
|
|
|
|
|
|
|
|
|
|
Then pull the tree-sitter branch (or the master branch, if it has
|
|
|
|
|
merged) and rebuild Emacs.
|
|
|
|
|
|
|
|
|
|
* Install language definitions
|
|
|
|
|
|
|
|
|
|
Tree-sitter by itself doesn’t know how to parse any particular
|
|
|
|
|
language. We need to install language definitions (or “grammars”) for
|
|
|
|
|
a language to be able to parse it. There are a couple of ways to get
|
|
|
|
|
them.
|
|
|
|
|
|
|
|
|
|
You can use this script that I put together here:
|
|
|
|
|
|
|
|
|
|
https://github.com/casouri/tree-sitter-module
|
|
|
|
|
|
|
|
|
|
This script automatically pulls and builds language definitions for C,
|
2023-02-10 15:24:45 +00:00
|
|
|
|
C++, Rust, JSON, Go, HTML, JavaScript, CSS, Python, Typescript,
|
2023-03-18 14:13:31 -07:00
|
|
|
|
C#, etc. Better yet, I pre-built these language definitions for
|
2022-10-05 14:11:33 -07:00
|
|
|
|
GNU/Linux and macOS, they can be downloaded here:
|
|
|
|
|
|
|
|
|
|
https://github.com/casouri/tree-sitter-module/releases/tag/v2.1
|
|
|
|
|
|
|
|
|
|
To build them yourself, run
|
|
|
|
|
|
|
|
|
|
git clone git@github.com:casouri/tree-sitter-module.git
|
|
|
|
|
cd tree-sitter-module
|
|
|
|
|
./batch.sh
|
|
|
|
|
|
|
|
|
|
and language definitions will be in the /dist directory. You can
|
|
|
|
|
either copy them to standard dynamic library locations of your system,
|
|
|
|
|
eg, /usr/local/lib, or leave them in /dist and later tell Emacs where
|
|
|
|
|
to find language definitions by setting ‘treesit-extra-load-path’.
|
|
|
|
|
|
|
|
|
|
Language definition sources can be found on GitHub under
|
|
|
|
|
tree-sitter/xxx, like tree-sitter/tree-sitter-python. The tree-sitter
|
|
|
|
|
organization has all the "official" language definitions:
|
|
|
|
|
|
|
|
|
|
https://github.com/tree-sitter
|
|
|
|
|
|
2023-03-18 14:13:31 -07:00
|
|
|
|
Alternatively, you can use treesit-install-language-grammar command
|
|
|
|
|
and follow its instructions. If everything goes right, it should
|
|
|
|
|
automatically download and compile the language grammar for you.
|
|
|
|
|
|
2022-10-05 14:11:33 -07:00
|
|
|
|
* Setting up for adding major mode features
|
|
|
|
|
|
2022-11-21 13:33:03 -08:00
|
|
|
|
Start Emacs and load tree-sitter with
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
|
|
|
|
(require 'treesit)
|
|
|
|
|
|
|
|
|
|
Now check if Emacs is built with tree-sitter library
|
|
|
|
|
|
|
|
|
|
(treesit-available-p)
|
|
|
|
|
|
2023-03-18 14:13:31 -07:00
|
|
|
|
Make sure Emacs can find the language grammar you want to use
|
|
|
|
|
|
|
|
|
|
(treesit-language-available-p 'lang)
|
|
|
|
|
|
2022-11-21 13:33:03 -08:00
|
|
|
|
* Tree-sitter major modes
|
|
|
|
|
|
|
|
|
|
Tree-sitter modes should be separate major modes, so other modes
|
|
|
|
|
inheriting from the original mode don't break if tree-sitter is
|
|
|
|
|
enabled. For example js2-mode inherits js-mode, we can't enable
|
|
|
|
|
tree-sitter in js-mode, lest js-mode would not setup things that
|
|
|
|
|
js2-mode expects to inherit from. So it's best to use separate major
|
|
|
|
|
modes.
|
|
|
|
|
|
|
|
|
|
If the tree-sitter variant and the "native" variant could share some
|
|
|
|
|
setup, you can create a "base mode", which only contains the common
|
2023-03-18 14:13:31 -07:00
|
|
|
|
setup. For example, python.el defines python-base-mode (shared),
|
|
|
|
|
python-mode (native), and python-ts-mode (tree-sitter).
|
2022-11-21 13:33:03 -08:00
|
|
|
|
|
|
|
|
|
In the tree-sitter mode, check if we can use tree-sitter with
|
|
|
|
|
treesit-ready-p, it will error out if tree-sitter is not ready.
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
2023-03-18 14:13:31 -07:00
|
|
|
|
In Emacs 30 we'll introduce some mechanism to more gracefully inherit
|
|
|
|
|
modes and fallback to other modes.
|
|
|
|
|
|
2022-10-07 18:06:04 -07:00
|
|
|
|
* Naming convention
|
|
|
|
|
|
2022-11-03 11:41:42 -07:00
|
|
|
|
Use tree-sitter for text (documentation, comment), use treesit for
|
|
|
|
|
symbol (variable, function).
|
2022-10-07 18:06:04 -07:00
|
|
|
|
|
2022-10-05 14:11:33 -07:00
|
|
|
|
* Font-lock
|
|
|
|
|
|
|
|
|
|
Tree-sitter works like this: You provide a query made of patterns and
|
|
|
|
|
capture names, tree-sitter finds the nodes that match these patterns,
|
|
|
|
|
tag the corresponding capture names onto the nodes and return them to
|
|
|
|
|
you. The query function returns a list of (capture-name . node). For
|
|
|
|
|
font-lock, we use face names as capture names. And the captured node
|
2022-11-03 11:41:42 -07:00
|
|
|
|
will be fontified in their capture name.
|
|
|
|
|
|
|
|
|
|
The capture name could also be a function, in which case (NODE
|
|
|
|
|
OVERRIDE START END) is passed to the function for fontification. START
|
2022-11-22 04:40:49 +01:00
|
|
|
|
and END are the start and end of the region to be fontified. The
|
2022-11-03 11:41:42 -07:00
|
|
|
|
function should only fontify within that region. The function should
|
|
|
|
|
also allow more optional arguments with (&rest _), for future
|
|
|
|
|
extensibility. For OVERRIDE check out the docstring of
|
|
|
|
|
treesit-font-lock-rules.
|
|
|
|
|
|
2022-10-05 14:11:33 -07:00
|
|
|
|
** Query syntax
|
|
|
|
|
|
|
|
|
|
There are two types of nodes, named, like (identifier),
|
|
|
|
|
(function_definition), and anonymous, like "return", "def", "(",
|
|
|
|
|
"}". Parent-child relationship is expressed as
|
|
|
|
|
|
|
|
|
|
(parent (child) (child) (child (grand_child)))
|
|
|
|
|
|
|
|
|
|
Eg, an argument list (1, "3", 1) could be:
|
|
|
|
|
|
|
|
|
|
(argument_list "(" (number) (string) (number) ")")
|
|
|
|
|
|
|
|
|
|
Children could have field names in its parent:
|
|
|
|
|
|
|
|
|
|
(function_definition name: (identifier) type: (identifier))
|
|
|
|
|
|
|
|
|
|
Match any of the list:
|
|
|
|
|
|
|
|
|
|
["true" "false" "none"]
|
|
|
|
|
|
|
|
|
|
Capture names can come after any node in the pattern:
|
|
|
|
|
|
|
|
|
|
(parent (child) @child) @parent
|
|
|
|
|
|
|
|
|
|
The query above captures both parent and child.
|
|
|
|
|
|
|
|
|
|
["return" "continue" "break"] @keyword
|
|
|
|
|
|
|
|
|
|
The query above captures all the keywords with capture name
|
|
|
|
|
"keyword".
|
|
|
|
|
|
|
|
|
|
These are the common syntax, see all of them in the manual
|
|
|
|
|
("Parsing Program Source" section).
|
|
|
|
|
|
|
|
|
|
** Query references
|
|
|
|
|
|
2022-11-21 13:33:03 -08:00
|
|
|
|
But how do one come up with the queries? Take python for an example,
|
|
|
|
|
open any python source file, type M-x treesit-explore-mode RET. Now
|
|
|
|
|
you should see the parse-tree in a separate window, automatically
|
|
|
|
|
updated as you select text or edit the buffer. Besides this, you can
|
|
|
|
|
consult the grammar of the language definition. For example, Python’s
|
|
|
|
|
grammar file is at
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
|
|
|
|
https://github.com/tree-sitter/tree-sitter-python/blob/master/grammar.js
|
|
|
|
|
|
|
|
|
|
Neovim also has a bunch of queries to reference:
|
|
|
|
|
|
|
|
|
|
https://github.com/nvim-treesitter/nvim-treesitter/tree/master/queries
|
|
|
|
|
|
|
|
|
|
The manual explains how to read grammar files in the bottom of section
|
|
|
|
|
"Tree-sitter Language Definitions".
|
|
|
|
|
|
2022-11-22 04:40:49 +01:00
|
|
|
|
** Debugging queries
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
2022-11-03 11:41:42 -07:00
|
|
|
|
If your query has problems, use ‘treesit-query-validate’ to debug the
|
|
|
|
|
query. It will pop a buffer containing the query (in text format) and
|
|
|
|
|
mark the offending part in red.
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
|
|
|
|
** Code
|
|
|
|
|
|
2022-11-03 11:41:42 -07:00
|
|
|
|
To enable tree-sitter font-lock, set ‘treesit-font-lock-settings’ and
|
|
|
|
|
‘treesit-font-lock-feature-list’ buffer-locally and call
|
|
|
|
|
‘treesit-major-mode-setup’. For example, see
|
2023-03-18 14:13:31 -07:00
|
|
|
|
‘python--treesit-settings’ in python.el. Below is a snippet of it.
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
2023-03-18 14:13:31 -07:00
|
|
|
|
Just like the current font-lock, if the to-be-fontified region already
|
|
|
|
|
has a face (ie, an earlier match fontified part/all of the region),
|
|
|
|
|
the new face is discarded rather than applied. If you want later
|
|
|
|
|
matches always override earlier matches, use the :override keyword.
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
2022-11-03 11:41:42 -07:00
|
|
|
|
Each rule should have a :feature, like function-name,
|
|
|
|
|
string-interpolation, builtin, etc. Users can then enable/disable each
|
2023-03-18 14:13:31 -07:00
|
|
|
|
feature individually. See Appendix 1 at the bottom for a set of common
|
|
|
|
|
features names.
|
2022-11-03 11:41:42 -07:00
|
|
|
|
|
2022-10-05 14:11:33 -07:00
|
|
|
|
#+begin_src elisp
|
|
|
|
|
(defvar python--treesit-settings
|
|
|
|
|
(treesit-font-lock-rules
|
2022-11-03 11:41:42 -07:00
|
|
|
|
:feature 'comment
|
|
|
|
|
:language 'python
|
|
|
|
|
'((comment) @font-lock-comment-face)
|
|
|
|
|
|
|
|
|
|
:feature 'string
|
2022-10-05 14:11:33 -07:00
|
|
|
|
:language 'python
|
2022-11-03 11:41:42 -07:00
|
|
|
|
'((string) @font-lock-string-face
|
|
|
|
|
(string) @contextual) ; Contextual special treatment.
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
2022-11-03 11:41:42 -07:00
|
|
|
|
:feature 'function-name
|
|
|
|
|
:language 'python
|
|
|
|
|
'((function_definition
|
|
|
|
|
name: (identifier) @font-lock-function-name-face))
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
2022-11-03 11:41:42 -07:00
|
|
|
|
:feature 'class-name
|
|
|
|
|
:language 'python
|
|
|
|
|
'((class_definition
|
|
|
|
|
name: (identifier) @font-lock-type-face))
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
2022-11-03 11:41:42 -07:00
|
|
|
|
...))
|
2022-10-05 14:11:33 -07:00
|
|
|
|
#+end_src
|
|
|
|
|
|
|
|
|
|
Then in ‘python-mode’, enable tree-sitter font-lock:
|
|
|
|
|
|
|
|
|
|
#+begin_src elisp
|
|
|
|
|
(treesit-parser-create 'python)
|
2022-11-03 11:41:42 -07:00
|
|
|
|
(setq-local treesit-font-lock-settings python--treesit-settings)
|
|
|
|
|
(setq-local treesit-font-lock-feature-list
|
|
|
|
|
'((comment string function-name)
|
|
|
|
|
(class-name keyword builtin)
|
|
|
|
|
(string-interpolation decorator)))
|
|
|
|
|
...
|
|
|
|
|
(treesit-major-mode-setup)
|
2022-10-05 14:11:33 -07:00
|
|
|
|
#+end_src
|
|
|
|
|
|
|
|
|
|
Concretely, something like this:
|
|
|
|
|
|
|
|
|
|
#+begin_src elisp
|
|
|
|
|
(define-derived-mode python-mode prog-mode "Python"
|
|
|
|
|
...
|
2022-11-03 11:41:42 -07:00
|
|
|
|
(cond
|
|
|
|
|
;; Tree-sitter.
|
2023-01-04 09:57:06 +02:00
|
|
|
|
((treesit-ready-p 'python)
|
2022-11-03 11:41:42 -07:00
|
|
|
|
(treesit-parser-create 'python)
|
|
|
|
|
(setq-local treesit-font-lock-settings python--treesit-settings)
|
|
|
|
|
(setq-local treesit-font-lock-feature-list
|
|
|
|
|
'((comment string function-name)
|
|
|
|
|
(class-name keyword builtin)
|
|
|
|
|
(string-interpolation decorator)))
|
|
|
|
|
(treesit-major-mode-setup))
|
|
|
|
|
(t
|
2023-03-18 14:13:31 -07:00
|
|
|
|
;; No tree-sitter, do nothing or fallback to another mode.
|
2022-11-03 11:41:42 -07:00
|
|
|
|
...)))
|
2022-10-05 14:11:33 -07:00
|
|
|
|
#+end_src
|
|
|
|
|
|
|
|
|
|
* Indent
|
|
|
|
|
|
2022-10-07 01:21:09 -07:00
|
|
|
|
Indent works like this: We have a bunch of rules that look like
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
|
|
|
|
(MATCHER ANCHOR OFFSET)
|
|
|
|
|
|
2022-10-07 01:21:09 -07:00
|
|
|
|
When the indentation process starts, point is at the BOL of a line, we
|
|
|
|
|
want to know which column to indent this line to. Let NODE be the node
|
|
|
|
|
at point, we pass this node to the MATCHER of each rule, one of them
|
2022-11-22 04:40:49 +01:00
|
|
|
|
will match the node (eg, "this node is a closing bracket!"). Then we
|
|
|
|
|
pass the node to the ANCHOR, which returns a point, eg, the BOL of the
|
2022-10-07 01:21:09 -07:00
|
|
|
|
previous line. We find the column number of that point (eg, 4), add
|
|
|
|
|
OFFSET to it (eg, 0), and that is the column we want to indent the
|
|
|
|
|
current line to (4 + 0 = 4).
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
2022-11-03 11:41:42 -07:00
|
|
|
|
Matchers and anchors are functions that takes (NODE PARENT BOL &rest
|
|
|
|
|
_). Matches return nil/non-nil for no match/match, and anchors return
|
|
|
|
|
the anchor point. Below are some convenient builtin matchers and anchors.
|
|
|
|
|
|
2023-02-10 15:24:45 +00:00
|
|
|
|
For MATCHER we have
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
2022-11-03 11:41:42 -07:00
|
|
|
|
(parent-is TYPE) => matches if PARENT’s type matches TYPE as regexp
|
2022-11-21 15:39:43 +01:00
|
|
|
|
(node-is TYPE) => matches NODE’s type
|
2022-10-05 14:11:33 -07:00
|
|
|
|
(query QUERY) => matches if querying PARENT with QUERY
|
|
|
|
|
captures NODE.
|
|
|
|
|
|
|
|
|
|
(match NODE-TYPE PARENT-TYPE NODE-FIELD
|
|
|
|
|
NODE-INDEX-MIN NODE-INDEX-MAX)
|
|
|
|
|
|
|
|
|
|
=> checks everything. If an argument is nil, don’t match that. Eg,
|
2022-12-09 20:55:43 -08:00
|
|
|
|
(match nil TYPE) is the same as (parent-is TYPE)
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
|
|
|
|
For ANCHOR we have
|
|
|
|
|
|
|
|
|
|
first-sibling => start of the first sibling
|
|
|
|
|
parent => start of parent
|
|
|
|
|
parent-bol => BOL of the line parent is on.
|
2023-03-18 14:13:31 -07:00
|
|
|
|
standalone-parent => Like parent-bol but handles more edge cases
|
2022-11-03 11:41:42 -07:00
|
|
|
|
prev-sibling => start of previous sibling
|
|
|
|
|
no-indent => current position (don’t indent)
|
|
|
|
|
prev-line => start of previous line
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
|
|
|
|
There is also a manual section for indent: "Parser-based Indentation".
|
|
|
|
|
|
|
|
|
|
When writing indent rules, you can use ‘treesit-check-indent’ to
|
|
|
|
|
check if your indentation is correct. To debug what went wrong, set
|
2022-11-22 04:40:49 +01:00
|
|
|
|
‘treesit--indent-verbose’ to non-nil. Then when you indent, Emacs
|
2022-10-05 14:11:33 -07:00
|
|
|
|
tells you which rule is applied in the echo area.
|
|
|
|
|
|
|
|
|
|
#+begin_src elisp
|
|
|
|
|
(defvar typescript-mode-indent-rules
|
|
|
|
|
(let ((offset typescript-indent-offset))
|
|
|
|
|
`((typescript
|
|
|
|
|
;; This rule matches if node at point is "}", ANCHOR is the
|
|
|
|
|
;; parent node’s BOL, and offset is 0.
|
|
|
|
|
((node-is "}") parent-bol 0)
|
|
|
|
|
((node-is ")") parent-bol 0)
|
|
|
|
|
((node-is "]") parent-bol 0)
|
|
|
|
|
((node-is ">") parent-bol 0)
|
2022-11-03 11:41:42 -07:00
|
|
|
|
((node-is "\\.") parent-bol ,offset)
|
2022-10-05 14:11:33 -07:00
|
|
|
|
((parent-is "ternary_expression") parent-bol ,offset)
|
|
|
|
|
((parent-is "named_imports") parent-bol ,offset)
|
|
|
|
|
((parent-is "statement_block") parent-bol ,offset)
|
|
|
|
|
((parent-is "type_arguments") parent-bol ,offset)
|
|
|
|
|
((parent-is "variable_declarator") parent-bol ,offset)
|
|
|
|
|
((parent-is "arguments") parent-bol ,offset)
|
|
|
|
|
((parent-is "array") parent-bol ,offset)
|
|
|
|
|
((parent-is "formal_parameters") parent-bol ,offset)
|
|
|
|
|
((parent-is "template_substitution") parent-bol ,offset)
|
|
|
|
|
((parent-is "object_pattern") parent-bol ,offset)
|
|
|
|
|
((parent-is "object") parent-bol ,offset)
|
|
|
|
|
((parent-is "object_type") parent-bol ,offset)
|
|
|
|
|
((parent-is "enum_body") parent-bol ,offset)
|
|
|
|
|
((parent-is "arrow_function") parent-bol ,offset)
|
|
|
|
|
((parent-is "parenthesized_expression") parent-bol ,offset)
|
|
|
|
|
...))))
|
|
|
|
|
#+end_src
|
|
|
|
|
|
2023-03-18 14:13:31 -07:00
|
|
|
|
To setup indentation for your major mode, set
|
|
|
|
|
‘treesit-simple-indent-rules’ to your rules, and call
|
2022-11-03 11:41:42 -07:00
|
|
|
|
‘treesit-major-mode-setup’:
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
|
|
|
|
#+begin_src elisp
|
|
|
|
|
(setq-local treesit-simple-indent-rules typescript-mode-indent-rules)
|
2022-11-03 11:41:42 -07:00
|
|
|
|
(treesit-major-mode-setup)
|
2022-10-05 14:11:33 -07:00
|
|
|
|
#+end_src
|
|
|
|
|
|
|
|
|
|
* Imenu
|
|
|
|
|
|
2023-03-18 14:13:31 -07:00
|
|
|
|
Set ‘treesit-simple-imenu-settings’ and call
|
|
|
|
|
‘treesit-major-mode-setup’.
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
|
|
|
|
* Navigation
|
|
|
|
|
|
2023-03-18 14:13:31 -07:00
|
|
|
|
Set ‘treesit-defun-type-regexp’ and call
|
|
|
|
|
‘treesit-major-mode-setup’. You can additionally set
|
|
|
|
|
‘treesit-defun-name-function’.
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
|
|
|
|
* Which-func
|
|
|
|
|
|
2022-11-03 11:41:42 -07:00
|
|
|
|
If you have an imenu implementation, set ‘which-func-functions’ to
|
|
|
|
|
nil, and which-func will automatically use imenu’s data.
|
|
|
|
|
|
2022-11-22 04:40:49 +01:00
|
|
|
|
If you want an independent implementation for which-func, you can
|
2023-03-18 14:13:31 -07:00
|
|
|
|
find the current function by ‘treesit-defun-at-point’.
|
2022-10-05 14:11:33 -07:00
|
|
|
|
|
|
|
|
|
* More features?
|
|
|
|
|
|
|
|
|
|
Obviously this list is just a starting point, if there are features in
|
2022-11-22 04:40:49 +01:00
|
|
|
|
the major mode that would benefit from a parse tree, adding tree-sitter
|
2022-10-05 14:11:33 -07:00
|
|
|
|
support for that would be great. But in the minimal case, just adding
|
|
|
|
|
font-lock is awesome.
|
|
|
|
|
|
|
|
|
|
* Common tasks
|
|
|
|
|
|
|
|
|
|
How to...
|
|
|
|
|
|
|
|
|
|
** Get the buffer text corresponding to a node?
|
|
|
|
|
|
|
|
|
|
(treesit-node-text node)
|
|
|
|
|
|
|
|
|
|
BTW ‘treesit-node-string’ does different things.
|
|
|
|
|
|
|
|
|
|
** Scan the whole tree for stuff?
|
|
|
|
|
|
|
|
|
|
(treesit-search-subtree)
|
|
|
|
|
(treesit-search-forward)
|
|
|
|
|
(treesit-induce-sparse-tree)
|
|
|
|
|
|
|
|
|
|
** Move to next node that...?
|
|
|
|
|
|
|
|
|
|
(treesit-search-forward-goto)
|
|
|
|
|
|
|
|
|
|
** Get the root node?
|
|
|
|
|
|
|
|
|
|
(treesit-buffer-root-node)
|
|
|
|
|
|
|
|
|
|
** Get the node at point?
|
|
|
|
|
|
|
|
|
|
(treesit-node-at (point))
|
|
|
|
|
|
|
|
|
|
* Manual
|
|
|
|
|
|
|
|
|
|
I suggest you read the manual section for tree-sitter in Info. The
|
|
|
|
|
section is Parsing Program Source. Typing
|
|
|
|
|
|
|
|
|
|
C-h i d m elisp RET g Parsing Program Source RET
|
|
|
|
|
|
2023-03-18 14:13:31 -07:00
|
|
|
|
will bring you to that section. You don’t need to read through every
|
|
|
|
|
sentence, just read the text paragraphs and glance over function
|
|
|
|
|
names.
|
|
|
|
|
|
|
|
|
|
* Appendix 1
|
|
|
|
|
|
|
|
|
|
Below is a set of common features used by built-in major mode.
|
|
|
|
|
|
|
|
|
|
Basic tokens:
|
|
|
|
|
|
|
|
|
|
delimiter ,.; (delimit things)
|
|
|
|
|
operator == != || (produces a value)
|
|
|
|
|
bracket []{}()
|
|
|
|
|
misc-punctuation (other punctuation that you want to highlight)
|
|
|
|
|
|
|
|
|
|
constant true, false, null
|
|
|
|
|
number
|
|
|
|
|
keyword
|
|
|
|
|
comment (includes doc-comments)
|
|
|
|
|
string (includes chars and docstrings)
|
|
|
|
|
string-interpolation f"text {variable}"
|
|
|
|
|
escape-sequence "\n\t\\"
|
|
|
|
|
function every function identifier
|
|
|
|
|
variable every variable identifier
|
|
|
|
|
type every type identifier
|
|
|
|
|
property a.b <--- highlight b
|
|
|
|
|
key { a: b, c: d } <--- highlight a, c
|
|
|
|
|
error highlight parse error
|
|
|
|
|
|
|
|
|
|
Abstract features:
|
|
|
|
|
|
|
|
|
|
assignment: the LHS of an assignment (thing being assigned to), eg:
|
|
|
|
|
|
|
|
|
|
a = b <--- highlight a
|
|
|
|
|
a.b = c <--- highlight b
|
|
|
|
|
a[1] = d <--- highlight a
|
|
|
|
|
|
|
|
|
|
definition: the thing being defined, eg:
|
|
|
|
|
|
|
|
|
|
int a(int b) { <--- highlight a
|
|
|
|
|
return 0
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
int a; <-- highlight a
|
|
|
|
|
|
|
|
|
|
struct a { <--- highlight a
|
|
|
|
|
int b; <--- highlight b
|
|
|
|
|
}
|