Diffstat (limited to 'admin')
8 files changed, 149 insertions, 82 deletions
diff --git a/admin/notes/tree-sitter/html-manual/Language-Definitions.html b/admin/notes/tree-sitter/html-manual/Language-Definitions.html index 4fd7eb5687f..6dd589f8259 100644 --- a/admin/notes/tree-sitter/html-manual/Language-Definitions.html +++ b/admin/notes/tree-sitter/html-manual/Language-Definitions.html @@ -230,19 +230,38 @@ assign <em>field names</em> to child nodes. For example, a body: (compound_statement)) </pre></div> +<span id="Exploring-the-syntax-tree"></span><h3 class="heading">Exploring the syntax tree</h3> +<span id="index-explore-tree_002dsitter-syntax-tree"></span> +<span id="index-inspection-of-tree_002dsitter-parse-tree-nodes"></span> + +<p>To aid in understanding the syntax of a language and in debugging of +Lisp program that use the syntax tree, Emacs provides an “explore” +mode, which displays the syntax tree of the source in the current +buffer in real time. Emacs also comes with an “inspect mode”, which +displays information of the nodes at point in the mode-line. +</p> +<dl class="def"> +<dt id="index-treesit_002dexplore_002dmode"><span class="category">Command: </span><span><strong>treesit-explore-mode</strong><a href='#index-treesit_002dexplore_002dmode' class='copiable-anchor'> ¶</a></span></dt> +<dd><p>This mode pops up a window displaying the syntax tree of the source in +the current buffer. Selecting text in the source buffer highlights +the corresponding nodes in the syntax tree display. Clicking +on nodes in the syntax tree highlights the corresponding text in the +source buffer. +</p></dd></dl> + <dl class="def"> <dt id="index-treesit_002dinspect_002dmode"><span class="category">Command: </span><span><strong>treesit-inspect-mode</strong><a href='#index-treesit_002dinspect_002dmode' class='copiable-anchor'> ¶</a></span></dt> <dd><p>This minor mode displays on the mode-line the node that <em>starts</em> -at point. The mode-line will display +at point. 
For example, the mode-line can display </p> <div class="example"> <pre class="example"><var>parent</var> <var>field</var>: (<var>node</var> (<var>child</var> (…))) </pre></div> -<p>where <var>node</var>, <var>child</var>, etc, are nodes which begin at point. +<p>where <var>node</var>, <var>child</var>, etc., are nodes which begin at point. <var>parent</var> is the parent of <var>node</var>. <var>node</var> is displayed in -bold typeface. <var>field-name</var>s are field names of <var>node</var> and -<var>child</var>, etc. +a bold typeface. <var>field-name</var>s are field names of <var>node</var> and +of <var>child</var>, etc. </p> <p>If no node starts at point, i.e., point is in the middle of a node, then the mode line displays the earliest node that spans point, and @@ -343,7 +362,7 @@ language definition. <dt><span><code>token(<var>rule</var>)</code></span></dt> <dd><p>marks <var>rule</var> to produce a single leaf node. That is, instead of generating a parent node with individual child nodes under it, -everything is combined into a single leaf node. +everything is combined into a single leaf node. See <a href="Retrieving-Nodes.html">Retrieving Nodes</a>. </p></dd> <dt><span><code>token.immediate(<var>rule</var>)</code></span></dt> <dd><p>Normally, grammar rules ignore preceding whitespace; this diff --git a/admin/notes/tree-sitter/html-manual/Multiple-Languages.html b/admin/notes/tree-sitter/html-manual/Multiple-Languages.html index 6d1800fad72..0ae0b1897e1 100644 --- a/admin/notes/tree-sitter/html-manual/Multiple-Languages.html +++ b/admin/notes/tree-sitter/html-manual/Multiple-Languages.html @@ -273,12 +273,12 @@ takes care of compiling queries and other post-processing, and outputs a value that <var>treesit-range-settings</var> can have. </p> <p>It takes a series of <var>query-spec</var>s, where each <var>query-spec</var> is -a <var>query</var> preceded by zero or more pairs of <var>keyword</var> and -<var>value</var>. 
Each <var>query</var> is a tree-sitter query in either the +a <var>query</var> preceded by zero or more <var>keyword</var>/<var>value</var> +pairs. Each <var>query</var> is a tree-sitter query in either the string, s-expression or compiled form, or a function. </p> <p>If <var>query</var> is a tree-sitter query, it should be preceeded by two -<var>:keyword</var> <var>value</var> pairs, where the <code>:embed</code> keyword +<var>:keyword</var>/<var>value</var> pairs, where the <code>:embed</code> keyword specifies the embedded language, and the <code>:host</code> keyword specified the host language. </p> diff --git a/admin/notes/tree-sitter/html-manual/Parser_002dbased-Font-Lock.html b/admin/notes/tree-sitter/html-manual/Parser_002dbased-Font-Lock.html index 72d82e6ee6d..e04a730b05c 100644 --- a/admin/notes/tree-sitter/html-manual/Parser_002dbased-Font-Lock.html +++ b/admin/notes/tree-sitter/html-manual/Parser_002dbased-Font-Lock.html @@ -130,17 +130,17 @@ example: </pre></div> <p>This function takes a series of <var>query-spec</var>s, where each -<var>query-spec</var> is a <var>query</var> preceded by multiple pairs of -<var>:keyword</var> and <var>value</var>. Each <var>query</var> is a tree-sitter -query in either the string, s-expression or compiled form. -</p> -<p>For each <var>query</var>, the <var>:keyword</var> and <var>value</var> pairs add -meta information to it. The <code>:lang</code> keyword declares -<var>query</var>’s language. The <code>:feature</code> keyword sets the feature -name of <var>query</var>. Users can control which features are enabled -with <code>font-lock-maximum-decoration</code> and +<var>query-spec</var> is a <var>query</var> preceded by one or more +<var>:keyword</var>/<var>value</var> pairs. Each <var>query</var> is a +tree-sitter query in either the string, s-expression or compiled form. +</p> +<p>For each <var>query</var>, the <var>:keyword</var>/<var>value</var> pairs that +precede it add meta information to it. 
The <code>:lang</code> keyword +declares <var>query</var>’s language. The <code>:feature</code> keyword sets the +feature name of <var>query</var>. Users can control which features are +enabled with <code>font-lock-maximum-decoration</code> and <code>treesit-font-lock-feature-list</code> (described below). These two -keywords are mandated. +keywords are mandatory. </p> <p>Other keywords are optional: </p> @@ -177,24 +177,6 @@ priority. If a capture name is neither a face nor a function, it is ignored. </p></dd></dl> -<p>Contextual entities, like multi-line strings, or <code>/* */</code> style -comments, need special care, because change in these entities might -cause change in a large portion of the buffer. For example, inserting -the closing comment delimiter <code>*/</code> will change all the text -between it and the opening delimiter to comment face. Such entities -should be captured in a special name <code>contextual</code>, so Emacs can -correctly update their fontification. Here is an example for -comments: -</p> -<div class="example"> -<pre class="example">(treesit-font-lock-rules - :language 'javascript - :feature 'comment - :override t - '((comment) @font-lock-comment-face) - (comment) @contextual)) -</pre></div> - <dl class="def"> <dt id="index-treesit_002dfont_002dlock_002dfeature_002dlist"><span class="category">Variable: </span><span><strong>treesit-font-lock-feature-list</strong><a href='#index-treesit_002dfont_002dlock_002dfeature_002dlist' class='copiable-anchor'> ¶</a></span></dt> <dd><p>This is a list of lists of feature symbols. Each element of the list @@ -208,11 +190,20 @@ activated. list disables the corresponding query during font-lock. </p> <p>Common feature names, for many programming languages, include -function-name, type, variable-name (left-hand-side or <acronym>LHS</acronym> of -assignments), builtin, constant, keyword, string-interpolation, -comment, doc, string, operator, preprocessor, escape-sequence, and key -(in key-value pairs). 
Major modes are free to subdivide or extend -these common features. +<code>definition</code>, <code>type</code>, <code>assignment</code>, <code>builtin</code>, +<code>constant</code>, <code>keyword</code>, <code>string-interpolation</code>, +<code>comment</code>, <code>doc</code>, <code>string</code>, <code>operator</code>, +<code>preprocessor</code>, <code>escape-sequence</code>, and <code>key</code>. Major +modes are free to subdivide or extend these common features. +</p> +<p>Some of these features warrant some explanation: <code>definition</code> +highlights whatever is being defined, e.g., the function name in a +function definition, the struct name in a struct definition, the +variable name in a variable definition; <code>assignment</code> highlights +the whatever is being assigned to, e.g., the variable or field in an +assignment statement; <code>key</code> highlights keys in key-value pairs, +e.g., keys in a JSON object, or a Python dictionary; <code>doc</code> +highlights docstrings or doc-comments. </p> <p>For example, the value of this variable could be: </p><div class="example"> diff --git a/admin/notes/tree-sitter/html-manual/Parser_002dbased-Indentation.html b/admin/notes/tree-sitter/html-manual/Parser_002dbased-Indentation.html index 5ea1f9bc332..3027bbaae95 100644 --- a/admin/notes/tree-sitter/html-manual/Parser_002dbased-Indentation.html +++ b/admin/notes/tree-sitter/html-manual/Parser_002dbased-Indentation.html @@ -184,6 +184,14 @@ first child where parent is <code>argument_list</code>, use </pre></div> </dd> +<dt id='index-comment_002dend'><span><code>comment-end</code><a href='#index-comment_002dend' class='copiable-anchor'> ¶</a></span></dt> +<dd><p>This matcher is a function that is called with 3 arguments: +<var>node</var>, <var>parent</var>, and <var>bol</var>, and returns non-<code>nil</code> if +point is before a comment ending token. 
Comment ending tokens are +defined by regular expression <code>treesit-comment-end</code> +(see <a href="Tree_002dsitter-major-modes.html">treesit-comment-end</a>). +</p> +</dd> <dt id='index-first_002dsibling'><span><code>first-sibling</code><a href='#index-first_002dsibling' class='copiable-anchor'> ¶</a></span></dt> <dd><p>This anchor is a function that is called with 3 arguments: <var>node</var>, <var>parent</var>, and <var>bol</var>, and returns the start of the first child @@ -219,12 +227,28 @@ charater on the previous line. </p> </dd> <dt id='index-point_002dmin'><span><code>point-min</code><a href='#index-point_002dmin' class='copiable-anchor'> ¶</a></span></dt> -<dd><p>This anchor is a function is called with 3 arguments: <var>node</var>, +<dd><p>This anchor is a function that is called with 3 arguments: <var>node</var>, <var>parent</var>, and <var>bol</var>, and returns the beginning of the buffer. This is useful as the beginning of the buffer is always at column 0. +</p> +</dd> +<dt id='index-comment_002dstart'><span><code>comment-start</code><a href='#index-comment_002dstart' class='copiable-anchor'> ¶</a></span></dt> +<dd><p>This anchor is a function that is called with 3 arguments: <var>node</var>, +<var>parent</var>, and <var>bol</var>, and returns the position right after the +comment-start token. Comment-start tokens are defined by regular +expression <code>treesit-comment-start</code> (see <a href="Tree_002dsitter-major-modes.html">treesit-comment-start</a>). This function assumes <var>parent</var> is +the comment node. +</p> +</dd> +<dt id='index-coment_002dstart_002dskip'><span><code>coment-start-skip</code><a href='#index-coment_002dstart_002dskip' class='copiable-anchor'> ¶</a></span></dt> +<dd><p>This anchor is a function that is called with 3 arguments: <var>node</var>, +<var>parent</var>, and <var>bol</var>, and returns the position after the +comment-start token and any whitespace characters following that +token. 
Comment-start tokens are defined by regular expression +<code>treesit-comment-start</code>. This function assumes <var>parent</var> is +the comment node. </p></dd> </dl> - </dd></dl> <span id="Indentation-utilities"></span><h3 class="heading">Indentation utilities</h3> diff --git a/admin/notes/tree-sitter/html-manual/Parsing-Program-Source.html b/admin/notes/tree-sitter/html-manual/Parsing-Program-Source.html index ea22421ac4c..a0b5775f11f 100644 --- a/admin/notes/tree-sitter/html-manual/Parsing-Program-Source.html +++ b/admin/notes/tree-sitter/html-manual/Parsing-Program-Source.html @@ -106,7 +106,7 @@ source files that mix multiple programming languages. <ul class="section-toc"> <li><a href="Language-Definitions.html" accesskey="1">Tree-sitter Language Definitions</a></li> <li><a href="Using-Parser.html" accesskey="2">Using Tree-sitter Parser</a></li> -<li><a href="Retrieving-Node.html" accesskey="3">Retrieving Node</a></li> +<li><a href="Retrieving-Nodes.html" accesskey="3">Retrieving Nodes</a></li> <li><a href="Accessing-Node-Information.html" accesskey="4">Accessing Node Information</a></li> <li><a href="Pattern-Matching.html" accesskey="5">Pattern Matching Tree-sitter Nodes</a></li> <li><a href="Multiple-Languages.html" accesskey="6">Parsing Text in Multiple Languages</a></li> diff --git a/admin/notes/tree-sitter/html-manual/Tree_002dsitter-C-API.html b/admin/notes/tree-sitter/html-manual/Tree_002dsitter-C-API.html index a80c2326160..29d51eecf73 100644 --- a/admin/notes/tree-sitter/html-manual/Tree_002dsitter-C-API.html +++ b/admin/notes/tree-sitter/html-manual/Tree_002dsitter-C-API.html @@ -133,7 +133,7 @@ ts_node_is_null ts_node_is_named treesit-node-check ts_node_is_missing treesit-node-check ts_node_is_extra treesit-node-check -ts_node_has_changes treesit-node-check +ts_node_has_changes ts_node_has_error treesit-node-check ts_node_parent treesit-node-parent ts_node_child treesit-node-child diff --git a/admin/notes/tree-sitter/html-manual/Using-Parser.html 
b/admin/notes/tree-sitter/html-manual/Using-Parser.html index c478a39e556..a4f31f90897 100644 --- a/admin/notes/tree-sitter/html-manual/Using-Parser.html +++ b/admin/notes/tree-sitter/html-manual/Using-Parser.html @@ -33,7 +33,7 @@ developing GNU and promoting software freedom." --> <link href="Index.html" rel="index" title="Index"> <link href="index.html#SEC_Contents" rel="contents" title="Table of Contents"> <link href="Parsing-Program-Source.html" rel="up" title="Parsing Program Source"> -<link href="Retrieving-Node.html" rel="next" title="Retrieving Node"> +<link href="Retrieving-Nodes.html" rel="next" title="Retrieving Nodes"> <link href="Language-Definitions.html" rel="prev" title="Language Definitions"> <style type="text/css"> <!-- @@ -63,7 +63,7 @@ ul.no-bullet {list-style: none} <div class="section" id="Using-Parser"> <div class="header"> <p> -Next: <a href="Retrieving-Node.html" accesskey="n" rel="next">Retrieving Node</a>, Previous: <a href="Language-Definitions.html" accesskey="p" rel="prev">Tree-sitter Language Definitions</a>, Up: <a href="Parsing-Program-Source.html" accesskey="u" rel="up">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p> +Next: <a href="Retrieving-Nodes.html" accesskey="n" rel="next">Retrieving Nodes</a>, Previous: <a href="Language-Definitions.html" accesskey="p" rel="prev">Tree-sitter Language Definitions</a>, Up: <a href="Parsing-Program-Source.html" accesskey="u" rel="up">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p> </div> <hr> <span id="Using-Tree_002dsitter-Parser"></span><h3 class="section">37.2 Using Tree-sitter Parser</h3> @@ -176,11 +176,53 @@ there is no way to update the result. the root node of the generated syntax tree. 
</p></dd></dl> +<span id="Be-notified-by-changes-to-the-parse-tree"></span><h3 class="heading">Be notified by changes to the parse tree</h3> +<span id="index-update-callback_002c-for-tree_002dsitter-parse_002dtree"></span> +<span id="index-after_002dchange-notifier_002c-for-tree_002dsitter-parse_002dtree"></span> +<span id="index-tree_002dsitter-parse_002dtree_002c-update-and-after_002dchange-callback"></span> +<span id="index-notifiers_002c-tree_002dsitter"></span> + +<p>A Lisp program might want to be notified of text affected by +incremental parsing. For example, inserting a comment-closing token +converts text before that token into a comment. Even +though the text is not directly edited, it is deemed to be “changed” +nevertheless. +</p> +<p>Emacs lets a Lisp program to register callback functions +(a.k.a. <em>notifiers</em>) for this kind of changes. A notifier +function takes two arguments: <var>ranges</var> and <var>parser</var>. +<var>ranges</var> is a list of cons cells of the form <code>(<var>start</var> . <var>end</var>)</code><!-- /@w -->, where <var>start</var> and <var>end</var> mark the start and the +end positions of a range. <var>parser</var> is the parser issuing the +notification. +</p> +<p>Every time a parser reparses a buffer, it compares the old and new +parse-tree, computes the ranges in which nodes have changed, and +passes the ranges to notifier functions. +</p> +<dl class="def"> +<dt id="index-treesit_002dparser_002dadd_002dnotifier"><span class="category">Function: </span><span><strong>treesit-parser-add-notifier</strong> <em>parser function</em><a href='#index-treesit_002dparser_002dadd_002dnotifier' class='copiable-anchor'> ¶</a></span></dt> +<dd><p>This function adds <var>function</var> to <var>parser</var>’s list of +after-change notifier functions. <var>function</var> must be a function +symbol, not a lambda function (see <a href="Anonymous-Functions.html">Anonymous Functions</a>). 
+</p></dd></dl> + +<dl class="def"> +<dt id="index-treesit_002dparser_002dremove_002dnotifier"><span class="category">Function: </span><span><strong>treesit-parser-remove-notifier</strong> <em>parser function</em><a href='#index-treesit_002dparser_002dremove_002dnotifier' class='copiable-anchor'> ¶</a></span></dt> +<dd><p>This function removes <var>function</var> from the list of <var>parser</var>’s +after-change notifier functions. <var>function</var> must be a function +symbol, rather than a lambda function. +</p></dd></dl> + +<dl class="def"> +<dt id="index-treesit_002dparser_002dnotifiers"><span class="category">Function: </span><span><strong>treesit-parser-notifiers</strong> <em>parser</em><a href='#index-treesit_002dparser_002dnotifiers' class='copiable-anchor'> ¶</a></span></dt> +<dd><p>This function returns the list of <var>parser</var>’s notifier functions. +</p></dd></dl> + </div> <hr> <div class="header"> <p> -Next: <a href="Retrieving-Node.html">Retrieving Node</a>, Previous: <a href="Language-Definitions.html">Tree-sitter Language Definitions</a>, Up: <a href="Parsing-Program-Source.html">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p> +Next: <a href="Retrieving-Nodes.html">Retrieving Nodes</a>, Previous: <a href="Language-Definitions.html">Tree-sitter Language Definitions</a>, Up: <a href="Parsing-Program-Source.html">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p> </div> diff --git a/admin/notes/tree-sitter/starter-guide b/admin/notes/tree-sitter/starter-guide index 84118c6f57b..700b020850d 100644 --- a/admin/notes/tree-sitter/starter-guide +++ b/admin/notes/tree-sitter/starter-guide @@ -70,7 +70,7 @@ organization has all the "official" language definitions: * Setting up for adding major mode 
features -Start Emacs, and load tree-sitter with +Start Emacs and load tree-sitter with (require 'treesit) @@ -78,20 +78,22 @@ Now check if Emacs is built with tree-sitter library (treesit-available-p) -Users toggle tree-sitter for each major mode with a central variable, -‘treesit-settings’. You can check whether to enable tree-sitter with -‘treesit-ready-p’, which takes a major-mode symbol and one or more -language symbol. The major mode body should use a branch like this: +* Tree-sitter major modes -#+begin_src emacs-lisp -(cond - ;; Tree-sitter setup. - ((treesit-ready-p 'python-mode 'python) - ...) - (t - ;; Non-tree-sitter setup. - ...)) -#+end_src +Tree-sitter modes should be separate major modes, so other modes +inheriting from the original mode don't break if tree-sitter is +enabled. For example js2-mode inherits js-mode, we can't enable +tree-sitter in js-mode, lest js-mode would not setup things that +js2-mode expects to inherit from. So it's best to use separate major +modes. + +If the tree-sitter variant and the "native" variant could share some +setup, you can create a "base mode", which only contains the common +setup. For example, there is python-base-mode (shared), python-mode +(native), and python-ts-mode (tree-sitter). + +In the tree-sitter mode, check if we can use tree-sitter with +treesit-ready-p, it will error out if tree-sitter is not ready. * Naming convention @@ -115,14 +117,6 @@ also allow more optional arguments with (&rest _), for future extensibility. For OVERRIDE check out the docstring of treesit-font-lock-rules. -Contextual syntax like multi-line comments and multi-line strings, -needs special care. Because change in this type of things can affect -a large portion of the buffer. Think of inserting a closing comment -delimeter, it causes all the text before it (to the opening comment -delimeter) to change to comment face. 
These things needs to be -captured in a special name “contextual”, so that Emacs can give them -special treatment. Se the example below for how it looks like. - ** Query syntax There are two types of nodes, named, like (identifier), @@ -159,16 +153,12 @@ These are the common syntax, see all of them in the manual ** Query references -But how do one come up with the queries? Take python for an -example, open any python source file, evaluate - - (treesit-parser-create 'python) - -so there is a parser available, then enable ‘treesit-inspect-mode’. -Now you should see information of the node under point in -mode-line. Move around and you should be able to get a good -picture. Besides this, you can consult the grammar of the language -definition. For example, Python’s grammar file is at +But how do one come up with the queries? Take python for an example, +open any python source file, type M-x treesit-explore-mode RET. Now +you should see the parse-tree in a separate window, automatically +updated as you select text or edit the buffer. Besides this, you can +consult the grammar of the language definition. For example, Python’s +grammar file is at https://github.com/tree-sitter/tree-sitter-python/blob/master/grammar.js @@ -349,8 +339,9 @@ Then you set ‘treesit-simple-indent-rules’ to your rules, and call * Imenu -Not much to say except for utilizing ‘treesit-induce-sparse-tree’. -See ‘js--treesit-imenu-1’ in js.el for an example. +Not much to say except for utilizing ‘treesit-induce-sparse-tree’ (and +explicitly pass a LIMIT argument: most of the time you don't need more +than 10). See ‘js--treesit-imenu-1’ in js.el for an example. Once you have the index builder, set ‘imenu-create-index-function’ to it.