Diffstat (limited to 'admin/notes/tree-sitter/html-manual/Pattern-Matching.html')
-rw-r--r-- admin/notes/tree-sitter/html-manual/Pattern-Matching.html 199
1 files changed, 110 insertions, 89 deletions
diff --git a/admin/notes/tree-sitter/html-manual/Pattern-Matching.html b/admin/notes/tree-sitter/html-manual/Pattern-Matching.html
index e14efe71629..21eb4702b12 100644
--- a/admin/notes/tree-sitter/html-manual/Pattern-Matching.html
+++ b/admin/notes/tree-sitter/html-manual/Pattern-Matching.html
@@ -34,7 +34,7 @@ developing GNU and promoting software freedom." -->
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Parsing-Program-Source.html" rel="up" title="Parsing Program Source">
<link href="Multiple-Languages.html" rel="next" title="Multiple Languages">
-<link href="Accessing-Node.html" rel="prev" title="Accessing Node">
+<link href="Accessing-Node-Information.html" rel="prev" title="Accessing Node Information">
<style type="text/css">
<!--
a.copiable-anchor {visibility: hidden; text-decoration: none; line-height: 0em}
@@ -63,32 +63,32 @@ ul.no-bullet {list-style: none}
<div class="section" id="Pattern-Matching">
<div class="header">
<p>
-Next: <a href="Multiple-Languages.html" accesskey="n" rel="next">Parsing Text in Multiple Languages</a>, Previous: <a href="Accessing-Node.html" accesskey="p" rel="prev">Accessing Node Information</a>, Up: <a href="Parsing-Program-Source.html" accesskey="u" rel="up">Parsing Program Source</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
+Next: <a href="Multiple-Languages.html" accesskey="n" rel="next">Parsing Text in Multiple Languages</a>, Previous: <a href="Accessing-Node-Information.html" accesskey="p" rel="prev">Accessing Node Information</a>, Up: <a href="Parsing-Program-Source.html" accesskey="u" rel="up">Parsing Program Source</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<span id="Pattern-Matching-Tree_002dsitter-Nodes"></span><h3 class="section">37.5 Pattern Matching Tree-sitter Nodes</h3>
+<span id="index-pattern-matching-with-tree_002dsitter-nodes"></span>
-<p>Tree-sitter let us pattern match with a small declarative language.
-Pattern matching consists of two steps: first tree-sitter matches a
-<em>pattern</em> against nodes in the syntax tree, then it <em>captures</em>
-specific nodes in that pattern and returns the captured nodes.
+<span id="index-capturing_002c-tree_002dsitter-node"></span>
+<p>Tree-sitter lets Lisp programs match patterns using a small
+declarative language. This pattern matching consists of two steps:
+first tree-sitter matches a <em>pattern</em> against nodes in the syntax
+tree, then it <em>captures</em> specific nodes that matched the pattern
+and returns the captured nodes.
</p>
<p>We describe first how to write the most basic query pattern and how to
-capture nodes in a pattern, then the pattern-match function, finally
-more advanced pattern syntax.
+capture nodes in a pattern, then the pattern-matching function, and
+finally the more advanced pattern syntax.
</p>
<span id="Basic-query-syntax"></span><h3 class="heading">Basic query syntax</h3>
-<span id="index-Tree_002dsitter-query-syntax"></span>
-<span id="index-Tree_002dsitter-query-pattern"></span>
+<span id="index-tree_002dsitter-query-pattern-syntax"></span>
+<span id="index-pattern-syntax_002c-tree_002dsitter-query"></span>
+<span id="index-query_002c-tree_002dsitter"></span>
<p>A <em>query</em> consists of multiple <em>patterns</em>. Each pattern is an
s-expression that matches a certain node in the syntax tree. A
-pattern has the following shape:
+pattern has the form <code>(<var>type</var>&nbsp;(<var>child</var>&hellip;))</code><!-- /@w -->
</p>
-<div class="example">
-<pre class="example">(<var>type</var> <var>child</var>...)
-</pre></div>
-
<p>For example, a pattern that matches a <code>binary_expression</code> node that
contains <code>number_literal</code> child nodes would look like
</p>
@@ -96,19 +96,20 @@ contains <code>number_literal</code> child nodes would look like
<pre class="example">(binary_expression (number_literal))
</pre></div>
-<p>To <em>capture</em> a node in the query pattern above, append
-<code>@capture-name</code> after the node pattern you want to capture. For
-example,
+<p>To <em>capture</em> a node using the query pattern above, append
+<code>@<var>capture-name</var></code> after the node pattern you want to
+capture. For example,
</p>
<div class="example">
<pre class="example">(binary_expression (number_literal) @number-in-exp)
</pre></div>
<p>captures <code>number_literal</code> nodes that are inside a
-<code>binary_expression</code> node with capture name <code>number-in-exp</code>.
+<code>binary_expression</code> node with the capture name
+<code>number-in-exp</code>.
</p>
-<p>We can capture the <code>binary_expression</code> node too, with capture
-name <code>biexp</code>:
+<p>We can capture the <code>binary_expression</code> node as well, with, for
+example, the capture name <code>biexp</code>:
</p>
<div class="example">
<pre class="example">(binary_expression
@@ -117,34 +118,40 @@ name <code>biexp</code>:
<span id="Query-function"></span><h3 class="heading">Query function</h3>
-<p>Now we can introduce the query functions.
+<span id="index-query-functions_002c-tree_002dsitter"></span>
+<p>Now we can introduce the <em>query functions</em>.
</p>
<dl class="def">
<dt id="index-treesit_002dquery_002dcapture"><span class="category">Function: </span><span><strong>treesit-query-capture</strong> <em>node query &amp;optional beg end node-only</em><a href='#index-treesit_002dquery_002dcapture' class='copiable-anchor'> &para;</a></span></dt>
-<dd><p>This function matches patterns in <var>query</var> in <var>node</var>.
-Parameter <var>query</var> can be either a string, a s-expression, or a
+<dd><p>This function matches patterns in <var>query</var> within <var>node</var>.
+The argument <var>query</var> can be either a string, an s-expression, or a
compiled query object. For now, we focus on the string syntax;
s-expression syntax and compiled query are described at the end of the
section.
</p>
-<p>Parameter <var>node</var> can also be a parser or a language symbol. A
+<p>The argument <var>node</var> can also be a parser or a language symbol. A
parser means using its root node, a language symbol means find or
create a parser for that language in the current buffer, and use the
root node.
</p>
-<p>The function returns all captured nodes in a list of
-<code>(<var>capture_name</var> . <var>node</var>)</code>. If <var>node-only</var> is
-non-nil, a list of node is returned instead. If <var>beg</var> and
-<var>end</var> are both non-nil, this function only pattern matches nodes
-in that range.
+<p>The function returns all the captured nodes in a list of the form
+<code>(<var><span class="nolinebreak">capture_name</span></var>&nbsp;.&nbsp;<var>node</var>)</code><!-- /@w -->. If <var>node-only</var> is
+non-<code>nil</code>, it returns the list of nodes instead. By default the
+entire text of <var>node</var> is searched, but if <var>beg</var> and <var>end</var>
+are both non-<code>nil</code>, they specify the region of buffer text where
+this function should match nodes. Any matching node whose span
+overlaps with the region between <var>beg</var> and <var>end</var> is captured;
+it doesn&rsquo;t have to be completely in the region.
</p>
<span id="index-treesit_002dquery_002derror"></span>
-<p>This function raise a <var>treesit-query-error</var> if <var>query</var> is
-malformed. The signal data contains a description of the specific
-error. You can use <code>treesit-query-validate</code> to debug the query.
+<span id="index-treesit_002dquery_002dvalidate"></span>
+<p>This function raises the <code>treesit-query-error</code> error if
+<var>query</var> is malformed. The signal data contains a description of
+the specific error. You can use <code>treesit-query-validate</code> to
+validate and debug the query.
</p></dd></dl>
-<p>For example, suppose <var>node</var>&rsquo;s content is <code>1 + 2</code>, and
+<p>For example, suppose <var>node</var>&rsquo;s text is <code>1 + 2</code>, and
<var>query</var> is
</p>
<div class="example">
@@ -153,7 +160,7 @@ error. You can use <code>treesit-query-validate</code> to debug the query.
(number_literal) @number-in-exp) @biexp&quot;)
</pre></div>
-<p>Querying that query would return
+<p>Matching that query would return
</p>
<div class="example">
<pre class="example">(treesit-query-capture node query)
@@ -162,8 +169,8 @@ error. You can use <code>treesit-query-validate</code> to debug the query.
(number-in-exp . <var>&lt;node for &quot;2&quot;&gt;</var>))
</pre></div>
-<p>As we mentioned earlier, a <var>query</var> could contain multiple
-patterns. For example, it could have two top-level patterns:
+<p>As mentioned earlier, <var>query</var> could contain multiple patterns.
+For example, it could have two top-level patterns:
</p>
<div class="example">
<pre class="example">(setq query
@@ -173,15 +180,15 @@ patterns. For example, it could have two top-level patterns:
<dl class="def">
<dt id="index-treesit_002dquery_002dstring"><span class="category">Function: </span><span><strong>treesit-query-string</strong> <em>string query language</em><a href='#index-treesit_002dquery_002dstring' class='copiable-anchor'> &para;</a></span></dt>
-<dd><p>This function parses <var>string</var> with <var>language</var>, pattern matches
-its root node with <var>query</var>, and returns the result.
+<dd><p>This function parses <var>string</var> with <var>language</var>, matches its
+root node with <var>query</var>, and returns the result.
</p></dd></dl>
<span id="More-query-syntax"></span><h3 class="heading">More query syntax</h3>
-<p>Besides node type and capture, tree-sitter&rsquo;s query syntax can express
-anonymous node, field name, wildcard, quantification, grouping,
-alternation, anchor, and predicate.
+<p>Besides node type and capture, tree-sitter&rsquo;s pattern syntax can
+express anonymous nodes, field names, wildcards, quantification,
+grouping, alternation, anchors, and predicates.
</p>
<span id="Anonymous-node"></span><h4 class="subheading">Anonymous node</h4>
@@ -194,9 +201,9 @@ pattern matching (and capturing) keyword <code>return</code> would be
<span id="Wild-card"></span><h4 class="subheading">Wild card</h4>
-<p>In a query pattern, &lsquo;<samp>(_)</samp>&rsquo; matches any named node, and &lsquo;<samp>_</samp>&rsquo;
-matches any named and anonymous node. For example, to capture any
-named child of a <code>binary_expression</code> node, the pattern would be
+<p>In a pattern, &lsquo;<samp>(_)</samp>&rsquo; matches any named node, and &lsquo;<samp>_</samp>&rsquo; matches
+any named or anonymous node. For example, to capture any named child
+of a <code>binary_expression</code> node, the pattern would be
</p>
<div class="example">
<pre class="example">(binary_expression (_) @in_biexp)
@@ -204,7 +211,9 @@ named child of a <code>binary_expression</code> node, the pattern would be
<span id="Field-name"></span><h4 class="subheading">Field name</h4>
-<p>We can capture child nodes that has specific field names:
+<p>It is possible to capture child nodes that have specific field names.
+In the pattern below, <code>declarator</code> and <code>body</code> are field
+names, indicated by the colon following them.
</p>
<div class="example">
<pre class="example">(function_definition
@@ -212,8 +221,8 @@ named child of a <code>binary_expression</code> node, the pattern would be
body: (_) @func-body)
</pre></div>
-<p>We can also capture a node that doesn&rsquo;t have certain field, say, a
-<code>function_definition</code> without a <code>body</code> field.
+<p>It is also possible to capture a node that doesn&rsquo;t have a certain
+field, say, a <code>function_definition</code> without a <code>body</code> field.
</p>
<div class="example">
<pre class="example">(function_definition !body) @func-no-body
@@ -221,19 +230,20 @@ named child of a <code>binary_expression</code> node, the pattern would be
<span id="Quantify-node"></span><h4 class="subheading">Quantify node</h4>
+<span id="index-quantify-node_002c-tree_002dsitter"></span>
<p>Tree-sitter recognizes quantification operators &lsquo;<samp>*</samp>&rsquo;, &lsquo;<samp>+</samp>&rsquo; and
&lsquo;<samp>?</samp>&rsquo;. Their meanings are the same as in regular expressions:
&lsquo;<samp>*</samp>&rsquo; matches the preceding pattern zero or more times, &lsquo;<samp>+</samp>&rsquo;
matches one or more times, and &lsquo;<samp>?</samp>&rsquo; matches zero or one time.
</p>
-<p>For example, this pattern matches <code>type_declaration</code> nodes
-that has <em>zero or more</em> <code>long</code> keyword.
+<p>For example, the following pattern matches <code>type_declaration</code>
+nodes that have <em>zero or more</em> <code>long</code> keywords.
</p>
<div class="example">
<pre class="example">(type_declaration &quot;long&quot;*) @long-type
</pre></div>
-<p>And this pattern matches a type declaration that has zero or one
+<p>The following pattern matches a type declaration that has zero or one
<code>long</code> keyword:
</p>
<div class="example">
@@ -242,8 +252,8 @@ that has <em>zero or more</em> <code>long</code> keyword.
<span id="Grouping"></span><h4 class="subheading">Grouping</h4>
-<p>Similar to groups in regular expression, we can bundle patterns into a
-group and apply quantification operators to it. For example, to
+<p>Similar to groups in regular expressions, we can bundle patterns into
+groups and apply quantification operators to them. For example, to
express a comma separated list of identifiers, one could write
</p>
<div class="example">
@@ -253,9 +263,9 @@ express a comma separated list of identifiers, one could write
<span id="Alternation"></span><h4 class="subheading">Alternation</h4>
<p>Again, similar to regular expressions, we can express &ldquo;match any one
-from this group of patterns&rdquo; in the query pattern. The syntax is a
-list of patterns enclosed in square brackets. For example, to capture
-some keywords in C, the query pattern would be
+from this group of patterns&rdquo; in a pattern. The syntax is a list of
+patterns enclosed in square brackets. For example, to capture some
+keywords in C, the pattern would be
</p>
<div class="example">
<pre class="example">[
@@ -277,11 +287,13 @@ adjacent children:
<div class="example">
<pre class="example">;; Anchor the child with the end of its parent.
(compound_expression (_) @last-child .)
+</pre><pre class="example">
-;; Anchor the child with the beginning of its parent.
+</pre><pre class="example">;; Anchor the child with the beginning of its parent.
(compound_expression . (_) @first-child)
+</pre><pre class="example">
-;; Anchor two adjacent children.
+</pre><pre class="example">;; Anchor two adjacent children.
(compound_expression
(_) @prev-child
.
@@ -293,8 +305,8 @@ nodes.
</p>
<span id="Predicate"></span><h4 class="subheading">Predicate</h4>
-<p>We can add predicate constraints to a pattern. For example, if we use
-the following query pattern
+<p>It is possible to add predicate constraints to a pattern. For
+example, with the following pattern:
</p>
<div class="example">
<pre class="example">(
@@ -303,33 +315,35 @@ the following query pattern
)
</pre></div>
-<p>Then tree-sitter only matches arrays where the first element equals to
+<p>tree-sitter only matches arrays where the first element equals
the last element. To attach a predicate to a pattern, we need to
-group then together. A predicate always starts with a &lsquo;<samp>#</samp>&rsquo;.
+group them together. A predicate always starts with a &lsquo;<samp>#</samp>&rsquo;.
Currently there are two predicates, <code>#equal</code> and <code>#match</code>.
</p>
<dl class="def">
<dt id="index-equal-1"><span class="category">Predicate: </span><span><strong>equal</strong> <em>arg1 arg2</em><a href='#index-equal-1' class='copiable-anchor'> &para;</a></span></dt>
-<dd><p>Matches if <var>arg1</var> equals to <var>arg2</var>. Arguments can be either a
-string or a capture name. Capture names represent the text that the
+<dd><p>Matches if <var>arg1</var> is equal to <var>arg2</var>. Arguments can be either
+strings or capture names. Capture names represent the text that the
captured node spans in the buffer.
</p></dd></dl>
<dl class="def">
-<dt id="index-match"><span class="category">Predicate: </span><span><strong>match</strong> <em>regexp capture-name</em><a href='#index-match' class='copiable-anchor'> &para;</a></span></dt>
-<dd><p>Matches if the text that <var>capture-name</var>’s node spans in the buffer
+<dt id="index-match-1"><span class="category">Predicate: </span><span><strong>match</strong> <em>regexp capture-name</em><a href='#index-match-1' class='copiable-anchor'> &para;</a></span></dt>
+<dd><p>Matches if the text that <var>capture-name</var>&rsquo;s node spans in the buffer
matches regular expression <var>regexp</var>. Matching is case-sensitive.
</p></dd></dl>
-<p>Note that a predicate can only refer to capture names appeared in the
-same pattern. Indeed, it makes little sense to refer to capture names
-in other patterns anyway.
+<p>Note that a predicate can only refer to capture names that appear in
+the same pattern. Indeed, it makes little sense to refer to capture
+names in other patterns.
</p>
<span id="S_002dexpression-patterns"></span><h3 class="heading">S-expression patterns</h3>
-<p>Besides strings, Emacs provides a s-expression based syntax for query
-patterns. It largely resembles the string-based syntax. For example,
-the following pattern
+<span id="index-tree_002dsitter-patterns-as-sexps"></span>
+<span id="index-patterns_002c-tree_002dsitter_002c-in-sexp-form"></span>
+<p>Besides strings, Emacs provides a s-expression based syntax for
+tree-sitter patterns. It largely resembles the string-based syntax.
+For example, the following query
</p>
<div class="example">
<pre class="example">(treesit-query-capture
@@ -353,9 +367,8 @@ the following pattern
[&quot;return&quot; &quot;break&quot;] @keyword))
</pre></div>
-<p>Most pattern syntax can be written directly as strange but
-never-the-less valid s-expressions. Only a few of them needs
-modification:
+<p>Most patterns can be written directly as strange but nevertheless
+valid s-expressions. Only a few of them need modification:
</p>
<ul>
<li> Anchor &lsquo;<samp>.</samp>&rsquo; is written as <code>:anchor</code>.
@@ -386,42 +399,50 @@ change their &lsquo;<samp>#</samp>&rsquo; to &lsquo;<samp>:</samp>&rsquo;.
<span id="Compiling-queries"></span><h3 class="heading">Compiling queries</h3>
-<p>If a query will be used repeatedly, especially in tight loops, it is
-important to compile that query, because a compiled query is much
-faster than an uncompiled one. A compiled query can be used anywhere
-a query is accepted.
+<span id="index-compiling-tree_002dsitter-queries"></span>
+<span id="index-queries_002c-compiling"></span>
+<p>If a query is intended to be used repeatedly, especially in tight
+loops, it is important to compile that query, because a compiled query
+is much faster than an uncompiled one. A compiled query can be used
+anywhere a query is accepted.
</p>
<dl class="def">
<dt id="index-treesit_002dquery_002dcompile"><span class="category">Function: </span><span><strong>treesit-query-compile</strong> <em>language query</em><a href='#index-treesit_002dquery_002dcompile' class='copiable-anchor'> &para;</a></span></dt>
<dd><p>This function compiles <var>query</var> for <var>language</var> into a compiled
query object and returns it.
</p>
-<p>This function raise a <var>treesit-query-error</var> if <var>query</var> is
-malformed. The signal data contains a description of the specific
-error. You can use <code>treesit-query-validate</code> to debug the query.
+<p>This function raises the <code>treesit-query-error</code> error if
+<var>query</var> is malformed. The signal data contains a description of
+the specific error. You can use <code>treesit-query-validate</code> to
+validate and debug the query.
+</p></dd></dl>
+
+<dl class="def">
+<dt id="index-treesit_002dquery_002dlanguage"><span class="category">Function: </span><span><strong>treesit-query-language</strong> <em>query</em><a href='#index-treesit_002dquery_002dlanguage' class='copiable-anchor'> &para;</a></span></dt>
+<dd><p>This function returns the language of <var>query</var>.
</p></dd></dl>
<dl class="def">
<dt id="index-treesit_002dquery_002dexpand"><span class="category">Function: </span><span><strong>treesit-query-expand</strong> <em>query</em><a href='#index-treesit_002dquery_002dexpand' class='copiable-anchor'> &para;</a></span></dt>
-<dd><p>This function expands the s-expression <var>query</var> into a string
-query.
+<dd><p>This function converts the s-expression <var>query</var> into the string
+format.
</p></dd></dl>
<dl class="def">
<dt id="index-treesit_002dpattern_002dexpand"><span class="category">Function: </span><span><strong>treesit-pattern-expand</strong> <em>pattern</em><a href='#index-treesit_002dpattern_002dexpand' class='copiable-anchor'> &para;</a></span></dt>
-<dd><p>This function expands the s-expression <var>pattern</var> into a string
-pattern.
+<dd><p>This function converts the s-expression <var>pattern</var> into the string
+format.
</p></dd></dl>
-<p>Finally, tree-sitter project&rsquo;s documentation about
-pattern-matching can be found at
+<p>For more details, read the tree-sitter project&rsquo;s documentation about
+pattern-matching, which can be found at
<a href="https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries">https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries</a>.
</p>
</div>
<hr>
<div class="header">
<p>
-Next: <a href="Multiple-Languages.html">Parsing Text in Multiple Languages</a>, Previous: <a href="Accessing-Node.html">Accessing Node Information</a>, Up: <a href="Parsing-Program-Source.html">Parsing Program Source</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
+Next: <a href="Multiple-Languages.html">Parsing Text in Multiple Languages</a>, Previous: <a href="Accessing-Node-Information.html">Accessing Node Information</a>, Up: <a href="Parsing-Program-Source.html">Parsing Program Source</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
</div>