diff options
Diffstat (limited to 'admin/notes/tree-sitter/html-manual/Language-Definitions.html')
-rw-r--r-- | admin/notes/tree-sitter/html-manual/Language-Definitions.html | 401 |
1 files changed, 0 insertions, 401 deletions
diff --git a/admin/notes/tree-sitter/html-manual/Language-Definitions.html b/admin/notes/tree-sitter/html-manual/Language-Definitions.html deleted file mode 100644 index 9b1e0021272..00000000000 --- a/admin/notes/tree-sitter/html-manual/Language-Definitions.html +++ /dev/null @@ -1,401 +0,0 @@ -<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> -<html> -<!-- Created by GNU Texinfo 6.8, https://www.gnu.org/software/texinfo/ --> -<head> -<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> -<!-- This is the GNU Emacs Lisp Reference Manual -corresponding to Emacs version 29.0.50. - -Copyright © 1990-1996, 1998-2023 Free Software Foundation, Inc. - -Permission is granted to copy, distribute and/or modify this document -under the terms of the GNU Free Documentation License, Version 1.3 or -any later version published by the Free Software Foundation; with the -Invariant Sections being "GNU General Public License," with the -Front-Cover Texts being "A GNU Manual," and with the Back-Cover -Texts as in (a) below. A copy of the license is included in the -section entitled "GNU Free Documentation License." - -(a) The FSF's Back-Cover Text is: "You have the freedom to copy and -modify this GNU manual. Buying copies from the FSF supports it in -developing GNU and promoting software freedom." --> -<title>Language Definitions (GNU Emacs Lisp Reference Manual)</title> - -<meta name="description" content="Language Definitions (GNU Emacs Lisp Reference Manual)"> -<meta name="keywords" content="Language Definitions (GNU Emacs Lisp Reference Manual)"> -<meta name="resource-type" content="document"> -<meta name="distribution" content="global"> -<meta name="Generator" content="makeinfo"> -<meta name="viewport" content="width=device-width,initial-scale=1"> - -<link href="index.html" rel="start" title="Top"> -<link href="Index.html" rel="index" title="Index"> -<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents"> -<link href="Parsing-Program-Source.html" rel="up" title="Parsing Program Source"> -<link href="Using-Parser.html" rel="next" title="Using Parser"> -<style type="text/css"> -<!-- -a.copiable-anchor {visibility: hidden; text-decoration: none; line-height: 0em} -a.summary-letter {text-decoration: none} -blockquote.indentedblock {margin-right: 0em} -div.display {margin-left: 3.2em} -div.example {margin-left: 3.2em} -kbd {font-style: oblique} -pre.display {font-family: inherit} -pre.format {font-family: inherit} -pre.menu-comment {font-family: serif} -pre.menu-preformatted {font-family: serif} -span.nolinebreak {white-space: nowrap} -span.roman {font-family: initial; font-weight: normal} -span.sansserif {font-family: sans-serif; font-weight: normal} -span:hover a.copiable-anchor {visibility: visible} -ul.no-bullet {list-style: none} ---> -</style> -<link rel="stylesheet" type="text/css" href="./manual.css"> - - -</head> - -<body lang="en"> -<div class="section" id="Language-Definitions"> -<div class="header"> -<p> -Next: <a href="Using-Parser.html" accesskey="n" rel="next">Using Tree-sitter Parser</a>, Up: <a href="Parsing-Program-Source.html" accesskey="u" rel="up">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p> -</div> -<hr> -<span id="Tree_002dsitter-Language-Definitions"></span><h3 class="section">37.1 Tree-sitter Language Definitions</h3> -<span id="index-language-definitions_002c-for-tree_002dsitter"></span> - -<span id="Loading-a-language-definition"></span><h3 class="heading">Loading a language definition</h3> -<span id="index-loading-language-definition-for-tree_002dsitter"></span> - -<span id="index-language-argument_002c-for-tree_002dsitter"></span> -<p>Tree-sitter relies on language definitions to parse text in that -language. In Emacs, a language definition is represented by a symbol. -For example, the C language definition is represented as the symbol -<code>c</code>, and <code>c</code> can be passed to tree-sitter functions as the -<var>language</var> argument. -</p> -<span id="index-treesit_002dextra_002dload_002dpath"></span> -<span id="index-treesit_002dload_002dlanguage_002derror"></span> -<span id="index-treesit_002dload_002dsuffixes"></span> -<p>Tree-sitter language definitions are distributed as dynamic libraries. -In order to use a language definition in Emacs, you need to make sure -that the dynamic library is installed on the system. Emacs looks for -language definitions in several places, in the following order: -</p> -<ul> -<li> first, in the list of directories specified by the variable -<code>treesit-extra-load-path</code>; -</li><li> then, in the <samp>tree-sitter</samp> subdirectory of the directory -specified by <code>user-emacs-directory</code> (see <a href="Init-File.html">The Init File</a>); -</li><li> and finally, in the system’s default locations for dynamic libraries. -</li></ul> - -<p>In each of these directories, Emacs looks for a file with file-name -extensions specified by the variable <code>dynamic-library-suffixes</code>. -</p> -<p>If Emacs cannot find the library or has problems loading it, Emacs -signals the <code>treesit-load-language-error</code> error. The data of -that signal could be one of the following: -</p> -<dl compact="compact"> -<dt><span><code>(not-found <var>error-msg</var> …)</code></span></dt> -<dd><p>This means that Emacs could not find the language definition library. -</p></dd> -<dt><span><code>(symbol-error <var>error-msg</var>)</code></span></dt> -<dd><p>This means that Emacs could not find in the library the expected function -that every language definition library should export. -</p></dd> -<dt><span><code>(version-mismatch <var>error-msg</var>)</code></span></dt> -<dd><p>This means that the version of language definition library is incompatible -with that of the tree-sitter library. -</p></dd> -</dl> - -<p>In all of these cases, <var>error-msg</var> might provide additional -details about the failure. -</p> -<dl class="def"> -<dt id="index-treesit_002dlanguage_002davailable_002dp"><span class="category">Function: </span><span><strong>treesit-language-available-p</strong> <em>language &optional detail</em><a href='#index-treesit_002dlanguage_002davailable_002dp' class='copiable-anchor'> ¶</a></span></dt> -<dd><p>This function returns non-<code>nil</code> if the language definitions for -<var>language</var> exist and can be loaded. -</p> -<p>If <var>detail</var> is non-<code>nil</code>, return <code>(t . nil)</code> when -<var>language</var> is available, and <code>(nil . <var>data</var>)</code> when it’s -unavailable. <var>data</var> is the signal data of -<code>treesit-load-language-error</code>. -</p></dd></dl> - -<span id="index-treesit_002dload_002dname_002doverride_002dlist"></span> -<p>By convention, the file name of the dynamic library for <var>language</var> is -<samp>libtree-sitter-<var>language</var>.<var>ext</var></samp>, where <var>ext</var> is the -system-specific extension for dynamic libraries. Also by convention, -the function provided by that library is named -<code>tree_sitter_<var>language</var></code>. If a language definition library -doesn’t follow this convention, you should add an entry -</p> -<div class="example"> -<pre class="example">(<var>language</var> <var>library-base-name</var> <var>function-name</var>) -</pre></div> - -<p>to the list in the variable <code>treesit-load-name-override-list</code>, where -<var>library-base-name</var> is the basename of the dynamic library’s file name, -(usually, <samp>libtree-sitter-<var>language</var></samp>), and -<var>function-name</var> is the function provided by the library -(usually, <code>tree_sitter_<var>language</var></code>). For example, -</p> -<div class="example"> -<pre class="example">(cool-lang "libtree-sitter-coool" "tree_sitter_cooool") -</pre></div> - -<p>for a language that considers itself too “cool” to abide by -conventions. -</p> -<span id="index-language_002ddefinition-version_002c-compatibility"></span> -<dl class="def"> -<dt id="index-treesit_002dlanguage_002dversion"><span class="category">Function: </span><span><strong>treesit-language-version</strong> <em>&optional min-compatible</em><a href='#index-treesit_002dlanguage_002dversion' class='copiable-anchor'> ¶</a></span></dt> -<dd><p>This function returns the version of the language-definition -Application Binary Interface (<acronym>ABI</acronym>) supported by the -tree-sitter library. By default, it returns the latest ABI version -supported by the library, but if <var>min-compatible</var> is -non-<code>nil</code>, it returns the oldest ABI version which the library -still can support. Language definition libraries must be built for -ABI versions between the oldest and the latest versions supported by -the tree-sitter library, otherwise the library will be unable to load -them. -</p></dd></dl> - -<span id="Concrete-syntax-tree"></span><h3 class="heading">Concrete syntax tree</h3> -<span id="index-syntax-tree_002c-concrete"></span> - -<p>A syntax tree is what a parser generates. In a syntax tree, each node -represents a piece of text, and is connected to each other by a -parent-child relationship. For example, if the source text is -</p> -<div class="example"> -<pre class="example">1 + 2 -</pre></div> - -<p>its syntax tree could be -</p> -<div class="example"> -<pre class="example"> +--------------+ - | root "1 + 2" | - +--------------+ - | - +--------------------------------+ - | expression "1 + 2" | - +--------------------------------+ - | | | -+------------+ +--------------+ +------------+ -| number "1" | | operator "+" | | number "2" | -+------------+ +--------------+ +------------+ -</pre></div> - -<p>We can also represent it as an s-expression: -</p> -<div class="example"> -<pre class="example">(root (expression (number) (operator) (number))) -</pre></div> - -<span id="Node-types"></span><h4 class="subheading">Node types</h4> -<span id="index-node-types_002c-in-a-syntax-tree"></span> - -<span id="index-type-of-node_002c-tree_002dsitter"></span> -<span id="tree_002dsitter-node-type"></span><span id="index-named-node_002c-tree_002dsitter"></span> -<span id="tree_002dsitter-named-node"></span><span id="index-anonymous-node_002c-tree_002dsitter"></span> -<p>Names like <code>root</code>, <code>expression</code>, <code>number</code>, and -<code>operator</code> specify the <em>type</em> of the nodes. However, not all -nodes in a syntax tree have a type. Nodes that don’t have a type are -known as <em>anonymous nodes</em>, and nodes with a type are <em>named -nodes</em>. Anonymous nodes are tokens with fixed spellings, including -punctuation characters like bracket ‘<samp>]</samp>’, and keywords like -<code>return</code>. -</p> -<span id="Field-names"></span><h4 class="subheading">Field names</h4> - -<span id="index-field-name_002c-tree_002dsitter"></span> -<span id="index-tree_002dsitter-node-field-name"></span> -<span id="tree_002dsitter-node-field-name"></span><p>To make the syntax tree easier to analyze, many language definitions -assign <em>field names</em> to child nodes. For example, a -<code>function_definition</code> node could have a <code>declarator</code> and a -<code>body</code>: -</p> -<div class="example"> -<pre class="example">(function_definition - declarator: (declaration) - body: (compound_statement)) -</pre></div> - -<span id="Exploring-the-syntax-tree"></span><h3 class="heading">Exploring the syntax tree</h3> -<span id="index-explore-tree_002dsitter-syntax-tree"></span> -<span id="index-inspection-of-tree_002dsitter-parse-tree-nodes"></span> - -<p>To aid in understanding the syntax of a language and in debugging of -Lisp program that use the syntax tree, Emacs provides an “explore” -mode, which displays the syntax tree of the source in the current -buffer in real time. Emacs also comes with an “inspect mode”, which -displays information of the nodes at point in the mode-line. -</p> -<dl class="def"> -<dt id="index-treesit_002dexplore_002dmode"><span class="category">Command: </span><span><strong>treesit-explore-mode</strong><a href='#index-treesit_002dexplore_002dmode' class='copiable-anchor'> ¶</a></span></dt> -<dd><p>This mode pops up a window displaying the syntax tree of the source in -the current buffer. Selecting text in the source buffer highlights -the corresponding nodes in the syntax tree display. Clicking -on nodes in the syntax tree highlights the corresponding text in the -source buffer. -</p></dd></dl> - -<dl class="def"> -<dt id="index-treesit_002dinspect_002dmode"><span class="category">Command: </span><span><strong>treesit-inspect-mode</strong><a href='#index-treesit_002dinspect_002dmode' class='copiable-anchor'> ¶</a></span></dt> -<dd><p>This minor mode displays on the mode-line the node that <em>starts</em> -at point. For example, the mode-line can display -</p> -<div class="example"> -<pre class="example"><var>parent</var> <var>field</var>: (<var>node</var> (<var>child</var> (…))) -</pre></div> - -<p>where <var>node</var>, <var>child</var>, etc., are nodes which begin at point. -<var>parent</var> is the parent of <var>node</var>. <var>node</var> is displayed in -a bold typeface. <var>field-name</var>s are field names of <var>node</var> and -of <var>child</var>, etc. -</p> -<p>If no node starts at point, i.e., point is in the middle of a node, -then the mode line displays the earliest node that spans point, and -its immediate parent. -</p> -<p>This minor mode doesn’t create parsers on its own. It uses the first -parser in <code>(treesit-parser-list)</code> (see <a href="Using-Parser.html">Using Tree-sitter Parser</a>). -</p></dd></dl> - -<span id="Reading-the-grammar-definition"></span><h3 class="heading">Reading the grammar definition</h3> -<span id="index-reading-grammar-definition_002c-tree_002dsitter"></span> - -<p>Authors of language definitions define the <em>grammar</em> of a -programming language, which determines how a parser constructs a -concrete syntax tree out of the program text. In order to use the -syntax tree effectively, you need to consult the <em>grammar file</em>. -</p> -<p>The grammar file is usually <samp>grammar.js</samp> in a language -definition’s project repository. The link to a language definition’s -home page can be found on -<a href="https://tree-sitter.github.io/tree-sitter">tree-sitter’s -homepage</a>. -</p> -<p>The grammar definition is written in JavaScript. For example, the -rule matching a <code>function_definition</code> node looks like -</p> -<div class="example"> -<pre class="example">function_definition: $ => seq( - $.declaration_specifiers, - field('declarator', $.declaration), - field('body', $.compound_statement) -) -</pre></div> - -<p>The rules are represented by functions that take a single argument -<var>$</var>, representing the whole grammar. The function itself is -constructed by other functions: the <code>seq</code> function puts together -a sequence of children; the <code>field</code> function annotates a child -with a field name. If we write the above definition in the so-called -<em>Backus-Naur Form</em> (<acronym>BNF</acronym>) syntax, it would look like -</p> -<div class="example"> -<pre class="example">function_definition := - <declaration_specifiers> <declaration> <compound_statement> -</pre></div> - -<p>and the node returned by the parser would look like -</p> -<div class="example"> -<pre class="example">(function_definition - (declaration_specifier) - declarator: (declaration) - body: (compound_statement)) -</pre></div> - -<p>Below is a list of functions that one can see in a grammar definition. -Each function takes other rules as arguments and returns a new rule. -</p> -<dl compact="compact"> -<dt><span><code>seq(<var>rule1</var>, <var>rule2</var>, …)</code></span></dt> -<dd><p>matches each rule one after another. -</p></dd> -<dt><span><code>choice(<var>rule1</var>, <var>rule2</var>, …)</code></span></dt> -<dd><p>matches one of the rules in its arguments. -</p></dd> -<dt><span><code>repeat(<var>rule</var>)</code></span></dt> -<dd><p>matches <var>rule</var> for <em>zero or more</em> times. -This is like the ‘<samp>*</samp>’ operator in regular expressions. -</p></dd> -<dt><span><code>repeat1(<var>rule</var>)</code></span></dt> -<dd><p>matches <var>rule</var> for <em>one or more</em> times. -This is like the ‘<samp>+</samp>’ operator in regular expressions. -</p></dd> -<dt><span><code>optional(<var>rule</var>)</code></span></dt> -<dd><p>matches <var>rule</var> for <em>zero or one</em> time. -This is like the ‘<samp>?</samp>’ operator in regular expressions. -</p></dd> -<dt><span><code>field(<var>name</var>, <var>rule</var>)</code></span></dt> -<dd><p>assigns field name <var>name</var> to the child node matched by <var>rule</var>. -</p></dd> -<dt><span><code>alias(<var>rule</var>, <var>alias</var>)</code></span></dt> -<dd><p>makes nodes matched by <var>rule</var> appear as <var>alias</var> in the syntax -tree generated by the parser. For example, -</p> -<div class="example"> -<pre class="example">alias(preprocessor_call_exp, call_expression) -</pre></div> - -<p>makes any node matched by <code>preprocessor_call_exp</code> appear as -<code>call_expression</code>. -</p></dd> -</dl> - -<p>Below are grammar functions of lesser importance for reading a -language definition. -</p> -<dl compact="compact"> -<dt><span><code>token(<var>rule</var>)</code></span></dt> -<dd><p>marks <var>rule</var> to produce a single leaf node. That is, instead of -generating a parent node with individual child nodes under it, -everything is combined into a single leaf node. See <a href="Retrieving-Nodes.html">Retrieving Nodes</a>. -</p></dd> -<dt><span><code>token.immediate(<var>rule</var>)</code></span></dt> -<dd><p>Normally, grammar rules ignore preceding whitespace; this -changes <var>rule</var> to match only when there is no preceding -whitespaces. -</p></dd> -<dt><span><code>prec(<var>n</var>, <var>rule</var>)</code></span></dt> -<dd><p>gives <var>rule</var> the level-<var>n</var> precedence. -</p></dd> -<dt><span><code>prec.left([<var>n</var>,] <var>rule</var>)</code></span></dt> -<dd><p>marks <var>rule</var> as left-associative, optionally with level <var>n</var>. -</p></dd> -<dt><span><code>prec.right([<var>n</var>,] <var>rule</var>)</code></span></dt> -<dd><p>marks <var>rule</var> as right-associative, optionally with level <var>n</var>. -</p></dd> -<dt><span><code>prec.dynamic(<var>n</var>, <var>rule</var>)</code></span></dt> -<dd><p>this is like <code>prec</code>, but the precedence is applied at runtime -instead. -</p></dd> -</dl> - -<p>The documentation of the tree-sitter project has -<a href="https://tree-sitter.github.io/tree-sitter/creating-parsers">more -about writing a grammar</a>. Read especially “The Grammar DSL” -section. -</p> -</div> -<hr> -<div class="header"> -<p> -Next: <a href="Using-Parser.html">Using Tree-sitter Parser</a>, Up: <a href="Parsing-Program-Source.html">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p> -</div> - - - -</body> -</html> |