diff options
author | Yuan Fu <casouri@gmail.com> | 2022-10-05 14:11:33 -0700 |
---|---|---|
committer | Yuan Fu <casouri@gmail.com> | 2022-10-05 14:11:33 -0700 |
commit | cb183f6467401fb5ed2b7fc98ca75be9d943cbe3 (patch) | |
tree | ef42ea6ae71e0829d900ffb46d8306fbba962a8e /admin/notes/tree-sitter/html-manual/Multiple-Languages.html | |
parent | 1ea503ed4b3a14b3dc0a597cfbfe57d73b871422 (diff) | |
download | emacs-cb183f6467401fb5ed2b7fc98ca75be9d943cbe3.tar.gz emacs-cb183f6467401fb5ed2b7fc98ca75be9d943cbe3.tar.bz2 emacs-cb183f6467401fb5ed2b7fc98ca75be9d943cbe3.zip |
Add tree-sitter admin notes
starter-guide: Guide on writing major mode features.
build-module: Script for building official language definitions.
html-manual: HTML version of the manual for easy access.
* admin/notes/tree-sitter/build-module/README: New file.
* admin/notes/tree-sitter/build-module/batch.sh: New file.
* admin/notes/tree-sitter/build-module/build.sh: New file.
* admin/notes/tree-sitter/starter-guide: New file.
* admin/notes/tree-sitter/html-manual/Accessing-Node.html: New file.
* admin/notes/tree-sitter/html-manual/Language-Definitions.html: New file.
* admin/notes/tree-sitter/html-manual/Multiple-Languages.html: New file.
* admin/notes/tree-sitter/html-manual/Parser_002dbased-Font-Lock.html:
New file.
* admin/notes/tree-sitter/html-manual/Parser_002dbased-Indentation.html:
New file.
* admin/notes/tree-sitter/html-manual/Parsing-Program-Source.html: New
file.
* admin/notes/tree-sitter/html-manual/Pattern-Matching.html: New file.
* admin/notes/tree-sitter/html-manual/Retrieving-Node.html: New file.
* admin/notes/tree-sitter/html-manual/Tree_002dsitter-C-API.html: New
file.
* admin/notes/tree-sitter/html-manual/Using-Parser.html: New file.
* admin/notes/tree-sitter/html-manual/build-manual.sh: New file.
* admin/notes/tree-sitter/html-manual/manual.css: New file.
Diffstat (limited to 'admin/notes/tree-sitter/html-manual/Multiple-Languages.html')
-rw-r--r-- | admin/notes/tree-sitter/html-manual/Multiple-Languages.html | 255 |
1 files changed, 255 insertions, 0 deletions
diff --git a/admin/notes/tree-sitter/html-manual/Multiple-Languages.html b/admin/notes/tree-sitter/html-manual/Multiple-Languages.html new file mode 100644 index 00000000000..1ee2df7f442 --- /dev/null +++ b/admin/notes/tree-sitter/html-manual/Multiple-Languages.html @@ -0,0 +1,255 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> +<html> +<!-- Created by GNU Texinfo 6.8, https://www.gnu.org/software/texinfo/ --> +<head> +<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> +<!-- This is the GNU Emacs Lisp Reference Manual +corresponding to Emacs version 29.0.50. + +Copyright © 1990-1996, 1998-2022 Free Software Foundation, +Inc. + +Permission is granted to copy, distribute and/or modify this document +under the terms of the GNU Free Documentation License, Version 1.3 or +any later version published by the Free Software Foundation; with the +Invariant Sections being "GNU General Public License," with the +Front-Cover Texts being "A GNU Manual," and with the Back-Cover +Texts as in (a) below. A copy of the license is included in the +section entitled "GNU Free Documentation License." + +(a) The FSF's Back-Cover Text is: "You have the freedom to copy and +modify this GNU manual. Buying copies from the FSF supports it in +developing GNU and promoting software freedom." --> +<title>Multiple Languages (GNU Emacs Lisp Reference Manual)</title> + +<meta name="description" content="Multiple Languages (GNU Emacs Lisp Reference Manual)"> +<meta name="keywords" content="Multiple Languages (GNU Emacs Lisp Reference Manual)"> +<meta name="resource-type" content="document"> +<meta name="distribution" content="global"> +<meta name="Generator" content="makeinfo"> +<meta name="viewport" content="width=device-width,initial-scale=1"> + +<link href="index.html" rel="start" title="Top"> +<link href="Index.html" rel="index" title="Index"> +<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents"> +<link href="Parsing-Program-Source.html" rel="up" title="Parsing Program Source"> +<link href="Tree_002dsitter-C-API.html" rel="next" title="Tree-sitter C API"> +<link href="Pattern-Matching.html" rel="prev" title="Pattern Matching"> +<style type="text/css"> +<!-- +a.copiable-anchor {visibility: hidden; text-decoration: none; line-height: 0em} +a.summary-letter {text-decoration: none} +blockquote.indentedblock {margin-right: 0em} +div.display {margin-left: 3.2em} +div.example {margin-left: 3.2em} +kbd {font-style: oblique} +pre.display {font-family: inherit} +pre.format {font-family: inherit} +pre.menu-comment {font-family: serif} +pre.menu-preformatted {font-family: serif} +span.nolinebreak {white-space: nowrap} +span.roman {font-family: initial; font-weight: normal} +span.sansserif {font-family: sans-serif; font-weight: normal} +span:hover a.copiable-anchor {visibility: visible} +ul.no-bullet {list-style: none} +--> +</style> +<link rel="stylesheet" type="text/css" href="./manual.css"> + + +</head> + +<body lang="en"> +<div class="section" id="Multiple-Languages"> +<div class="header"> +<p> +Next: <a href="Tree_002dsitter-C-API.html" accesskey="n" rel="next">Tree-sitter C API Correspondence</a>, Previous: <a href="Pattern-Matching.html" accesskey="p" rel="prev">Pattern Matching Tree-sitter Nodes</a>, Up: <a href="Parsing-Program-Source.html" accesskey="u" rel="up">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p> +</div> +<hr> +<span id="Parsing-Text-in-Multiple-Languages"></span><h3 class="section">37.6 Parsing Text in Multiple Languages</h3> + +<p>Sometimes, the source of a programming language could contain sources +of other languages, HTML + CSS + JavaScript is one example. In that +case, we need to assign individual parsers to text segments written in +different languages. Traditionally this is achieved by using +narrowing. While tree-sitter works with narrowing (see <a href="Using-Parser.html#tree_002dsitter-narrowing">narrowing</a>), the recommended way is to set ranges in which +a parser will operate. +</p> +<dl class="def"> +<dt id="index-treesit_002dparser_002dset_002dincluded_002dranges"><span class="category">Function: </span><span><strong>treesit-parser-set-included-ranges</strong> <em>parser ranges</em><a href='#index-treesit_002dparser_002dset_002dincluded_002dranges' class='copiable-anchor'> ¶</a></span></dt> +<dd><p>This function sets the range of <var>parser</var> to <var>ranges</var>. Then +<var>parser</var> will only read the text covered in each range. Each +range in <var>ranges</var> is a list of cons <code>(<var>beg</var> +. <var>end</var>)</code>. +</p> +<p>Each range in <var>ranges</var> must come in order and not overlap. That +is, in pseudo code: +</p> +<div class="example"> +<pre class="example">(cl-loop for idx from 1 to (1- (length ranges)) + for prev = (nth (1- idx) ranges) + for next = (nth idx ranges) + should (<= (car prev) (cdr prev) + (car next) (cdr next))) +</pre></div> + +<span id="index-treesit_002drange_002dinvalid"></span> +<p>If <var>ranges</var> violates this constraint, or something else went +wrong, this function signals a <code>treesit-range-invalid</code>. The +signal data contains a specific error message and the ranges we are +trying to set. +</p> +<p>This function can also be used for disabling ranges. If <var>ranges</var> +is nil, the parser is set to parse the whole buffer. +</p> +<p>Example: +</p> +<div class="example"> +<pre class="example">(treesit-parser-set-included-ranges + parser '((1 . 9) (16 . 24) (24 . 25))) +</pre></div> +</dd></dl> + +<dl class="def"> +<dt id="index-treesit_002dparser_002dincluded_002dranges"><span class="category">Function: </span><span><strong>treesit-parser-included-ranges</strong> <em>parser</em><a href='#index-treesit_002dparser_002dincluded_002dranges' class='copiable-anchor'> ¶</a></span></dt> +<dd><p>This function returns the ranges set for <var>parser</var>. The return +value is the same as the <var>ranges</var> argument of +<code>treesit-parser-included-ranges</code>: a list of cons +<code>(<var>beg</var> . <var>end</var>)</code>. And if <var>parser</var> doesn’t have any +ranges, the return value is nil. +</p> +<div class="example"> +<pre class="example">(treesit-parser-included-ranges parser) + ⇒ ((1 . 9) (16 . 24) (24 . 25)) +</pre></div> +</dd></dl> + +<dl class="def"> +<dt id="index-treesit_002dset_002dranges"><span class="category">Function: </span><span><strong>treesit-set-ranges</strong> <em>parser-or-lang ranges</em><a href='#index-treesit_002dset_002dranges' class='copiable-anchor'> ¶</a></span></dt> +<dd><p>Like <code>treesit-parser-set-included-ranges</code>, this function sets +the ranges of <var>parser-or-lang</var> to <var>ranges</var>. Conveniently, +<var>parser-or-lang</var> could be either a parser or a language. If it is +a language, this function looks for the first parser in +<code>(treesit-parser-list)</code> for that language in the current buffer, +and set range for it. +</p></dd></dl> + +<dl class="def"> +<dt id="index-treesit_002dget_002dranges"><span class="category">Function: </span><span><strong>treesit-get-ranges</strong> <em>parser-or-lang</em><a href='#index-treesit_002dget_002dranges' class='copiable-anchor'> ¶</a></span></dt> +<dd><p>This function returns the ranges of <var>parser-or-lang</var>, like +<code>treesit-parser-included-ranges</code>. And like +<code>treesit-set-ranges</code>, <var>parser-or-lang</var> can be a parser or +a language symbol. +</p></dd></dl> + +<dl class="def"> +<dt id="index-treesit_002dquery_002drange"><span class="category">Function: </span><span><strong>treesit-query-range</strong> <em>source query &optional beg end</em><a href='#index-treesit_002dquery_002drange' class='copiable-anchor'> ¶</a></span></dt> +<dd><p>This function matches <var>source</var> with <var>query</var> and returns the +ranges of captured nodes. The return value has the same shape of +other functions: a list of <code>(<var>beg</var> . <var>end</var>)</code>. +</p> +<p>For convenience, <var>source</var> can be a language symbol, a parser, or a +node. If a language symbol, this function matches in the root node of +the first parser using that language; if a parser, this function +matches in the root node of that parser; if a node, this function +matches in that node. +</p> +<p>Parameter <var>query</var> is the query used to capture nodes +(see <a href="Pattern-Matching.html">Pattern Matching Tree-sitter Nodes</a>). The capture names don’t matter. Parameter +<var>beg</var> and <var>end</var>, if both non-nil, limits the range in which +this function queries. +</p> +<p>Like other query functions, this function raises an +<var>treesit-query-error</var> if <var>query</var> is malformed. +</p></dd></dl> + +<dl class="def"> +<dt id="index-treesit_002dlanguage_002dat"><span class="category">Function: </span><span><strong>treesit-language-at</strong> <em>point</em><a href='#index-treesit_002dlanguage_002dat' class='copiable-anchor'> ¶</a></span></dt> +<dd><p>This function tries to figure out which language is responsible for +the text at <var>point</var>. It goes over each parser in +<code>(treesit-parser-list)</code> and see if that parser’s range covers +<var>point</var>. +</p></dd></dl> + +<dl class="def"> +<dt id="index-treesit_002drange_002dfunctions"><span class="category">Variable: </span><span><strong>treesit-range-functions</strong><a href='#index-treesit_002drange_002dfunctions' class='copiable-anchor'> ¶</a></span></dt> +<dd><p>A list of range functions. Font-locking and indenting code uses +functions in this alist to set correct ranges for a language parser +before using it. +</p> +<p>The signature of each function should be +</p> +<div class="example"> +<pre class="example">(<var>start</var> <var>end</var> &rest <var>_</var>) +</pre></div> + +<p>where <var>start</var> and <var>end</var> marks the region that is about to be +used. A range function only need to (but not limited to) update +ranges in that region. +</p> +<p>Each function in the list is called in-order. +</p></dd></dl> + +<dl class="def"> +<dt id="index-treesit_002dupdate_002dranges"><span class="category">Function: </span><span><strong>treesit-update-ranges</strong> <em>&optional start end</em><a href='#index-treesit_002dupdate_002dranges' class='copiable-anchor'> ¶</a></span></dt> +<dd><p>This function is used by font-lock and indent to update ranges before +using any parser. Each range function in +<var>treesit-range-functions</var> is called in-order. Arguments +<var>start</var> and <var>end</var> are passed to each range function. +</p></dd></dl> + +<span id="An-example"></span><h3 class="heading">An example</h3> + +<p>Normally, in a set of languages that can be mixed together, there is a +major language and several embedded languages. We first parse the +whole document with the major language’s parser, set ranges for the +embedded languages, then parse the embedded languages. +</p> +<p>Suppose we want to parse a very simple document that mixes HTML, CSS +and JavaScript: +</p> +<div class="example"> +<pre class="example"><html> + <script>1 + 2</script> + <style>body { color: "blue"; }</style> +</html> +</pre></div> + +<p>We first parse with HTML, then set ranges for CSS and JavaScript: +</p> +<div class="example"> +<pre class="example">;; Create parsers. +(setq html (treesit-get-parser-create 'html)) +(setq css (treesit-get-parser-create 'css)) +(setq js (treesit-get-parser-create 'javascript)) + +;; Set CSS ranges. +(setq css-range + (treesit-query-range + 'html + "(style_element (raw_text) @capture)")) +(treesit-parser-set-included-ranges css css-range) + +;; Set JavaScript ranges. +(setq js-range + (treesit-query-range + 'html + "(script_element (raw_text) @capture)")) +(treesit-parser-set-included-ranges js js-range) +</pre></div> + +<p>We use a query pattern <code>(style_element (raw_text) @capture)</code> to +find CSS nodes in the HTML parse tree. For how to write query +patterns, see <a href="Pattern-Matching.html">Pattern Matching Tree-sitter Nodes</a>. +</p> +</div> +<hr> +<div class="header"> +<p> +Next: <a href="Tree_002dsitter-C-API.html">Tree-sitter C API Correspondence</a>, Previous: <a href="Pattern-Matching.html">Pattern Matching Tree-sitter Nodes</a>, Up: <a href="Parsing-Program-Source.html">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p> +</div> + + + +</body> +</html> |