From 8cd39fb3c4cf47d2464f00eaa69c587e17dd11cc Mon Sep 17 00:00:00 2001 From: "Mark A. Hershberger" Date: Fri, 23 Nov 2007 06:58:00 +0000 Subject: Initial merge of nxml --- doc/emacs/nxml-mode.texi | 834 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 834 insertions(+) create mode 100644 doc/emacs/nxml-mode.texi (limited to 'doc/emacs/nxml-mode.texi') diff --git a/doc/emacs/nxml-mode.texi b/doc/emacs/nxml-mode.texi new file mode 100644 index 00000000000..e2ab3fdbd58 --- /dev/null +++ b/doc/emacs/nxml-mode.texi @@ -0,0 +1,834 @@ +\input texinfo @c -*- texinfo -*- +@c %**start of header +@setfilename nxml-mode.info +@settitle nXML Mode +@c %**end of header + +@dircategory Emacs +@direntry +* nXML Mode: (nxml-mode.info). +@end direntry + +@node Top +@top nXML Mode + +This manual documents nxml-mode, an Emacs major mode for editing +XML with RELAX NG support. This manual is not yet complete. + +@menu +* Completion:: +* Inserting end-tags:: +* Paragraphs:: +* Outlining:: +* Locating a schema:: +* DTDs:: +* Limitations:: +@end menu + +@node Completion +@chapter Completion + +Apart from real-time validation, the most important feature that +nxml-mode provides for assisting in document creation is "completion". +Completion assists the user in inserting characters at point, based on +knowledge of the schema and on the contents of the buffer before +point. + +The traditional GNU Emacs key combination for completion in a +buffer is @kbd{M-@key{TAB}}. However, many window systems +and window managers use this key combination themselves (typically for +switching between windows) and do not pass it to applications. It's +hard to find key combinations in GNU Emacs that are both easy to type +and not taken by something else. @kbd{C-@key{RET}} (i.e. +pressing the Enter or Return key, while the Ctrl key is held down) is +available. It won't be available on a traditional terminal (because +it is indistinguishable from Return), but it will work with a window +system. 
Therefore we adopt the following solution by default: use +@kbd{C-@key{RET}} when there's a window system and +@kbd{M-@key{TAB}} when there's not. In the following, I +will assume that a window system is being used and will therefore +refer to @kbd{C-@key{RET}}. + +Completion works by examining the symbol preceding point. This +is the symbol to be completed. The symbol to be completed may be +empty. Completion considers what symbols starting with the symbol to +be completed would be valid replacements for the symbol to be +completed, given the schema and the contents of the buffer before +point. These symbols are the possible completions. An example may +make this clearer. Suppose the buffer looks like this (where @point{} +indicates point): + +@example + + + +<@point{} +@end example + +@noindent +In this case, the symbol to be completed is empty, and the possible +completions are @samp{base}, @samp{isindex}, +@samp{link}, @samp{meta}, @samp{script}, +@samp{style}, @samp{title}. Another example is: + +@example + +<@point{} +@end example + +@noindent +@kbd{C-@key{RET}} will yield + +@example + + + + + + +@end example + +@noindent +This says to use the schema @samp{xhtml.rnc} for a document with +namespace @samp{http://www.w3.org/1999/xhtml}, and to use the +schema @samp{docbook.rnc} for a document whose local name is +@samp{book}. If the document element had both a namespace URI +of @samp{http://www.w3.org/1999/xhtml} and a local name of +@samp{book}, then the matching rule that comes first will be +used and so the schema @samp{xhtml.rnc} would be used. There is +no precedence between different types of rule; the first matching rule +of any type is used. + +As usual with XML-related technologies, resources are identified +by URIs. The @samp{uri} attribute identifies the schema by +specifying the URI. The URI may be relative. If so, it is resolved +relative to the URI of the schema locating file that contains +the attribute. 
This means that if the value of the @samp{uri} attribute +does not contain a @samp{/}, then it will refer to a filename in +the same directory as the schema locating file. + +@node Using the document's URI to locate a schema +@subsection Using the document's URI to locate a schema + +A @samp{uri} rule locates a schema based on the URI of the +document. The @samp{uri} attribute specifies the URI of the +schema. The @samp{resource} attribute can be used to specify +the schema for a particular document. For example, + +@example +<uri resource="spec.xml" uri="docbook.rnc"/> +@end example + +@noindent +specifies that the schema for @samp{spec.xml} is +@samp{docbook.rnc}. + +The @samp{pattern} attribute can be used instead of the +@samp{resource} attribute to specify the schema for any document +whose URI matches a pattern. The pattern has the same syntax as an +absolute or relative URI except that the path component of the URI can +use a @samp{*} character to stand for zero or more characters +within a path segment (i.e. any character other than @samp{/}). +Typically, the URI pattern looks like a relative URI, but, whereas a +relative URI in the @samp{resource} attribute is resolved into a +particular absolute URI using the base URI of the schema locating +file, a relative URI pattern matches if it matches some number of +complete path segments of the document's URI ending with the last path +segment of the document's URI. For example, + +@example +<uri pattern="*.xsl" uri="xslt.rnc"/> +@end example + +@noindent +specifies that the schema for documents with a URI whose path ends +with @samp{.xsl} is @samp{xslt.rnc}. + +A @samp{transformURI} rule locates a schema by +transforming the URI of the document. The @samp{fromPattern} +attribute specifies a URI pattern with the same meaning as the +@samp{pattern} attribute of the @samp{uri} element. The +@samp{toPattern} attribute is a URI pattern that is used to +generate the URI of the schema. 
Each @samp{*} in the +@samp{toPattern} is replaced by the string that matched the +corresponding @samp{*} in the @samp{fromPattern}. The +resulting string is appended to the initial part of the document's URI +that was not explicitly matched by the @samp{fromPattern}. The +rule matches only if the transformed URI identifies an existing +resource. For example, the rule + +@example +<transformURI fromPattern="*.xml" toPattern="*.rnc"/> +@end example + +@noindent +would transform the URI @samp{file:///home/jjc/docs/spec.xml} +into the URI @samp{file:///home/jjc/docs/spec.rnc}. Thus, this +rule specifies that to locate a schema for a document +@samp{@var{foo}.xml}, Emacs should test whether a file +@samp{@var{foo}.rnc} exists in the same directory as +@samp{@var{foo}.xml}, and, if so, should use it as the +schema. + +@node Using the document element to locate a schema +@subsection Using the document element to locate a schema + +A @samp{documentElement} rule locates a schema based on +the local name and prefix of the document element. For example, a rule + +@example +<documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/> +@end example + +@noindent +specifies that when the name of the document element is +@samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used +as the schema. Either the @samp{prefix} or +@samp{localName} attribute may be omitted to allow any prefix or +local name. + +A @samp{namespace} rule locates a schema based on the +namespace URI of the document element. For example, a rule + +@example +<namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/> +@end example + +@noindent +specifies that when the namespace URI of the document is +@samp{http://www.w3.org/1999/XSL/Transform}, then +@samp{xslt.rnc} should be used as the schema. + +@node Using type identifiers in schema locating files +@subsection Using type identifiers in schema locating files + +Type identifiers allow a level of indirection in locating the +schema for a document. Instead of associating the document directly +with a schema URI, the document is associated with a type identifier, +which is in turn associated with a schema URI. 
nXML mode does not +constrain the format of type identifiers. They can be simply strings +without any formal structure or they can be public identifiers or +URIs. Note that these type identifiers have nothing to do with the +DOCTYPE declaration. When comparing type identifiers, whitespace is +normalized in the same way as with the @samp{xsd:token} +datatype: leading and trailing whitespace is stripped; other sequences +of whitespace are normalized to a single space character. + +Each of the rules described in previous sections that uses a +@samp{uri} attribute to specify a schema, can instead use a +@samp{typeId} attribute to specify a type identifier. The type +identifier can be associated with a URI using a @samp{typeId} +element. For example, + +@example + + + + + + +@end example + +@noindent +declares three type identifiers @samp{XHTML} (representing the +default variant of XHTML to be used), @samp{XHTML Strict} and +@samp{XHTML Transitional}. Such a schema locating file would +use @samp{xhtml-strict.rnc} for a document whose namespace is +@samp{http://www.w3.org/1999/xhtml}. But it is considerably +more flexible than a schema locating file that simply specified + +@example + +@end example + +@noindent +A user can easily use @kbd{C-c C-s C-t} to select between XHTML +Strict and XHTML Transitional. Also, a user can easily add a catalog + +@example + + + +@end example + +@noindent +that makes the default variant of XHTML be XHTML Transitional. + +@node Using multiple schema locating files +@subsection Using multiple schema locating files + +The @samp{include} element includes rules from another +schema locating file. The behavior is exactly as if the rules from +that file were included in place of the @samp{include} element. +Relative URIs are resolved into absolute URIs before the inclusion is +performed. For example, + +@example + +@end example + +@noindent +includes the rules from @samp{rules.xml}. 
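+
+As a rough sketch only, and not text taken from any shipped schema
+locating file, an @samp{include} element might sit alongside an
+ordinary rule as shown below. The @samp{rules} attribute name, the
+@samp{locatingRules} wrapper element and its namespace URI are
+assumptions that the surrounding text does not spell out; the
+@samp{uri} rule is borrowed from the earlier pattern example.
+
+@example
+<?xml version="1.0"?>
+<!-- Illustrative only: wrapper element, namespace and the
+     "rules" attribute name are assumed, not quoted from the text. -->
+<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
+  <include rules="rules.xml"/>
+  <uri pattern="*.xsl" uri="xslt.rnc"/>
+</locatingRules>
+@end example
+
+@noindent
+Because inclusion behaves as if the included rules appeared in place
+of the @samp{include} element, a matching rule in @samp{rules.xml}
+would be tried before the @samp{uri} rule that follows it. Since the
+relative URI contains no @samp{/}, @samp{rules.xml} would be looked up
+in the same directory as the including file.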
+ +The process of locating a schema takes as input a list of schema +locating files. The rules in all these files and in the files they +include are resolved into a single list of rules, which are applied +strictly in order. Sometimes this order is not what is needed. +For example, suppose you have two schema locating files, a private +file + +@example + + + +@end example + +@noindent +followed by a public file + +@example + + + + +@end example + +@noindent +The effect of these two files is that the XHTML @samp{namespace} +rule takes precedence over the @samp{transformURI} rule, which +is almost certainly not what is needed. This can be solved by adding +an @samp{applyFollowingRules} element to the private file. + +@example + + + + +@end example + +@node DTDs +@chapter DTDs + +nxml-mode is designed to support the creation of standalone XML +documents that do not depend on a DTD. Although it is common practice +to insert a DOCTYPE declaration referencing an external DTD, this has +undesirable side-effects. It means that the document is no longer +self-contained. It also means that different XML parsers may interpret +the document in different ways, since the XML Recommendation does not +require XML parsers to read the DTD. With DTDs, it was impractical to +get validation without using an external DTD or reference to a +parameter entity. With RELAX NG and other schema languages, you can +simultaneously get the benefits of validation and standalone XML +documents. Therefore, I recommend that you do not reference an +external DOCTYPE in your XML documents. + +One problem is entities for characters. Typically, as well as +providing validation, DTDs also provide a set of character entities +for documents to use. Schemas cannot provide this functionality, +because schema validation happens after XML parsing. The recommended +solution is to either use the Unicode characters directly, or, if this +is impractical, use character references. 
nXML mode supports this by +providing commands for entering characters and character references +using the Unicode names, and can display the glyph corresponding to a +character reference. + +@node Limitations +@chapter Limitations + +nXML mode has some limitations: + +@itemize @bullet +@item +DTD support is limited. Internal parsed general entities declared +in the internal subset are supported provided they do not contain +elements. Other usage of DTDs is ignored. +@item +The restrictions on RELAX NG schemas in section 7 of the RELAX NG +specification are not enforced. +@item +Unicode support has problems. This stems mostly from the fact that +the XML (and RELAX NG) character model is based squarely on Unicode, +whereas the Emacs character model is not. Emacs 22 is slated to have +full Unicode support, which should improve the situation here. +@end itemize + +@bye -- cgit v1.2.3