Diffstat (limited to 'doc/lispref/text.texi')
-rw-r--r-- | doc/lispref/text.texi | 466 |
1 files changed, 431 insertions, 35 deletions
diff --git a/doc/lispref/text.texi b/doc/lispref/text.texi index 72fb674aa5a..8b859042ad0 100644 --- a/doc/lispref/text.texi +++ b/doc/lispref/text.texi @@ -59,7 +59,9 @@ the character after point. * Decompression:: Dealing with compressed data. * Base 64:: Conversion to or from base 64 encoding. * Checksum/Hash:: Computing cryptographic hashes. +* Suspicious Text:: Determining whether a string is suspicious. * GnuTLS Cryptography:: Cryptographic algorithms imported from GnuTLS. +* Database:: Interacting with an SQL database. * Parsing HTML/XML:: Parsing HTML and XML. * Parsing JSON:: Parsing and generating JSON values. * JSONRPC:: JSON Remote Procedure Call protocol @@ -241,10 +243,8 @@ using a function specified by the variable The default filter function consults the obsolete wrapper hook @code{filter-buffer-substring-functions} (see the documentation string of the macro @code{with-wrapper-hook} for the details about this -obsolete facility), and the obsolete variable -@code{buffer-substring-filters}. If both of these are @code{nil}, it -returns the unaltered text from the buffer, i.e., what -@code{buffer-substring} would return. +obsolete facility). If it is @code{nil}, it returns the unaltered +text from the buffer, i.e., what @code{buffer-substring} would return. If @var{delete} is non-@code{nil}, the function deletes the text between @var{start} and @var{end} after copying it, like @@ -280,22 +280,12 @@ the same as those of @code{filter-buffer-substring}. The first hook function is passed a @var{fun} that is equivalent to the default operation of @code{filter-buffer-substring}, i.e., it -returns the buffer-substring between @var{start} and @var{end} -(processed by any @code{buffer-substring-filters}) and optionally -deletes the original text from the buffer. In most cases, the hook -function will call @var{fun} once, and then do its own processing of -the result. The next hook function receives a @var{fun} equivalent to -this, and so on. 
The actual return value is the result of all the -hook functions acting in sequence. -@end defvar - -@defvar buffer-substring-filters -The value of this obsolete variable should be a list of functions -that accept a single string argument and return another string. -The default @code{filter-buffer-substring} function passes the buffer -substring to the first function in this list, and the return value of -each function is passed to the next function. The return value of the -last function is passed to @code{filter-buffer-substring-functions}. +returns the buffer-substring between @var{start} and @var{end} and +optionally deletes the original text from the buffer. In most cases, +the hook function will call @var{fun} once, and then do its own +processing of the result. The next hook function receives a @var{fun} +equivalent to this, and so on. The actual return value is the result +of all the hook functions acting in sequence. @end defvar @defun current-word &optional strict really-word @@ -599,6 +589,19 @@ This command indents to the left margin if that is not zero. The value returned is @code{nil}. @end deffn +@deffn Command ensure-empty-lines &optional number-of-empty-lines +This command can be used to ensure that you have a specific number of +empty lines before point. (An ``empty line'' is here defined as a +line with no characters on it---a line with space characters isn't an +empty line.) It defaults to ensuring that there's a single empty line +before point. + +If point isn't at the beginning of a line, a newline character is +inserted first. If there's more empty lines before point than +specified, the number of empty lines is reduced. Otherwise it's +increased to the specified number. +@end deffn + @defvar overwrite-mode This variable controls whether overwrite mode is in effect. 
The value should be @code{overwrite-mode-textual}, @code{overwrite-mode-binary}, @@ -1019,6 +1022,9 @@ text in @var{string} according to the @code{yank-handler} text property, as well as the variables @code{yank-handled-properties} and @code{yank-excluded-properties} (see below), before inserting the result into the current buffer. + +@var{string} will be run through @code{yank-transform-functions} (see +below) before inserting. @end defun @defun insert-buffer-substring-as-yank buf &optional start end @@ -1093,6 +1099,23 @@ or specifying key bindings. It takes effect after @code{yank-handled-properties}. @end defopt +@defvar yank-transform-functions +This variable is a list of functions. Each function is called (in +order) with the string to be yanked as the argument, and should +return a (possibly transformed) string. This variable can be set +globally, but can also be used to create new commands that are +variations on @code{yank}. For instance, to create a command that +works like @code{yank}, but cleans up whitespace before inserting, you +could say something like: + +@lisp +(defun yank-with-clean-whitespace () + (interactive) + (let ((yank-transform-functions + '(string-clean-whitespace))) + (call-interactively #'yank))) +@end lisp +@end defvar @node Yank Commands @subsection Functions for Yanking @@ -1329,7 +1352,7 @@ that @kbd{C-y} should yank. @defopt kill-ring-max The value of this variable is the maximum length to which the kill ring can grow, before elements are thrown away at the end. The default -value for @code{kill-ring-max} is 60. +value for @code{kill-ring-max} is 120. @end defopt @node Undo @@ -1493,6 +1516,11 @@ continuing to undo. This function does not bind @code{undo-in-progress}. @end defun +@defmac with-undo-amalgamate body@dots{} +This macro removes all the undo boundaries inserted during the +execution of @var{body} so that it can be undone as a single step. 
+@end defmac + Some commands leave the region active after execution in such a way that it interferes with selective undo of that command. To make @code{undo} ignore the active region when invoked immediately after such a command, @@ -1633,6 +1661,47 @@ The variable @code{paragraph-separate} controls how to distinguish paragraphs. @xref{Standard Regexps}. @end deffn +@defun pixel-fill-region start end pixel-width +Most Emacs buffers use monospaced text, so all the filling functions +(like @code{fill-region}) work based on the number of characters and +@code{char-width}. However, Emacs can render other types of things, +like text that contains images and using proportional fonts, and the +@code{pixel-fill-region} exists to handle that. It fills the region +of text between @var{start} and @var{end} at pixel granularity, so +text using variable-pitch fonts or several different fonts looks +filled regardless of different character sizes. The argument +@var{pixel-width} specifies the maximum pixel width a line is allowed +to have after filling; it is the pixel-resolution equivalent of the +@code{fill-column} in @code{fill-region}. For instance, this Lisp +snippet will insert text using a proportional font, and then fill this +to be no wider than 300 pixels: + +@lisp +(insert (propertize + "This is a sentence that's ends here." + 'face 'variable-pitch)) +(pixel-fill-region (point) (point-max) 300) +@end lisp + +If @var{start} isn't at the start of a line, the horizontal position +of @var{start}, converted to pixel units, will be used as the +indentation prefix on subsequent lines. + +@findex pixel-fill-width +The @code{pixel-fill-width} helper function can be used to compute the +pixel width to use. If given no arguments, it'll return a value +slightly less than the width of the current window. The first +optional value, @var{columns}, specifies the number of columns using +the standard, monospaced fonts, e.g. @code{fill-column}. 
The second +optional value is the window to use. You'd typically use it like +this: + +@lisp +(pixel-fill-region + start end (pixel-fill-width fill-column)) +@end lisp +@end defun + @deffn Command fill-individual-paragraphs start end &optional justify citation-regexp This command fills each paragraph in the region according to its individual fill prefix. Thus, if the lines of a paragraph were indented @@ -2915,6 +2984,12 @@ character after position @var{pos} in @var{object} (a buffer or string). The argument @var{object} is optional and defaults to the current buffer. +If @var{position} is at the end of @var{object}, the value is +@code{nil}, but note that buffer narrowing does not affect the value. +That is, if @var{object} is a buffer or @code{nil}, and the buffer is +narrowed and @var{position} is at the end of the narrowed buffer, the +result may be non-@code{nil}. + If there is no @var{prop} property strictly speaking, but the character has a property category that is a symbol, then @code{get-text-property} returns the @var{prop} property of that symbol. @@ -2967,6 +3042,12 @@ properties take precedence over this variable. This function returns the entire property list of the character at @var{position} in the string or buffer @var{object}. If @var{object} is @code{nil}, it defaults to the current buffer. + +If @var{position} is at the end of @var{object}, the value is +@code{nil}, but note that buffer narrowing does not affect the value. +That is, if @var{object} is a buffer or @code{nil}, and the buffer is +narrowed and @var{position} is at the end of the narrowed buffer, the +result may be non-@code{nil}. @end defun @defvar default-text-properties @@ -3399,7 +3480,7 @@ This will give you a list of all those URLs. @end defun @defun text-property-search-backward prop &optional value predicate not-current -This is just like @code{text-property-search-backward}, but searches +This is just like @code{text-property-search-forward}, but searches backward instead. 
Point is placed at the beginning of the matched region instead of the end, though. @end defun @@ -3487,16 +3568,30 @@ special modes that implement their own highlighting. @item mouse-face @kindex mouse-face @r{(text property)} -This property is used instead of @code{face} when the mouse is on or -near the character. For this purpose, ``near'' means that all text -between the character and where the mouse is have the same -@code{mouse-face} property value. +This property is used instead of @code{face} when the mouse pointer +hovers over the text which has this property. When this happens, the +entire stretch of text that has the same @code{mouse-face} property +value, not just the character under the mouse, is highlighted. Emacs ignores all face attributes from the @code{mouse-face} property that alter the text size (e.g., @code{:height}, @code{:weight}, and @code{:slant}). Those attributes are always the same as for the unhighlighted text. +@item cursor-face +@kindex cursor-face @r{(text property)} +@findex cursor-face-highlight-mode +@vindex cursor-face-highlight-nonselected-window +This property is similar to @code{mouse-face}, but it is used when +point (not the mouse) is inside text that has this property. The +highlighting happens only if the mode +@code{cursor-face-highlight-mode} is enabled. When the variable +@code{cursor-face-highlight-nonselected-window} is non-@code{nil}, the +text with this face is highlighted even if the window is not selected, +similarly to what @code{highlight-nonselected-windows} does for the +region (@pxref{Mark,, The Mark and the Region, emacs, The GNU Emacs +Manual}). + @item fontified @kindex fontified @r{(text property)} This property says whether the text is ready for display. If @@ -3610,6 +3705,11 @@ edited even in read-only buffers. @xref{Read Only Buffers}. A non-@code{nil} @code{invisible} property can make a character invisible on the screen. @xref{Invisible Text}, for details. 
+@kindex inhibit-isearch @r{(text property)} +@item inhibit-isearch +A non-@code{nil} @code{inhibit-isearch} property will make isearch +skip the text. + @item intangible @kindex intangible @r{(text property)} If a group of consecutive characters have equal and non-@code{nil} @@ -3635,9 +3735,20 @@ property is obsolete; use the @code{cursor-intangible} property instead. @item cursor-intangible @kindex cursor-intangible @r{(text property)} @findex cursor-intangible-mode +@cindex rear-nonsticky, and cursor-intangible property When the minor mode @code{cursor-intangible-mode} is turned on, point is moved away from any position that has a non-@code{nil} @code{cursor-intangible} property, just before redisplay happens. +Note that ``stickiness'' of the property (@pxref{Sticky Properties}) +is taken into account when computing allowed cursor positions, so (for +instance) to insert a stretch of five @samp{x} characters into which +the cursor can't enter, you should do something like: + +@lisp +(insert + (propertize "xxxx" 'cursor-intangible t) + (propertize "x" 'cursor-intangible t 'rear-nonsticky t)) +@end lisp @vindex cursor-sensor-inhibit When the variable @code{cursor-sensor-inhibit} is non-@code{nil}, the @@ -3944,6 +4055,8 @@ of the kill ring. To insert with inheritance, use the special primitives described in this section. Self-inserting characters inherit properties because they work using these primitives. +@cindex front-sticky text property +@cindex rear-nonsticky text property When you do insertion with inheritance, @emph{which} properties are inherited, and from where, depends on which properties are @dfn{sticky}. Insertion after a character inherits those of its properties that are @@ -4176,7 +4289,7 @@ position. The action code is always @code{t}. 
For example, here is how Info mode handles @key{mouse-1}: @smallexample -(define-key Info-mode-map [follow-link] 'mouse-face) +(keymap-set Info-mode-map "<follow-link>" 'mouse-face) @end smallexample @item a function @@ -4189,9 +4302,9 @@ For example, here is how pcvs enables @kbd{mouse-1} to follow links on file names only: @smallexample -(define-key map [follow-link] - (lambda (pos) - (eq (get-char-property pos 'face) 'cvs-filename-face))) +(keymap-set map "<follow-link>" + (lambda (pos) + (eq (get-char-property pos 'face) 'cvs-filename-face))) @end smallexample @item anything else @@ -4723,9 +4836,8 @@ converting to and from this code. This function converts the region from @var{beg} to @var{end} into base 64 code. It returns the length of the encoded text. An error is signaled if a character in the region is multibyte, i.e., in a -multibyte buffer the region must contain only characters from the -charsets @code{ascii}, @code{eight-bit-control} and -@code{eight-bit-graphic}. +multibyte buffer the region must contain only ASCII characters or raw +bytes. Normally, this function inserts newline characters into the encoded text, to avoid overlong lines. However, if the optional argument @@ -4874,6 +4986,92 @@ It should be somewhat more efficient on larger buffers than @c according to what we find useful. @end defun +@node Suspicious Text +@section Suspicious Text +@cindex suspicious text +@cindex insecure text +@cindex security vulnerabilities in text + + Emacs can display text from many external sources, like email and Web +sites. Attackers may attempt to confuse the user reading this text by +using obfuscated @acronym{URL}s or email addresses, and tricking the +user into visiting a web page they didn't intend to visit, or sending +an email to the wrong address. 
+ +This usually involves using characters from scripts that visually look +like @acronym{ASCII} characters (i.e., are homoglyphs), but there are +also other techniques used, like using bidirectional overrides, or +having an @acronym{HTML} link text that says one thing, while the +underlying @acronym{URL} points somewhere else. + +@cindex suspicious text strings +To help identify these @dfn{suspicious text strings}, Emacs provides a +library to do a number of checks on text. (See +@url{https://www.unicode.org/reports/tr39/, UTS #39: Unicode Security +Mechanisms} for the rationale behind the checks that are available and +more details about them.) Packages that present data that might be +suspicious should use this library to flag suspicious text on display. + +@vindex textsec-check +@defun textsec-suspicious-p object type +This function is the high-level interface function that packages +should use. It respects the @code{textsec-check} user option, which +allows the user to disable the checks. + +This function checks @var{object} (whose data type depends on +@var{type}) to see if it looks suspicious when interpreted as a thing +of @var{type}. The available types and the corresponding @var{object} +data types are: + +@table @code +@item domain +Check whether a domain (e.g., @samp{www.gnu.org} looks suspicious. +@var{object} should be a string, the domain name. + +@item url +Check whether an @acronym{URL} (e.g., @samp{http://gnu.org/foo/bar}) +looks suspicious. @var{object} should be a string, the @acronym{URL} +to check. + +@item link +Check whether an @acronym{HTML} link (e.g., @samp{<a +href='http://gnu.org'>fsf.org</a>} looks suspicious. In this case, +@var{object} should be a @code{cons} cell where the @code{car} is the +@acronym{URL} string, and the @code{cdr} is the link text. The link +is deemed suspicious if the link text contains a domain name, and that +domain name points to something other than the @acronym{URL}. 
+ +@item email-address +Check whether an email address (e.g., @samp{foo@@example.org}) looks +suspicious. @var{object} should be a string. + +@item local-address +Check whether the local part of an email address (the bit before the +@samp{@@} sign) looks suspicious. @var{object} should be a string. + +@item name +Check whether a name (used in an email address header) looks +suspicious. @var{object} should be a string. + +@item email-address-header +Check whether a full RFC2822 email address header (e.g., +@samp{=?utf-8?Q?=C3=81?= <foo@@example.com>}) looks suspicious. +@var{object} should be a string. +@end table + +If @var{object} is suspicious, this function returns a string that +explains why it is suspicious. If @var{object} is not suspicious, the +function returns @code{nil}. +@end defun + +@vindex textsec-suspicious@r{ (face)} +If the text is suspicious, the application should mark the suspicious +text with the @code{textsec-suspicious} face, and make the explanation +returned by @code{textsec-suspicious-p} available to the user in some way +(for example, in a tooltip). The application might also prompt the +user for confirmation before taking any action on a suspicious string +(like sending an email to a suspicious email address). + @node GnuTLS Cryptography @section GnuTLS Cryptography @cindex MD5 checksum @@ -5066,6 +5264,201 @@ On success, it returns a list of a binary string (the output) and the IV used. @end defun +@node Database +@section Database +@cindex database access, SQLite + + Emacs can be compiled with built-in support for accessing SQLite +databases. This section describes the facilities available for +accessing SQLite databases from Lisp programs. + +@defun sqlite-available-p +The function returns non-@code{nil} if built-in SQLite support is +available in this Emacs session. +@end defun + +When SQLite support is available, the following functions can be used. 
+ +@cindex database object +@defun sqlite-open &optional file +This function opens @var{file} as an SQLite database file. If +@var{file} doesn't exist, a new database will be created and stored in +that file. If @var{file} is omitted or @code{nil}, a new in-memory +database is created instead. + +The return value is a @dfn{database object} that can be used as the +argument to most of the subsequent functions described below. +@end defun + +@defun sqlitep object +This predicate returns non-@code{nil} if @var{object} is an SQLite +database object. The database object returned by the +@code{sqlite-open} function satisfies this predicate. +@end defun + +@defun sqlite-close db +Close the database @var{db}. It's usually not necessary to call this +function explicitly---the database will automatically be closed if +Emacs shuts down or the database object is garbage collected. +@end defun + +@defun sqlite-execute db statement &optional values +Execute the @acronym{SQL} @var{statement}. For instance: + +@lisp +(sqlite-execute db "insert into foo values ('bar', 2)") +@end lisp + +If the optional @var{values} parameter is present, it should be either +a list or a vector of values to bind while executing the statement. +For instance: + +@lisp +(sqlite-execute db "insert into foo values (?, ?)" '("bar" 2)) +@end lisp + +This has exactly the same effect as the previous example, but is more +efficient and safer (because it doesn't involve any string parsing or +interpolation). + +@code{sqlite-execute} returns the number of affected rows. For +instance, an @samp{insert} statement will return @samp{1}, whereas an +@samp{update} statement may return zero or a higher number. + +Strings in SQLite are, by default, stored as @code{utf-8}, and +selecting a text column will decode the string using that charset. +Selecting a blob column will return the raw data without any decoding +(i.e., it will return a unibyte string containing the bytes as stored +in the database). 
Inserting binary data into blob columns, however, +requires some care, as @code{sqlite-execute} will, by default, +interpret all strings as @code{utf-8}. + +So if you have, for instance, @acronym{GIF} data in a unibyte string +called @var{gif}, you have to mark it specially to let +@code{sqlite-execute} know this: + +@lisp +(put-text-property 0 1 'coding-system 'binary gif) +(sqlite-execute db "insert into foo values (?, ?)" (list gif 2)) +@end lisp + +@end defun + +@defun sqlite-select db query &optional values result-type +Select some data from @var{db} and return them. For instance: + +@lisp +(sqlite-select db "select * from foo where key = 2") + @result{} (("bar" 2)) +@end lisp + +As with the @code{sqlite-execute}, you can optionally pass in a list +or a vector of values that will be bound before executing the select: + +@lisp +(sqlite-select db "select * from foo where key = ?" [2]) + @result{} (("bar" 2)) +@end lisp + +This is usually more efficient and safer than the method used by the +previous example. + +By default, this function returns a list of matching rows, where each +row is a list of column values. If @var{return-type} is @code{full}, +the names of the columns (as a list of strings) will be returned as +the first element in the return value. + +@cindex statement object +If @var{return-type} is @code{set}, this function will return a +@dfn{statement object} instead. This object can be examined by using +the @code{sqlite-next}, @code{sqlite-columns} and @code{sqlite-more-p} +functions. If the result set is small, it's often more convenient to +just return the data directly, but if the result set is large (or if +you won't be using all the data from the set), using the @code{set} +method will allocate a lot less memory, and is therefore more +memory-efficient. +@end defun + +@defun sqlite-next statement +This function returns the next row in the result set @var{statement}, +typically an object returned by @code{sqlite-select}. 
+ +@lisp +(sqlite-next stmt) + @result{} ("bar" 2) +@end lisp +@end defun + +@defun sqlite-columns statement +This function returns the column names of the result set +@var{statement}, typically an object returned by @code{sqlite-select}. + +@lisp +(sqlite-columns stmt) + @result{} ("name" "issue") +@end lisp +@end defun + +@defun sqlite-more-p statement +This predicate says whether there is more data to be fetched from the +result set @var{statement}, typically an object returned by +@code{sqlite-select}. +@end defun + +@defun sqlite-finalize statement +If @var{statement} is not going to be used any more, calling this +function will free the resources used by @var{statement}. This is +usually not necessary---when the @var{statement} object is +garbage-collected, Emacs will automatically free its resources. +@end defun + +@defun sqlite-transaction db +Start a transaction in @var{db}. When in a transaction, other readers +of the database won't access the results until the transaction has +been committed by @code{sqlite-commit}. +@end defun + +@defun sqlite-commit db +End a transaction in @var{db} and write the data out to its file. +@end defun + +@defun sqlite-rollback db +End a transaction in @var{db} and discard any changes that have been +made by the transaction. +@end defun + +@defmac with-sqlite-transaction db body@dots{} +Like @code{progn} (@pxref{Sequencing}), but executes @var{body} with a +transaction held, and commits the transaction at the end. +@end defmac + +@defun sqlite-pragma db pragma +Execute @var{pragma} in @var{db}. A @dfn{pragma} is usually a command +that affects the database overall, instead of any particular table. +For instance, to make SQLite automatically garbage collect data that's +no longer needed, you can say: + +@lisp +(sqlite-pragma db "auto_vacuum = FULL") +@end lisp + +This function returns non-@code{nil} on success and @code{nil} if the +pragma failed. Many pragmas can only be issued when the database is +brand new and empty. 
+@end defun + +@defun sqlite-load-extension db module +Load the named extension @var{module} into the database @var{db}. +Extensions are usually shared-library files; on GNU and Unix systems, +they have the @file{.so} file-name extension. +@end defun + +@findex sqlite-mode-open-file +If you wish to list the contents of an SQLite file, you can use the +@code{sqlite-mode-open-file} command. This will pop to a buffer using +@code{sqlite-mode}, which allows you to examine (and alter) the +contents of an SQLite database. + @node Parsing HTML/XML @section Parsing HTML and XML @cindex parsing html @@ -5080,12 +5473,15 @@ available in this Emacs session. When libxml2 support is available, the following functions can be used to parse HTML or XML text into Lisp object trees. -@defun libxml-parse-html-region start end &optional base-url discard-comments +@defun libxml-parse-html-region &optional start end base-url discard-comments This function parses the text between @var{start} and @var{end} as HTML, and returns a list representing the HTML @dfn{parse tree}. It attempts to handle real-world HTML by robustly coping with syntax mistakes. +If @var{start} or @var{end} are @code{nil}, they default to the values +from @code{point-min} and @code{point-max}, respectively. + The optional argument @var{base-url}, if non-@code{nil}, should be a string specifying the base URL for relative URLs occurring in links. @@ -5131,7 +5527,7 @@ buffer. The argument @var{dom} should be a list as generated by @end defun @cindex parsing xml -@defun libxml-parse-xml-region start end &optional base-url discard-comments +@defun libxml-parse-xml-region &optional start end base-url discard-comments This function is the same as @code{libxml-parse-html-region}, except that it parses the text as XML rather than HTML (so it is stricter about syntax).