Diffstat (limited to 'doc/lispref/strings.texi')
-rw-r--r-- | doc/lispref/strings.texi | 67 |
1 files changed, 57 insertions, 10 deletions
diff --git a/doc/lispref/strings.texi b/doc/lispref/strings.texi
index 8420527f858..521f163663d 100644
--- a/doc/lispref/strings.texi
+++ b/doc/lispref/strings.texi
@@ -121,7 +121,7 @@ character (i.e., an integer), @code{nil} otherwise.
  The following functions create strings, either from scratch, or by
 putting strings together, or by taking them apart.
 
-@defun make-string count character
+@defun make-string count character &optional multibyte
 This function returns a string made up of @var{count} repetitions of
 @var{character}. If @var{count} is negative, an error is signaled.
 
@@ -132,6 +132,13 @@ This function returns a string made up of @var{count} repetitions of
 @result{} ""
 @end example
 
+ Normally, if @var{character} is an @acronym{ASCII} character, the
+result is a unibyte string. But if the optional argument
+@var{multibyte} is non-@code{nil}, the function will produce a
+multibyte string instead. This is useful when you later need to
+concatenate the result with non-@acronym{ASCII} strings or replace
+some of its characters with non-@acronym{ASCII} characters.
+
  Other functions to compare with this one include @code{make-vector}
 (@pxref{Vectors}) and @code{make-list} (@pxref{Building Lists}).
 @end defun
@@ -666,6 +673,28 @@ of the two strings. The sign is negative if @var{string1} (or its
 specified portion) is less.
 @end defun
 
+@cindex Levenshtein distance
+@cindex distance between strings
+@cindex edit distance between strings
+@defun string-distance string1 string2 &optional bytecompare
+This function returns the @dfn{Levenshtein distance} between the
+source string @var{string1} and the target string @var{string2}. The
+Levenshtein distance is the number of single-character
+changes---deletions, insertions, or replacements---required to
+transform the source string into the target string; it is one possible
+definition of the @dfn{edit distance} between strings.
+
+Letter-case of the strings is significant for the computed distance,
+but their text properties are ignored. If the optional argument
+@var{bytecompare} is non-@code{nil}, the function calculates the
+distance in terms of bytes instead of characters. The byte-wise
+comparison uses the internal Emacs representation of characters, so it
+will produce inaccurate results for multibyte strings that include raw
+bytes (@pxref{Text Representations}); make the strings unibyte by
+encoding them (@pxref{Explicit Encoding}) if you need accurate results
+with raw bytes.
+@end defun
+
 @defun assoc-string key alist &optional case-fold
 This function works like @code{assoc}, except that @var{key} must be a
 string or symbol, and comparison is done using @code{compare-strings}.
@@ -893,18 +922,25 @@ Functions}). Thus, strings are enclosed in @samp{"} characters, and
 @item %o
 @cindex integer to octal
 Replace the specification with the base-eight representation of an
-unsigned integer.
+integer. Negative integers are formatted in a platform-dependent
+way. The object can also be a nonnegative floating-point
+number that is formatted as an integer, dropping any fraction, if the
+integer does not exceed machine limits.
 
 @item %d
 Replace the specification with the base-ten representation of a signed
-integer.
+integer. The object can also be a floating-point number that is
+formatted as an integer, dropping any fraction.
 
 @item %x
 @itemx %X
 @cindex integer to hexadecimal
 Replace the specification with the base-sixteen representation of an
-unsigned integer. @samp{%x} uses lower case and @samp{%X} uses upper
-case.
+integer. Negative integers are formatted in a platform-dependent
+way. @samp{%x} uses lower case and @samp{%X} uses upper
+case. The object can also be a nonnegative floating-point number that
+is formatted as an integer, dropping any fraction, if the integer does
+not exceed machine limits.
 
 @item %c
 Replace the specification with the character which is the value given.
@@ -981,17 +1017,17 @@ numbered or unnumbered format specifications but not both, except that
  After the @samp{%} and any field number, you can put certain
 @dfn{flag characters}.
 
- The flag @samp{+} inserts a plus sign before a positive number, so
+ The flag @samp{+} inserts a plus sign before a nonnegative number, so
 that it always has a sign. A space character as flag inserts a space
-before a positive number. (Otherwise, positive numbers start with the
-first digit.) These flags are useful for ensuring that positive
-numbers and negative numbers use the same number of columns. They are
+before a nonnegative number. (Otherwise, nonnegative numbers start with the
+first digit.) These flags are useful for ensuring that nonnegative
+and negative numbers use the same number of columns. They are
 ignored except for @samp{%d}, @samp{%e}, @samp{%f}, @samp{%g}, and if
 both flags are used, @samp{+} takes precedence.
 
  The flag @samp{#} specifies an alternate form which depends on the
 format in use. For @samp{%o}, it ensures that the result begins
-with a @samp{0}. For @samp{%x} and @samp{%X}, it prefixes the result
+with a @samp{0}. For @samp{%x} and @samp{%X}, it prefixes nonzero results
 with @samp{0x} or @samp{0X}. For @samp{%e} and @samp{%f}, the
 @samp{#} flag means include a decimal point even if the precision is
 zero. For @samp{%g}, it always includes a decimal point, and also
@@ -1074,6 +1110,17 @@ shows only the first three characters of the representation for
 precision is what the local library functions of the @code{printf}
 family produce.
 
+@cindex formatting numbers for rereading later
+ If you plan to use @code{read} later on the formatted string to
+retrieve a copy of the formatted value, use a specification that lets
+@code{read} reconstruct the value. To format numbers in this
+reversible way you can use @samp{%s} and @samp{%S}, to format just
+integers you can also use @samp{%d}, and to format just nonnegative
+integers you can also use @samp{#x%x} and @samp{#o%o}. Other formats
+may be problematic; for example, @samp{%d} and @samp{%g} can mishandle
+NaNs and can lose precision and type, and @samp{#x%x} and @samp{#o%o}
+can mishandle negative integers. @xref{Input Functions}.
+
 @node Case Conversion
 @section Case Conversion in Lisp
 @cindex upper case
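
The behavior documented by this patch can be spot-checked with a short Emacs Lisp sketch. It is not part of the commit; it assumes an Emacs build that already includes the change (the optional MULTIBYTE argument of make-string and the string-distance function), and the sample strings and numbers are chosen only for illustration. Each form signals an error if the stated expectation does not hold.

```elisp
;; Illustrative checks only; assumes an Emacs that includes this change.
(require 'cl-lib)

;; make-string: an ASCII character normally yields a unibyte string;
;; a non-nil third argument requests a multibyte string instead.
(cl-assert (not (multibyte-string-p (make-string 3 ?x))))
(cl-assert (multibyte-string-p (make-string 3 ?x t)))

;; string-distance: Levenshtein distance, letter-case is significant.
(cl-assert (= 3 (string-distance "kitten" "sitting")))
(cl-assert (= 1 (string-distance "Kitten" "kitten")))

;; %d accepts a float and drops the fraction.
(cl-assert (equal "1" (format "%d" 1.9)))

;; The + flag applies to nonnegative numbers, including zero.
(cl-assert (equal "+0" (format "%+d" 0)))

;; The # flag prefixes only nonzero %x results.
(cl-assert (equal "0xff" (format "%#x" 255)))
(cl-assert (equal "0" (format "%#x" 0)))

;; Formats that `read' can reverse: %S for any number,
;; #x%x for a nonnegative integer.
(cl-assert (equal 0.1 (read (format "%S" 0.1))))
(cl-assert (equal 255 (read (format "#x%x" 255))))
```

The forms can be evaluated one at a time in the *scratch* buffer or loaded as a file; they return nil silently when the expectation holds. Negative integers are deliberately left out of the %x and %o checks, since the patched text describes their formatting as platform-dependent.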