summaryrefslogtreecommitdiff
path: root/doc/lispref/strings.texi
diff options
context:
space:
mode:
Diffstat (limited to 'doc/lispref/strings.texi')
-rw-r--r--doc/lispref/strings.texi67
1 files changed, 57 insertions, 10 deletions
diff --git a/doc/lispref/strings.texi b/doc/lispref/strings.texi
index 8420527f858..521f163663d 100644
--- a/doc/lispref/strings.texi
+++ b/doc/lispref/strings.texi
@@ -121,7 +121,7 @@ character (i.e., an integer), @code{nil} otherwise.
The following functions create strings, either from scratch, or by
putting strings together, or by taking them apart.
-@defun make-string count character
+@defun make-string count character &optional multibyte
This function returns a string made up of @var{count} repetitions of
@var{character}. If @var{count} is negative, an error is signaled.
@@ -132,6 +132,13 @@ This function returns a string made up of @var{count} repetitions of
@result{} ""
@end example
+ Normally, if @var{character} is an @acronym{ASCII} character, the
+result is a unibyte string. But if the optional argument
+@var{multibyte} is non-@code{nil}, the function will produce a
+multibyte string instead. This is useful when you later need to
+concatenate the result with non-@acronym{ASCII} strings or replace
+some of its characters with non-@acronym{ASCII} characters.
+
Other functions to compare with this one include @code{make-vector}
(@pxref{Vectors}) and @code{make-list} (@pxref{Building Lists}).
@end defun
@@ -666,6 +673,28 @@ of the two strings. The sign is negative if @var{string1} (or its
specified portion) is less.
@end defun
+@cindex Levenshtein distance
+@cindex distance between strings
+@cindex edit distance between strings
+@defun string-distance string1 string2 &optional bytecompare
+This function returns the @dfn{Levenshtein distance} between the
+source string @var{string1} and the target string @var{string2}. The
+Levenshtein distance is the number of single-character
+changes---deletions, insertions, or replacements---required to
+transform the source string into the target string; it is one possible
+definition of the @dfn{edit distance} between strings.
+
+Letter-case of the strings is significant for the computed distance,
+but their text properties are ignored. If the optional argument
+@var{bytecompare} is non-@code{nil}, the function calculates the
+distance in terms of bytes instead of characters. The byte-wise
+comparison uses the internal Emacs representation of characters, so it
+will produce inaccurate results for multibyte strings that include raw
+bytes (@pxref{Text Representations}); make the strings unibyte by
+encoding them (@pxref{Explicit Encoding}) if you need accurate results
+with raw bytes.
+@end defun
+
@defun assoc-string key alist &optional case-fold
This function works like @code{assoc}, except that @var{key} must be a
string or symbol, and comparison is done using @code{compare-strings}.
@@ -893,18 +922,25 @@ Functions}). Thus, strings are enclosed in @samp{"} characters, and
@item %o
@cindex integer to octal
Replace the specification with the base-eight representation of an
-unsigned integer.
+integer. Negative integers are formatted in a platform-dependent
+way. The object can also be a nonnegative floating-point
+number that is formatted as an integer, dropping any fraction, if the
+integer does not exceed machine limits.
@item %d
Replace the specification with the base-ten representation of a signed
-integer.
+integer. The object can also be a floating-point number that is
+formatted as an integer, dropping any fraction.
@item %x
@itemx %X
@cindex integer to hexadecimal
Replace the specification with the base-sixteen representation of an
-unsigned integer. @samp{%x} uses lower case and @samp{%X} uses upper
-case.
+integer. Negative integers are formatted in a platform-dependent
+way. @samp{%x} uses lower case and @samp{%X} uses upper
+case. The object can also be a nonnegative floating-point number that
+is formatted as an integer, dropping any fraction, if the integer does
+not exceed machine limits.
@item %c
Replace the specification with the character which is the value given.
@@ -981,17 +1017,17 @@ numbered or unnumbered format specifications but not both, except that
After the @samp{%} and any field number, you can put certain
@dfn{flag characters}.
- The flag @samp{+} inserts a plus sign before a positive number, so
+ The flag @samp{+} inserts a plus sign before a nonnegative number, so
that it always has a sign. A space character as flag inserts a space
-before a positive number. (Otherwise, positive numbers start with the
-first digit.) These flags are useful for ensuring that positive
-numbers and negative numbers use the same number of columns. They are
+before a nonnegative number. (Otherwise, nonnegative numbers start with the
+first digit.) These flags are useful for ensuring that nonnegative
+and negative numbers use the same number of columns. They are
ignored except for @samp{%d}, @samp{%e}, @samp{%f}, @samp{%g}, and if
both flags are used, @samp{+} takes precedence.
The flag @samp{#} specifies an alternate form which depends on
the format in use. For @samp{%o}, it ensures that the result begins
-with a @samp{0}. For @samp{%x} and @samp{%X}, it prefixes the result
+with a @samp{0}. For @samp{%x} and @samp{%X}, it prefixes nonzero results
with @samp{0x} or @samp{0X}. For @samp{%e} and @samp{%f}, the
@samp{#} flag means include a decimal point even if the precision is
zero. For @samp{%g}, it always includes a decimal point, and also
@@ -1074,6 +1110,17 @@ shows only the first three characters of the representation for
precision is what the local library functions of the @code{printf}
family produce.
+@cindex formatting numbers for rereading later
+ If you plan to use @code{read} later on the formatted string to
+retrieve a copy of the formatted value, use a specification that lets
+@code{read} reconstruct the value. To format numbers in this
+reversible way you can use @samp{%s} and @samp{%S}, to format just
+integers you can also use @samp{%d}, and to format just nonnegative
+integers you can also use @samp{#x%x} and @samp{#o%o}. Other formats
+may be problematic; for example, @samp{%d} and @samp{%g} can mishandle
+NaNs and can lose precision and type, and @samp{#x%x} and @samp{#o%o}
+can mishandle negative integers. @xref{Input Functions}.
+
@node Case Conversion
@section Case Conversion in Lisp
@cindex upper case