author | Mattias Engdegård <mattiase@acm.org> | 2019-12-13 13:10:58 +0100 |
---|---|---|
committer | Mattias Engdegård <mattiase@acm.org> | 2019-12-13 13:30:14 +0100 |
commit | 82b4e48c590cf2c0448a751e641b0ee7a6a02438 (patch) | |
tree | 55da830604ce9ebe4a5aa626bec285fb688578a3 /lisp/emacs-lisp | |
parent | b04086adf649b18cf5309dd43aa638fc7b3cd4a0 (diff) | |
Allow characters and single-char strings in rx charsets
The `not' and `intersection' forms, and `or' inside these forms,
now accept characters and single-character strings as arguments.
Previously, they had to be wrapped in `any' forms.
This does not add expressive power but is a convenience and is easily
understood.
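As a rough illustration of the paragraph above, the sketch below writes the same charset in the older `any'-wrapped spelling and in the newer bare spelling. It assumes an Emacs build that includes this change; the `my-' variable names are hypothetical and exist only for the example.

```elisp
;; Illustrative sketch; assumes an Emacs build that includes this change.
(require 'rx)

;; Older spelling: the character has to be wrapped in an `any' form.
(defconst my-not-a-old (rx (not (any ?a))))   ; hypothetical name

;; Newer spelling: a bare character or a one-character string is accepted.
(defconst my-not-a-char (rx (not ?a)))        ; hypothetical name
(defconst my-not-a-string (rx (not "a")))     ; hypothetical name

;; `intersection', and `or' inside `not', accept them as well.
(defconst my-lower-except-m                   ; hypothetical name
  (rx (intersection (any "a-z") (not ?m))))
(defconst my-not-a-or-b                       ; hypothetical name
  (rx (not (or ?a "b"))))
```

Since the change adds convenience rather than expressive power, the three `my-not-a-' constants are expected to hold the same regexp string.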
* doc/lispref/searching.texi (Rx Constructs): Amend the documentation.
* etc/NEWS: Announce the change.
* lisp/emacs-lisp/rx.el (rx--charset-p, rx--translate-not)
(rx--charset-intervals, rx): Accept characters and 1-char strings in
more places.
* test/lisp/emacs-lisp/rx-tests.el (rx-not, rx-charset-or)
(rx-def-in-charset-or, rx-intersection): Test the change.
Diffstat (limited to 'lisp/emacs-lisp')
-rw-r--r-- | lisp/emacs-lisp/rx.el | 26 |
1 files changed, 20 insertions, 6 deletions
diff --git a/lisp/emacs-lisp/rx.el b/lisp/emacs-lisp/rx.el
index a5cab1db888..43f7a4e2752 100644
--- a/lisp/emacs-lisp/rx.el
+++ b/lisp/emacs-lisp/rx.el
@@ -309,6 +309,8 @@ and set operations."
                 (rx--every (lambda (x) (not (symbolp x))) (cdr form)))
            (and (memq (car form) '(not or | intersection))
                 (rx--every #'rx--charset-p (cdr form)))))
+      (characterp form)
+      (and (stringp form) (= (length form) 1))
       (and (or (symbolp form) (consp form))
            (let ((expanded (rx--expand-def form)))
              (and expanded
@@ -521,6 +523,11 @@ If NEGATED, negate the sense (thus making it positive)."
      ((eq arg 'word-boundary)
       (rx--translate-symbol
        (if negated 'word-boundary 'not-word-boundary)))
+     ((characterp arg)
+      (rx--generate-alt (not negated) (list (cons arg arg)) nil))
+     ((and (stringp arg) (= (length arg) 1))
+      (let ((char (string-to-char arg)))
+        (rx--generate-alt (not negated) (list (cons char char)) nil)))
      ((let ((expanded (rx--expand-def arg)))
         (and expanded
              (rx--translate-not negated (list expanded)))))
@@ -571,8 +578,8 @@ If NEGATED, negate the sense (thus making it positive)."
 (defun rx--charset-intervals (charset)
   "Return a sorted list of non-adjacent disjoint intervals from CHARSET.
 CHARSET is any expression allowed in a character set expression:
-either `any' (no classes permitted), or `not', `or' or `intersection'
-forms whose arguments are charsets."
+characters, single-char strings, `any' forms (no classes permitted),
+or `not', `or' or `intersection' forms whose arguments are charsets."
   (pcase charset
     (`(,(or 'any 'in 'char) . ,body)
      (let ((parsed (rx--parse-any body)))
@@ -584,6 +591,11 @@ forms whose arguments are charsets."
     (`(not ,x) (rx--complement-intervals (rx--charset-intervals x)))
     (`(,(or 'or '|) . ,body) (rx--charset-union body))
     (`(intersection . ,body) (rx--charset-intersection body))
+    ((pred characterp)
+     (list (cons charset charset)))
+    ((guard (and (stringp charset) (= (length charset) 1)))
+     (let ((char (string-to-char charset)))
+       (list (cons char char))))
     (_ (let ((expanded (rx--expand-def charset)))
          (if expanded
              (rx--charset-intervals expanded)
@@ -1161,10 +1173,12 @@ CHAR           Match a literal character.
                character, a string, a range as string \"A-Z\" or cons
                (?A . ?Z), or a character class (see below).  Alias: in, char.
 (not CHARSPEC) Match one character not matched by CHARSPEC.  CHARSPEC
-               can be (any ...), (or ...), (intersection ...),
-               (syntax ...), (category ...), or a character class.
-(intersection CHARSET...) Intersection of CHARSETs.
-               CHARSET is (any...), (not...), (or...) or (intersection...).
+               can be a character, single-char string, (any ...), (or ...),
+               (intersection ...), (syntax ...), (category ...),
+               or a character class.
+(intersection CHARSET...) Match all CHARSETs.
+               CHARSET is (any...), (not...), (or...) or (intersection...),
+               a character or a single-char string.
 not-newline    Match any character except a newline.  Alias: nonl.
 anychar        Match any character.  Alias: anything.
 unmatchable    Never match anything at all.