Add rx extension mechanism

Add a built-in set of extension macros: `rx-define', `rx-let' and
`rx-let-eval'.

* lisp/emacs-lisp/rx.el (rx-constituents, rx-to-string): Doc updates.
(rx--builtin-symbols, rx--builtin-names, rx--local-definitions)
(rx--lookup-def, rx--substitute, rx--expand-template)
(rx--make-binding, rx--make-named-binding, rx--extend-local-defs)
(rx-let-eval, rx-let, rx-define): New.
(rx--translate-symbol, rx--translate-form): Use extensions if any.
(rx): Use local definitions.
* test/lisp/emacs-lisp/rx-tests.el (rx-let, rx-define)
(rx-to-string-define, rx-let-define, rx-let-eval): New.
* etc/NEWS (Changes in Specialized Modes and Packages):
* doc/lispref/searching.texi (Rx Notation, Rx Functions, Extending Rx):
Add node about rx extensions.
parent 2ed71227c6
commit 07367e5b95
4 changed files with 538 additions and 20 deletions
@@ -1037,6 +1037,7 @@ customisation.
@menu
* Rx Constructs:: Constructs valid in rx forms.
* Rx Functions:: Functions and macros that use rx forms.
* Extending Rx:: How to define your own rx forms.
@end menu

@node Rx Constructs
@@ -1524,6 +1525,162 @@ must be string literals.

The @code{pcase} macro can use @code{rx} expressions as patterns
directly; @pxref{rx in pcase}.

For mechanisms to add user-defined extensions to the @code{rx}
notation, @pxref{Extending Rx}.

@node Extending Rx
@subsubsection Defining new @code{rx} forms

The @code{rx} notation can be extended by defining new symbols and
parametrised forms in terms of other @code{rx} expressions. This is
handy for sharing parts between several regexps, and for making
complex ones easier to build and understand by putting them together
from smaller pieces.

For example, you could define @code{name} to mean
@code{(one-or-more letter)}, and @code{(quoted @var{x})} to mean
@code{(seq ?' @var{x} ?')} for any @var{x}. These forms could then be
used in @code{rx} expressions like any other: @code{(rx (quoted name))}
would match a nonempty sequence of letters inside single quotes.
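
As a minimal sketch of that example, the two definitions might be written with @code{rx-let} (one of the macros introduced by this change). The variable name is hypothetical and serves only as an illustration:

```elisp
(require 'rx)

;; Hypothetical variable, for illustration only.
(defvar sketch-quoted-name-regexp
  (rx-let ((name (one-or-more letter))    ; plain symbol definition
           (quoted (x) (seq ?' x ?')))    ; parametrised form
    (rx (quoted name)))
  "Regexp for a nonempty run of letters inside single quotes.")
```

Since @code{letter} translates to the @code{[:alpha:]} character class, the resulting regexp should match a string such as @code{'abc'} but not the empty quotation @code{''}.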

The Lisp macros below provide different ways of binding names to
definitions. Common to all of them are the following rules:

@itemize
@item
Built-in @code{rx} forms, like @code{digit} and @code{group}, cannot
be redefined.

@item
The definitions live in a name space of their own, separate from that
of Lisp variables. There is thus no need to attach a suffix like
@code{-regexp} to names; they cannot collide with anything else.

@item
Definitions cannot refer to themselves recursively, directly or
indirectly. If you find yourself needing this, you want a parser, not
a regular expression.

@item
Definitions are only ever expanded in calls to @code{rx} or
@code{rx-to-string}, not merely by their presence in definition
macros. This means that the order of definitions doesn't matter, even
when they refer to each other, and that syntax errors only show up
when they are used, not when they are defined.

@item
User-defined forms are allowed wherever arbitrary @code{rx}
expressions are expected; for example, in the body of a
@code{zero-or-one} form, but not inside @code{any} or @code{category}
forms.
@end itemize
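
The rule that definitions are expanded only on use can be sketched as follows. All names here are hypothetical. The first binding refers to a name that is bound after it, which should be harmless because nothing is expanded until the @code{rx} call:

```elisp
(require 'rx)

;; Hypothetical names, for illustration only.
(defvar sketch-pair-regexp
  (rx-let ((sketch-pair (seq sketch-word "=" sketch-word)) ; uses sketch-word
           (sketch-word (one-or-more letter)))             ; bound afterwards
    ;; Expansion happens here, when both definitions are already known.
    (rx sketch-pair)))
```

Swapping the two bindings should produce the same regexp, since the lookup happens at the point of use rather than at the point of definition.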

@defmac rx-define name [arglist] rx-form
Define @var{name} globally in all subsequent calls to @code{rx} and
@code{rx-to-string}. If @var{arglist} is absent, then @var{name} is
defined as a plain symbol to be replaced with @var{rx-form}. Example:

@example
@group
(rx-define haskell-comment (seq "--" (zero-or-more nonl)))
(rx haskell-comment)
     @result{} "--.*"
@end group
@end example

If @var{arglist} is present, it must be a list of zero or more
argument names, and @var{name} is then defined as a parametrised form.
When used in an @code{rx} expression as @code{(@var{name} @var{arg}@dots{})},
each @var{arg} will replace the corresponding argument name inside
@var{rx-form}.

@var{arglist} may end in @code{&rest} and one final argument name,
denoting a rest parameter. The rest parameter will expand to all
extra actual argument values not matched by any other parameter in
@var{arglist}, spliced into @var{rx-form} where it occurs. Example:

@example
@group
(rx-define moan (x y &rest r) (seq x (one-or-more y) r "!"))
(rx (moan "MOO" "A" "MEE" "OW"))
     @result{} "MOOA+MEEOW!"
@end group
@end example

Since the definition is global, it is recommended to give @var{name} a
package prefix to avoid name clashes with definitions elsewhere, as is
usual when naming non-local variables and functions.
@end defmac

@defmac rx-let (bindings@dots{}) body@dots{}
Make the @code{rx} definitions in @var{bindings} available locally for
@code{rx} macro invocations in @var{body}, which is then evaluated.

Each element of @var{bindings} is of the form
@w{@code{(@var{name} [@var{arglist}] @var{rx-form})}}, where the parts
have the same meaning as in @code{rx-define} above. Example:

@example
@group
(rx-let ((comma-separated (item) (seq item (0+ "," item)))
         (number (1+ digit))
         (numbers (comma-separated number)))
  (re-search-forward (rx "(" numbers ")")))
@end group
@end example

The definitions are only available during the macro-expansion of
@var{body}, and are thus not present during execution of compiled
code.

@code{rx-let} can be used not only inside a function, but also at top
level to include global variable and function definitions that need
to share a common set of @code{rx} forms. Since the names are local
inside @var{body}, there is no need for any package prefixes.
Example:

@example
@group
(rx-let ((phone-number (seq (opt ?+) (1+ (any digit ?-)))))
  (defun find-next-phone-number ()
    (re-search-forward (rx phone-number)))
  (defun phone-number-p (string)
    (string-match-p (rx bos phone-number eos) string)))
@end group
@end example

The scope of the @code{rx-let} bindings is lexical, which means that
they are not visible outside @var{body} itself, even in functions
called from @var{body}.
@end defmac

@defmac rx-let-eval bindings body@dots{}
Evaluate @var{bindings} to a list of bindings as in @code{rx-let},
and evaluate @var{body} with those bindings in effect for calls
to @code{rx-to-string}.

This macro is similar to @code{rx-let}, except that the @var{bindings}
argument is evaluated (and thus needs to be quoted if it is a list
literal), and the definitions are substituted at run time, which is
required for @code{rx-to-string} to work. Example:

@example
@group
(rx-let-eval
    '((ponder (x) (seq "Where have all the " x " gone?")))
  (looking-at (rx-to-string
               '(ponder (or "flowers" "young girls"
                            "left socks")))))
@end group
@end example

Another difference from @code{rx-let} is that the @var{bindings} are
dynamically scoped, and thus also available in functions called from
@var{body}. However, they are not visible inside functions defined in
@var{body}.
@end defmac
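
The dynamic scoping of @code{rx-let-eval} can be sketched like this. The function names and the @code{sketch-num} definition are hypothetical and not part of this change. The helper mentions @code{sketch-num} without defining it, and works only when called while a binding is in effect:

```elisp
(require 'rx)

;; Hypothetical helper, for illustration only.  It refers to
;; `sketch-num', which has no global definition.
(defun sketch-number-regexp ()
  (rx-to-string '(seq bol sketch-num eol) t))

(defun sketch-number-regexp-with-bindings ()
  ;; The binding is established here but used inside the callee.
  (rx-let-eval '((sketch-num (one-or-more digit)))
    (sketch-number-regexp)))
```

Calling @code{sketch-number-regexp} on its own is expected to signal an error, because @code{sketch-num} is then unknown to @code{rx-to-string}.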

@end ifnottex

@node Regexp Functions

4
etc/NEWS
@@ -1725,6 +1725,10 @@ This also works for their aliases: '|' for 'or'; ':', 'and' and
In this case, 'rx' will generate code which produces a regexp string
at run time, instead of a constant string.

---
*** New rx extension mechanism: 'rx-define', 'rx-let', 'rx-let-eval'.
These macros add new forms to the rx notation.

** Frames

+++
@@ -97,6 +97,7 @@ Most of the names are from SRE.")

(defvar rx-constituents nil
  "Alist of old-style rx extensions, for compatibility.
For new code, use `rx-define', `rx-let' or `rx-let-eval'.

Each element is (SYMBOL . DEF).

@@ -113,6 +114,17 @@ If DEF is a list on the form (FUN MIN-ARGS MAX-ARGS PRED), then
If PRED is non-nil, it is a predicate that all actual arguments must
satisfy.")

(defvar rx--local-definitions nil
  "Alist of dynamic local rx definitions.
Each entry is:
 (NAME DEF)      -- NAME is an rx symbol defined as the rx form DEF.
 (NAME ARGS DEF) -- NAME is an rx form with arglist ARGS, defined
                    as the rx form DEF (which can contain members of ARGS).")

(defsubst rx--lookup-def (name)
  (or (cdr (assq name rx--local-definitions))
      (get name 'rx-definition)))

;; TODO: Additions to consider:
;; - A better name for `anything', like `any-char' or `anychar'.
;; - A name for (or), maybe `unmatchable'.
@@ -144,6 +156,12 @@ If DEF is a list on the form (FUN MIN-ARGS MAX-ARGS PRED), then
      ((let ((class (cdr (assq sym rx--char-classes))))
         (and class (cons (list (concat "[[:" (symbol-name class) ":]]")) t))))

      ((let ((definition (rx--lookup-def sym)))
         (and definition
              (if (cdr definition)
                  (error "Not an `rx' symbol definition: %s" sym)
                (rx--translate (nth 0 definition))))))

      ;; For compatibility with old rx.
      ((let ((entry (assq sym rx-constituents)))
         (and (progn
@@ -310,6 +328,19 @@ INTERVALS is a list of (START . END) with START ≤ END, sorted by START."
        (setq tail d)))
    intervals))

;; FIXME: Consider expanding definitions inside (any ...) and (not ...),
;; and perhaps allow (any ...) inside (any ...).
;; It would benefit composability (build a character alternative by pieces)
;; and be handy for obtaining the complement of a defined set of
;; characters. (See, for example, python.el:421, `not-simple-operator'.)
;; (Expansion in other non-rx positions is probably not a good idea:
;; syntax, category, backref, and the integer parameters of group-n,
;; =, >=, **, repeat)
;; Similar effect could be attained by ensuring that
;; (or (any X) (any Y)) -> (any X Y), and find a way to compose negative
;; sets. `and' is taken, but we could add
;; (intersection (not (any X)) (not (any Y))) -> (not (any X Y)).

(defun rx--translate-any (negated body)
  "Translate an (any ...) construct. Return (REGEXP . PRECEDENCE).
If NEGATED, negate the sense."
@@ -712,6 +743,94 @@ DEF is the definition tuple. Return (REGEXP . PRECEDENCE)."
        (error "The `%s' form did not expand to a string" (car form)))
      (cons (list regexp) nil))))

(defun rx--substitute (bindings form)
  "Substitute BINDINGS in FORM. BINDINGS is an alist of (NAME . VALUES)
where VALUES is a list to splice into FORM wherever NAME occurs.
Return the substitution result wrapped in a list, since a single value
can expand to any number of values."
  (cond ((symbolp form)
         (let ((binding (assq form bindings)))
           (if binding
               (cdr binding)
             (list form))))
        ((consp form)
         (if (listp (cdr form))
             ;; Proper list. We substitute variables even in the head
             ;; position -- who knows, might be handy one day.
             (list (mapcan (lambda (x) (copy-sequence
                                        (rx--substitute bindings x)))
                           form))
           ;; Cons pair (presumably an interval).
           (let ((first (rx--substitute bindings (car form)))
                 (second (rx--substitute bindings (cdr form))))
             (if (and first (not (cdr first))
                      second (not (cdr second)))
                 (list (cons (car first) (car second)))
               (error
                "Cannot substitute a &rest parameter into a dotted pair")))))
        (t (list form))))
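
To make the splicing behaviour concrete, here is a small sketch that calls `rx--substitute' directly. It assumes rx.el as of this change is loaded; the variable name and the bindings are hypothetical. `x' is bound to one value and `r' to two, as a `&rest' parameter would be:

```elisp
(require 'rx)

;; Hypothetical variable, for illustration only.
;; Unbound names (`seq') and non-symbols ("!") are kept unchanged;
;; the values of `r' are spliced in where `r' occurs.
(defvar sketch-substituted
  (rx--substitute '((x "a") (r "b" "c")) '(seq x r "!")))
;; sketch-substituted is ((seq "a" "b" "c" "!"))
```

The result is wrapped in a one-element list, which is what lets `rx--expand-template' check that a whole template expanded to exactly one form.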

;; FIXME: Consider adding extensions in Lisp macro style, where
;; arguments are passed unevaluated to code that returns the rx form
;; to use. Example:
;;
;;   (rx-let ((radix-digit (radix)
;;             :lisp (list 'any (cons ?0 (+ ?0 (eval radix) -1)))))
;;     (rx (radix-digit (+ 5 3))))
;; =>
;;   "[0-7]"
;;
;; While this would permit more powerful extensions, it's unclear just
;; how often they would be used in practice. Let's wait until there is
;; demand for it.

;; FIXME: An alternative binding syntax would be
;;
;;   (NAME RXs...)
;; and
;;   ((NAME ARGS...) RXs...)
;;
;; which would have two minor advantages: multiple RXs with implicit
;; `seq' in the definition, and the arglist is no longer an optional
;; element in the middle of the list. On the other hand, it's less
;; like traditional lisp arglist constructs (defun, defmacro).
;; Since it's a Scheme-like syntax, &rest parameters could be done using
;; dotted lists:
;;  (rx-let (((name arg1 arg2 . rest) ...definition...)) ...)

(defun rx--expand-template (op values arglist template)
  "Return TEMPLATE with variables in ARGLIST replaced with VALUES."
  (let ((bindings nil)
        (value-tail values)
        (formals arglist))
    (while formals
      (pcase (car formals)
        ('&rest
         (unless (cdr formals)
           (error
            "Expanding rx def `%s': missing &rest parameter name" op))
         (push (cons (cadr formals) value-tail) bindings)
         (setq formals nil)
         (setq value-tail nil))
        (name
         (unless value-tail
           (error
            "Expanding rx def `%s': too few arguments (got %d, need %s%d)"
            op (length values)
            (if (memq '&rest arglist) "at least " "")
            (- (length arglist) (length (memq '&rest arglist)))))
         (push (cons name (list (car value-tail))) bindings)
         (setq value-tail (cdr value-tail))))
      (setq formals (cdr formals)))
    (when value-tail
      (error
       "Expanding rx def `%s': too many arguments (got %d, need %d)"
       op (length values) (length arglist)))
    (let ((subst (rx--substitute bindings template)))
      (if (and subst (not (cdr subst)))
          (car subst)
        (error "Expanding rx def `%s': must result in a single value" op)))))

(defun rx--translate-form (form)
  "Translate an rx form (list structure). Return (REGEXP . PRECEDENCE)."
  (let ((body (cdr form)))
@@ -757,24 +876,29 @@ DEF is the definition tuple. Return (REGEXP . PRECEDENCE)."
      (op
       (unless (symbolp op)
         (error "Bad rx operator `%S'" op))
       (let ((definition (rx--lookup-def op)))
         (if definition
             (if (cdr definition)
                 (rx--translate
                  (rx--expand-template
                   op body (nth 0 definition) (nth 1 definition)))
               (error "Not an `rx' form definition: %s" op))

       ;; For compatibility with old rx.
       (let ((entry (assq op rx-constituents)))
         (if (progn
               (while (and entry (not (consp (cdr entry))))
                 (setq entry
                       (if (symbolp (cdr entry))
                           ;; Alias for another entry.
                           (assq (cdr entry) rx-constituents)
                         ;; Wrong type, try further down the list.
                         (assq (car entry)
                               (cdr (memq entry rx-constituents))))))
               entry)
             (rx--translate-compat-form (cdr entry) form)
           (error "Unknown rx form `%s'" op)))))))
           ;; For compatibility with old rx.
           (let ((entry (assq op rx-constituents)))
             (if (progn
                   (while (and entry (not (consp (cdr entry))))
                     (setq entry
                           (if (symbolp (cdr entry))
                               ;; Alias for another entry.
                               (assq (cdr entry) rx-constituents)
                             ;; Wrong type, try further down the list.
                             (assq (car entry)
                                   (cdr (memq entry rx-constituents))))))
                   entry)
                 (rx--translate-compat-form (cdr entry) form)
               (error "Unknown rx form `%s'" op)))))))))

;; Defined here rather than in re-builder to raise the odds that it
;; will be kept in sync with changes.
(defconst rx--builtin-forms
  '(seq sequence : and or | any in char not-char not
    repeat = >= **
@@ -786,7 +910,21 @@ DEF is the definition tuple. Return (REGEXP . PRECEDENCE)."
    group submatch group-n submatch-n backref
    syntax not-syntax category
    literal eval regexp regex)
  "List of built-in rx forms. For use in re-builder only.")
  "List of built-in rx function-like symbols.")

(defconst rx--builtin-symbols
  (append '(nonl not-newline any anything
            bol eol line-start line-end
            bos eos string-start string-end
            bow eow word-start word-end
            symbol-start symbol-end
            point word-boundary not-word-boundary not-wordchar)
          (mapcar #'car rx--char-classes))
  "List of built-in rx variable-like symbols.")

(defconst rx--builtin-names
  (append rx--builtin-forms rx--builtin-symbols)
  "List of built-in rx names. These cannot be redefined by the user.")

(defun rx--translate (item)
  "Translate the rx-expression ITEM. Return (REGEXP . PRECEDENCE)."
@@ -810,7 +948,9 @@ DEF is the definition tuple. Return (REGEXP . PRECEDENCE)."
The arguments to `literal' and `regexp' forms inside FORM must be
constant strings.
If NO-GROUP is non-nil, don't bracket the result in a non-capturing
group."
group.

For extending the `rx' notation in FORM, use `rx-define' or `rx-let-eval'."
  (let* ((item (rx--translate form))
         (exprs (if no-group
                    (car item)
@@ -939,14 +1079,133 @@ Zero-width assertions: these all match the empty string in specific places.
(regexp EXPR) Match the string regexp from evaluating EXPR at run time.
(eval EXPR) Match the rx sexp from evaluating EXPR at compile time.

\(fn REGEXPS...)"
  (rx--to-expr (cons 'seq regexps)))
Additional constructs can be defined using `rx-define' and `rx-let',
which see.

\(fn REGEXPS...)"
  ;; Retrieve local definitions from the macroexpansion environment.
  ;; (It's unclear whether the previous value of `rx--local-definitions'
  ;; should be included, and if so, in which order.)
  (let ((rx--local-definitions
         (cdr (assq :rx-locals macroexpand-all-environment))))
    (rx--to-expr (cons 'seq regexps))))

(defun rx--make-binding (name tail)
  "Make a definitions entry out of TAIL.
TAIL is of the form ([ARGLIST] DEFINITION)."
  (unless (symbolp name)
    (error "Bad `rx' definition name: %S" name))
  ;; FIXME: Consider using a hash table or symbol property, for speed.
  (when (memq name rx--builtin-names)
    (error "Cannot redefine built-in rx name `%s'" name))
  (pcase tail
    (`(,def)
     (list def))
    (`(,args ,def)
     (unless (and (listp args) (rx--every #'symbolp args))
       (error "Bad argument list for `rx' definition %s: %S" name args))
     (list args def))
    (_ (error "Bad `rx' definition of %s: %S" name tail))))

(defun rx--make-named-binding (bindspec)
  "Make a definitions entry out of BINDSPEC.
BINDSPEC is of the form (NAME [ARGLIST] DEFINITION)."
  (unless (consp bindspec)
    (error "Bad `rx-let' binding: %S" bindspec))
  (cons (car bindspec)
        (rx--make-binding (car bindspec) (cdr bindspec))))

(defun rx--extend-local-defs (bindspecs)
  (append (mapcar #'rx--make-named-binding bindspecs)
          rx--local-definitions))

;;;###autoload
(defmacro rx-let-eval (bindings &rest body)
  "Evaluate BODY with local BINDINGS for `rx-to-string'.
BINDINGS, after evaluation, is a list of definitions each of the form
(NAME [(ARGS...)] RX), in effect for calls to `rx-to-string'
in BODY.

For bindings without an ARGS list, NAME is defined as an alias
for the `rx' expression RX. Where ARGS is supplied, NAME is
defined as an `rx' form with ARGS as argument list. The
parameters are bound from the values in the (NAME ...) form and
are substituted in RX. ARGS can contain `&rest' parameters,
whose values are spliced into RX where the parameter name occurs.

Any previous definitions with the same names are shadowed during
the expansion of BODY only.
For extensions when using the `rx' macro, use `rx-let'.
To make global rx extensions, use `rx-define'.
For more details, see Info node `(elisp) Extending Rx'.

\(fn BINDINGS BODY...)"
  (declare (indent 1) (debug (form body)))
  ;; FIXME: this way, `rx--extend-local-defs' may need to be autoloaded.
  `(let ((rx--local-definitions (rx--extend-local-defs ,bindings)))
     ,@body))

;;;###autoload
(defmacro rx-let (bindings &rest body)
  "Evaluate BODY with local BINDINGS for `rx'.
BINDINGS is an unevaluated list of bindings each of the form
(NAME [(ARGS...)] RX).
They are bound lexically and are available in `rx' expressions in
BODY only.

For bindings without an ARGS list, NAME is defined as an alias
for the `rx' expression RX. Where ARGS is supplied, NAME is
defined as an `rx' form with ARGS as argument list. The
parameters are bound from the values in the (NAME ...) form and
are substituted in RX. ARGS can contain `&rest' parameters,
whose values are spliced into RX where the parameter name occurs.

Any previous definitions with the same names are shadowed during
the expansion of BODY only.
For local extensions to `rx-to-string', use `rx-let-eval'.
To make global rx extensions, use `rx-define'.
For more details, see Info node `(elisp) Extending Rx'.

\(fn BINDINGS BODY...)"
  (declare (indent 1) (debug (sexp body)))
  (let ((prev-locals (cdr (assq :rx-locals macroexpand-all-environment)))
        (new-locals (mapcar #'rx--make-named-binding bindings)))
    (macroexpand-all (cons 'progn body)
                     (cons (cons :rx-locals (append new-locals prev-locals))
                           macroexpand-all-environment))))

;;;###autoload
(defmacro rx-define (name &rest definition)
  "Define NAME as a global `rx' definition.
If the ARGS list is omitted, define NAME as an alias for the `rx'
expression RX.

If the ARGS list is supplied, define NAME as an `rx' form with
ARGS as argument list. The parameters are bound from the values
in the (NAME ...) form and are substituted in RX.
ARGS can contain `&rest' parameters, whose values are spliced
into RX where the parameter name occurs.

Any previous global definition of NAME is overwritten with the new one.
To make local rx extensions, use `rx-let' for `rx',
`rx-let-eval' for `rx-to-string'.
For more details, see Info node `(elisp) Extending Rx'.

\(fn NAME [(ARGS...)] RX)"
  (declare (indent 1))
  `(eval-and-compile
     (put ',name 'rx-definition ',(rx--make-binding name definition))
     ',name))

;; During `rx--pcase-transform', list of defined variables in right-to-left
;; order.
(defvar rx--pcase-vars)

;; FIXME: The rewriting strategy for pcase works so-so with extensions;
;; definitions cannot expand to `let' or named `backref'. If this ever
;; becomes a problem, we can handle those forms in the ordinary parser,
;; using a dynamic variable for activating the augmented forms.

(defun rx--pcase-transform (rx)
  "Transform RX, an rx-expression augmented with `let' and named `backref',
into a plain rx-expression, collecting names into `rx--pcase-vars'."
@@ -303,6 +303,104 @@
  (should (equal (rx-to-string '(or nonl "\nx") t)
                 ".\\|\nx")))

(ert-deftest rx-let ()
  (rx-let ((beta gamma)
           (gamma delta)
           (delta (+ digit))
           (epsilon (or gamma nonl)))
    (should (equal (rx bol delta epsilon)
                   "^[[:digit:]]+\\(?:[[:digit:]]+\\|.\\)")))
  (rx-let ((p () point)
           (separated (x sep) (seq x (* sep x)))
           (comma-separated (x) (separated x ","))
           (semi-separated (x) (separated x ";"))
           (matrix (v) (semi-separated (comma-separated v))))
    (should (equal (rx (p) (matrix (+ "a")) eos)
                   "\\=a+\\(?:,a+\\)*\\(?:;a+\\(?:,a+\\)*\\)*\\'")))
  (rx-let ((b bol)
           (z "B")
           (three (x) (= 3 x)))
    (rx-let ((two (x) (seq x x))
             (z "A")
             (e eol))
      (should (equal (rx b (two (three z)) e)
                     "^A\\{3\\}A\\{3\\}$"))))
  (rx-let ((f (a b &rest r) (seq "<" a ";" b ":" r ">")))
    (should (equal (rx bol (f ?x ?y) ?! (f ?u ?v ?w) ?! (f ?k ?l ?m ?n) eol)
                   "^<x;y:>!<u;v:w>!<k;l:mn>$")))

  ;; Rest parameters are expanded by splicing.
  (rx-let ((f (&rest r) (or bol r eol)))
    (should (equal (rx (f "ab" nonl))
                   "^\\|ab\\|.\\|$")))

  ;; Substitution is done in number positions.
  (rx-let ((stars (n) (= n ?*)))
    (should (equal (rx (stars 4))
                   "\\*\\{4\\}")))

  ;; Substitution is done inside dotted pairs.
  (rx-let ((f (x y z) (any x (y . z))))
    (should (equal (rx (f ?* ?a ?t))
                   "[*a-t]")))

  ;; Substitution is done in the head position of forms.
  (rx-let ((f (x) (x "a")))
    (should (equal (rx (f +))
                   "a+"))))

(ert-deftest rx-define ()
  (rx-define rx--a (seq "x" (opt "y")))
  (should (equal (rx bol rx--a eol)
                 "^xy?$"))
  (rx-define rx--c (lb rb &rest stuff) (seq lb stuff rb))
  (should (equal (rx bol (rx--c "<" ">" rx--a nonl) eol)
                 "^<xy?.>$"))
  (rx-define rx--b (* rx--a))
  (should (equal (rx rx--b)
                 "\\(?:xy?\\)*"))
  (rx-define rx--a "z")
  (should (equal (rx rx--b)
                 "z*")))

(defun rx--test-rx-to-string-define ()
  ;; `rx-define' won't expand to code inside `ert-deftest' since we use
  ;; `eval-and-compile'. Put it into a defun as a workaround.
  (rx-define rx--d "Q")
  (rx-to-string '(seq bol rx--d) t))

(ert-deftest rx-to-string-define ()
  "Check that `rx-to-string' uses definitions made by `rx-define'."
  (should (equal (rx--test-rx-to-string-define)
                 "^Q")))

(ert-deftest rx-let-define ()
  "Test interaction between `rx-let' and `rx-define'."
  (rx-define rx--e "one")
  (rx-define rx--f "eins")
  (rx-let ((rx--e "two"))
    (should (equal (rx rx--e nonl rx--f) "two.eins"))
    (rx-define rx--e "three")
    (should (equal (rx rx--e) "two"))
    (rx-define rx--f "zwei")
    (should (equal (rx rx--f) "zwei")))
  (should (equal (rx rx--e nonl rx--f) "three.zwei")))

(ert-deftest rx-let-eval ()
  (rx-let-eval '((a (* digit))
                 (f (x &rest r) (seq x nonl r)))
    (should (equal (rx-to-string '(seq a (f bow a ?b)) t)
                   "[[:digit:]]*\\<.[[:digit:]]*b"))))

(ert-deftest rx-redefine-builtin ()
  (should-error (rx-define sequence () "x"))
  (should-error (rx-define sequence "x"))
  (should-error (rx-define nonl () "x"))
  (should-error (rx-define nonl "x"))
  (should-error (rx-let ((punctuation () "x")) nil))
  (should-error (rx-let ((punctuation "x")) nil))
  (should-error (rx-let-eval '((not-char () "x")) nil))
  (should-error (rx-let-eval '((not-char "x")) nil)))

(ert-deftest rx-constituents ()
  (let ((rx-constituents