Function: rx

rx is an autoloaded macro defined in rx.el.gz.

Signature

(rx REGEXPS...)

Documentation

Translate regular expressions REGEXPS in sexp form to a regexp string.

Each argument is one of the forms below; RX is a subform, and RX... stands for zero or more RXs. For details, see Info node (elisp) Rx Notation. See rx-to-string for the corresponding function.
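
Because rx is a macro, its arguments must be known when the form is expanded. When the sexp is only built at run time, rx-to-string accepts it as an ordinary value. A minimal sketch, where my-any-of-regexp is a hypothetical helper and not part of rx.el:

```elisp
(require 'rx)

;; Hypothetical helper: build a regexp matching any string in STRINGS.
;; The list is only known at run time, so `rx-to-string' is used
;; instead of the `rx' macro.
(defun my-any-of-regexp (strings)
  (rx-to-string `(or ,@strings)))

(string-match-p (my-any-of-regexp '("foo" "bar")) "a bar")  ; => 2
(string-match-p (my-any-of-regexp '("foo" "bar")) "a baz")  ; => nil
```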

STRING         Match a literal string.
CHAR           Match a literal character.

(seq RX...)    Match the RXs in sequence.  Alias: :, sequence, and.
(or RX...)     Match one of the RXs.  Alias: |.
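
As a rough illustration of seq and or: the top-level arguments of rx are already matched in sequence, so an explicit seq is mostly needed inside other forms. The constant name below is hypothetical.

```elisp
(require 'rx)

;; Hypothetical constant: match a ".el" or ".elc" suffix at the end
;; of a file name.  The three top-level arguments form an implicit `seq'.
(defconst my-elisp-file-regexp
  (rx "." (or "el" "elc") string-end))

(string-match-p my-elisp-file-regexp "init.el")    ; => 4
(string-match-p my-elisp-file-regexp "init.elc")   ; => 4
(string-match-p my-elisp-file-regexp "init.txt")   ; => nil
```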

(zero-or-more RX...) Match RXs zero or more times.  Alias: 0+.
(one-or-more RX...)  Match RXs one or more times.  Alias: 1+.
(zero-or-one RX...)  Match RXs or the empty string.  Alias: opt, optional.
(* RX...)       Match RXs zero or more times; greedy.
(+ RX...)       Match RXs one or more times; greedy.
(? RX...)       Match RXs or the empty string; greedy.
(*? RX...)      Match RXs zero or more times; non-greedy.
(+? RX...)      Match RXs one or more times; non-greedy.
(?? RX...)      Match RXs or the empty string; non-greedy.
(= N RX...)     Match RXs exactly N times.
(>= N RX...)    Match RXs N or more times.
(** N M RX...)  Match RXs N to M times.  Alias: repeat.
(minimal-match RX)  Match RX, with zero-or-more, one-or-more, zero-or-one
                and aliases using non-greedy matching.
(maximal-match RX)  Match RX, with zero-or-more, one-or-more, zero-or-one
                and aliases using greedy matching, which is the default.
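
The difference between greedy and non-greedy repetition is easiest to see on a string with several candidate end points. The sketch below uses a hypothetical helper, my-first-match, to return the matched text.

```elisp
(require 'rx)

;; Hypothetical helper: return the text matched by REGEXP in STRING, or nil.
(defun my-first-match (regexp string)
  (when (string-match regexp string)
    (match-string 0 string)))

;; Greedy `*' extends to the last ">"; non-greedy `*?' stops at the first.
(my-first-match (rx "<" (* nonl) ">") "<a><b>")    ; => "<a><b>"
(my-first-match (rx "<" (*? nonl) ">") "<a><b>")   ; => "<a>"

;; `minimal-match' makes `zero-or-more' behave like `*?'.
(my-first-match (rx (minimal-match (seq "<" (zero-or-more nonl) ">")))
                "<a><b>")                          ; => "<a>"

;; Counted repetition.
(my-first-match (rx (= 4 digit)) "year 2024")      ; => "2024"
(my-first-match (rx (** 2 3 "ab")) "abababab")     ; => "ababab"
```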

(any SET...)    Match a character from one of the SETs.  Each SET is a
                character, a string, a range as string "A-Z" or cons
                (?A . ?Z), or a character class (see below).  Alias: in, char.
(not CHARSPEC)  Match one character not matched by CHARSPEC.  CHARSPEC
                can be a character, single-char string, (any ...), (or ...),
                (intersection ...), (syntax ...), (category ...),
                or a character class.
(intersection CHARSET...) Match all CHARSETs.
                CHARSET is (any...), (not...), (or...) or (intersection...),
                a character or a single-char string.
not-newline     Match any character except a newline.  Alias: nonl.
anychar         Match any character.  Alias: anything.
unmatchable     Never match anything at all.
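
A short sketch of the character-set forms follows. The constant name is hypothetical, and the intersection example assumes an Emacs version that provides that form, as the list above does.

```elisp
(require 'rx)

;; Hypothetical constant: a set built from two ranges,
;; matching strings made only of lower-case hex digits.
(defconst my-lower-hex-regexp
  (rx string-start (+ (any "a-f" "0-9")) string-end))

(string-match-p my-lower-hex-regexp "c0ffee")   ; => 0
(string-match-p my-lower-hex-regexp "coffee")   ; => nil, ?o is outside a-f

;; First character that is not a digit.
(string-match-p (rx (not (any "0-9"))) "123x")  ; => 3

;; Lower-case ASCII consonants: in a-z and not a vowel.
(string-match-p (rx (intersection (any "a-z") (not (any "aeiou"))))
                "aeiouz")                       ; => 5
```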

CHARCLASS       Match a character from a character class.  One of:
 alpha, alphabetic, letter   Alphabetic characters (defined by Unicode).
 alnum, alphanumeric         Alphabetic or decimal digit chars (Unicode).
 digit, numeric, num         0-9.
 xdigit, hex-digit, hex      0-9, A-F, a-f.
 cntrl, control              ASCII codes 0-31.
 blank                       Horizontal whitespace (Unicode).
 space, whitespace, white    Chars with whitespace syntax.
 lower, lower-case           Lower-case chars, from current case table.
 upper, upper-case           Upper-case chars, from current case table.
 graph, graphic              Graphic characters (Unicode).
 print, printing             Whitespace or graphic (Unicode).
 punct, punctuation          Not control, space, letter or digit (ASCII);
                              not word syntax (non-ASCII).
 word, wordchar              Characters with word syntax.
 ascii                       ASCII characters (codes 0-127).
 nonascii                    Non-ASCII characters (but not raw bytes).
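
Character classes can stand alone as an RX or appear as a SET inside any. A minimal sketch, with a hypothetical constant name:

```elisp
(require 'rx)

;; A class used directly as an RX.
(string-match-p (rx string-start (+ alpha) (+ digit) string-end)
                "abc123")                        ; => 0
(string-match-p (rx string-start (+ alpha) (+ digit) string-end)
                "123abc")                        ; => nil

;; Hypothetical constant: a class combined with a literal
;; character inside `any'.
(defconst my-digits-and-dashes-regexp
  (rx string-start (+ (any digit "-")) string-end))

(string-match-p my-digits-and-dashes-regexp "555-0100")  ; => 0
(string-match-p my-digits-and-dashes-regexp "555 0100")  ; => nil

;; `blank' covers both the space and the tab here.
(string-match-p (rx (+ blank)) "a \tb")          ; => 1
```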

(syntax SYNTAX) Match a character with syntax SYNTAX, being one of:
  whitespace, punctuation, word, symbol, open-parenthesis,
  close-parenthesis, expression-prefix, string-quote,
  paired-delimiter, escape, character-quote, comment-start,
  comment-end, string-delimiter, comment-delimiter
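
What a syntax form matches depends on the syntax table in effect during the search. The sketch below pins the standard syntax table so the result does not depend on the current major mode. The helper name is hypothetical.

```elisp
(require 'rx)

;; Hypothetical helper: return the buffer position just after the first
;; character with open-parenthesis syntax in TEXT, or nil if there is none.
(defun my-after-first-open-paren (text)
  (with-temp-buffer
    (with-syntax-table (standard-syntax-table)
      (insert text)
      (goto-char (point-min))
      (re-search-forward (rx (syntax open-parenthesis)) nil t))))

(my-after-first-open-paren "foo (bar)")  ; => 6
(my-after-first-open-paren "foo bar")    ; => nil
```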

(category CAT) Match a character in category CAT, being one of:
  space-for-indent, base, consonant, base-vowel,
  upper-diacritical-mark, lower-diacritical-mark, tone-mark, symbol,
  digit, vowel-modifying-diacritical-mark, vowel-sign,
  semivowel-lower, not-at-end-of-line, not-at-beginning-of-line,
  alpha-numeric-two-byte, chinese-two-byte, greek-two-byte,
  japanese-hiragana-two-byte, indian-two-byte,
  japanese-katakana-two-byte, strong-left-to-right,
  korean-hangul-two-byte, strong-right-to-left, cyrillic-two-byte,
  combining-diacritic, ascii, arabic, chinese, ethiopic, greek,
  korean, indian, japanese, japanese-katakana, latin, lao,
  tibetan, japanese-roman, thai, vietnamese, hebrew, cyrillic,
  can-break
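
A category form can pick out characters of one script. This sketch assumes the default category table that Emacs sets up at startup, in which Greek letters carry the greek category and ASCII letters do not.

```elisp
(require 'rx)

;; "\u03bb" is GREEK SMALL LETTER LAMDA, written as an escape so the
;; example does not depend on the file's encoding.
(string-match-p (rx (category greek)) "abc \u03bb")  ; => 4
(string-match-p (rx (category greek)) "abc")         ; => nil
```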

Zero-width assertions: these all match the empty string in specific places.
 line-start         At the beginning of a line.  Alias: bol.
 line-end           At the end of a line.  Alias: eol.
 string-start       At the start of the string or buffer.
                     Alias: buffer-start, bos, bot.
 string-end         At the end of the string or buffer.
                     Alias: buffer-end, eos, eot.
 point              At point.
 word-start         At the beginning of a word.  Alias: bow.
 word-end           At the end of a word.  Alias: eow.
 word-boundary      At the beginning or end of a word.
 not-word-boundary  Not at the beginning or end of a word.
 symbol-start       At the beginning of a symbol.
 symbol-end         At the end of a symbol.
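
The assertions consume no text; they only constrain where a match may occur. A brief sketch (word boundaries follow the current syntax table, in which letters are assumed to have word syntax and the space is not):

```elisp
(require 'rx)

;; "cat" inside "concat" is skipped because it does not start a word.
(string-match-p (rx word-start "cat" word-end) "concat cat")  ; => 7

;; `line-start' matches after any newline; `string-start' only at index 0.
(string-match-p (rx line-start "b") "a\nb")     ; => 2
(string-match-p (rx string-start "b") "a\nb")   ; => nil
```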

(group RX...)  Match RXs and define a capture group.  Alias: submatch.
(group-n N RX...) Match RXs and define capture group N.  Alias: submatch-n.
(backref N)    Match the text that capture group N matched.
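
Capture groups are read back with match-string after a successful match. A minimal sketch, where my-parse-pair is a hypothetical helper:

```elisp
(require 'rx)

;; Hypothetical helper: split "NAME=NUMBER" into (NAME . NUMBER),
;; or return nil if STRING does not have that shape.
(defun my-parse-pair (string)
  (when (string-match (rx string-start
                          (group (+ alpha)) "=" (group (+ digit))
                          string-end)
                      string)
    (cons (match-string 1 string) (match-string 2 string))))

(my-parse-pair "width=80")   ; => ("width" . "80")
(my-parse-pair "width")      ; => nil

;; `backref' requires the same text again: here, a repeated word.
(string-match-p (rx (group (+ alpha)) " " (backref 1)) "the the cat")  ; => 0
(string-match-p (rx (group (+ alpha)) " " (backref 1)) "the cat")      ; => nil
```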

(literal EXPR) Match the literal string from evaluating EXPR at run time.
(regexp EXPR)  Match the string regexp from evaluating EXPR at run time.
(eval EXPR)    Match the rx sexp from evaluating EXPR at macro-expansion
                (compile) time.
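
The three forms differ in when EXPR is evaluated and in how its value is interpreted. The sketch below uses hypothetical names throughout. The eval-and-compile wrapper is one way to make the variable available at macro-expansion time.

```elisp
(require 'rx)

;; `literal': KEY is evaluated at run time and matched verbatim,
;; so the "." in "a.b" is not a wildcard.
(defun my-setting-regexp (key)
  (rx line-start (literal key) "=" (group (* nonl))))

(string-match-p (my-setting-regexp "a.b") "a.b=1")  ; => 0
(string-match-p (my-setting-regexp "a.b") "axb=1")  ; => nil

;; `regexp': splice in an existing regexp string at run time.
(defvar my-number-regexp "[0-9]+")
(string-match-p (rx "v" (regexp my-number-regexp) string-end) "rev42")  ; => 2

;; `eval': the expression is evaluated when the `rx' form is expanded,
;; so the variable must already be defined at that point.
(eval-and-compile
  (defconst my-keywords '("if" "else")))
(string-match-p (rx symbol-start (eval (cons 'or my-keywords)) symbol-end)
                "x else y")                         ; => 2
```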

Additional constructs can be defined using rx-define and rx-let, which see.
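
As a rough sketch of both mechanisms: rx-define installs a named form globally, while rx-let scopes definitions to its body and also accepts parameters. All names below are hypothetical.

```elisp
(require 'rx)

;; Global definitions; later `rx' forms may use them by name.
(rx-define my-octet (** 1 3 digit))
(rx-define my-ipv4 (seq my-octet (= 3 "." my-octet)))

(string-match-p (rx string-start my-ipv4 string-end) "192.168.0.1")  ; => 0
(string-match-p (rx string-start my-ipv4 string-end) "192.168.0")    ; => nil

;; A local, parameterised definition: text enclosed by the character CH.
(rx-let ((my-quoted (ch) (seq ch (* (not (any ch))) ch)))
  (string-match-p (rx (my-quoted ?\")) "say \"hi\""))                ; => 4
```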

Other relevant functions are documented in the regexp group.

Probably introduced at or before Emacs version 21.1.

Shortdoc

;; regexp
(rx "IP=" (+ digit) (= 3 "." (+ digit)))
    => "IP=[[:digit:]]+\\(?:\\.[[:digit:]]+\\)\\{3\\}"
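
The resulting string is an ordinary Emacs regexp and can be passed to the usual matching functions. A minimal usage sketch, with a hypothetical helper name:

```elisp
(require 'rx)

;; Hypothetical helper: extract the first "IP=a.b.c.d" token from STRING.
(defun my-find-ip-token (string)
  (when (string-match (rx "IP=" (+ digit) (= 3 "." (+ digit))) string)
    (match-string 0 string)))

(my-find-ip-token "host IP=10.0.0.1 up")  ; => "IP=10.0.0.1"
(my-find-ip-token "host IP=10.0 up")      ; => nil
```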

Source Code

;; Defined in /usr/src/emacs/lisp/emacs-lisp/rx.el.gz

;;;###autoload
(defmacro rx (&rest regexps)
  "Translate regular expressions REGEXPS in sexp form to a regexp string.
Each argument is one of the forms below; RX is a subform, and RX... stands
for zero or more RXs.  For details, see Info node `(elisp) Rx Notation'.
See `rx-to-string' for the corresponding function.

STRING         Match a literal string.
CHAR           Match a literal character.

(seq RX...)    Match the RXs in sequence.  Alias: :, sequence, and.
(or RX...)     Match one of the RXs.  Alias: |.

(zero-or-more RX...) Match RXs zero or more times.  Alias: 0+.
(one-or-more RX...)  Match RXs one or more times.  Alias: 1+.
(zero-or-one RX...)  Match RXs or the empty string.  Alias: opt, optional.
(* RX...)       Match RXs zero or more times; greedy.
(+ RX...)       Match RXs one or more times; greedy.
(? RX...)       Match RXs or the empty string; greedy.
(*? RX...)      Match RXs zero or more times; non-greedy.
(+? RX...)      Match RXs one or more times; non-greedy.
(?? RX...)      Match RXs or the empty string; non-greedy.
(= N RX...)     Match RXs exactly N times.
(>= N RX...)    Match RXs N or more times.
(** N M RX...)  Match RXs N to M times.  Alias: repeat.
(minimal-match RX)  Match RX, with zero-or-more, one-or-more, zero-or-one
                and aliases using non-greedy matching.
(maximal-match RX)  Match RX, with zero-or-more, one-or-more, zero-or-one
                and aliases using greedy matching, which is the default.

(any SET...)    Match a character from one of the SETs.  Each SET is a
                character, a string, a range as string \"A-Z\" or cons
                (?A . ?Z), or a character class (see below).  Alias: in, char.
(not CHARSPEC)  Match one character not matched by CHARSPEC.  CHARSPEC
                can be a character, single-char string, (any ...), (or ...),
                (intersection ...), (syntax ...), (category ...),
                or a character class.
(intersection CHARSET...) Match all CHARSETs.
                CHARSET is (any...), (not...), (or...) or (intersection...),
                a character or a single-char string.
not-newline     Match any character except a newline.  Alias: nonl.
anychar         Match any character.  Alias: anything.
unmatchable     Never match anything at all.

CHARCLASS       Match a character from a character class.  One of:
 alpha, alphabetic, letter   Alphabetic characters (defined by Unicode).
 alnum, alphanumeric         Alphabetic or decimal digit chars (Unicode).
 digit, numeric, num         0-9.
 xdigit, hex-digit, hex      0-9, A-F, a-f.
 cntrl, control              ASCII codes 0-31.
 blank                       Horizontal whitespace (Unicode).
 space, whitespace, white    Chars with whitespace syntax.
 lower, lower-case           Lower-case chars, from current case table.
 upper, upper-case           Upper-case chars, from current case table.
 graph, graphic              Graphic characters (Unicode).
 print, printing             Whitespace or graphic (Unicode).
 punct, punctuation          Not control, space, letter or digit (ASCII);
                              not word syntax (non-ASCII).
 word, wordchar              Characters with word syntax.
 ascii                       ASCII characters (codes 0-127).
 nonascii                    Non-ASCII characters (but not raw bytes).

(syntax SYNTAX)  Match a character with syntax SYNTAX, being one of:
  whitespace, punctuation, word, symbol, open-parenthesis,
  close-parenthesis, expression-prefix, string-quote,
  paired-delimiter, escape, character-quote, comment-start,
  comment-end, string-delimiter, comment-delimiter

(category CAT)   Match a character in category CAT, being one of:
  space-for-indent, base, consonant, base-vowel,
  upper-diacritical-mark, lower-diacritical-mark, tone-mark, symbol,
  digit, vowel-modifying-diacritical-mark, vowel-sign,
  semivowel-lower, not-at-end-of-line, not-at-beginning-of-line,
  alpha-numeric-two-byte, chinese-two-byte, greek-two-byte,
  japanese-hiragana-two-byte, indian-two-byte,
  japanese-katakana-two-byte, strong-left-to-right,
  korean-hangul-two-byte, strong-right-to-left, cyrillic-two-byte,
  combining-diacritic, ascii, arabic, chinese, ethiopic, greek,
  korean, indian, japanese, japanese-katakana, latin, lao,
  tibetan, japanese-roman, thai, vietnamese, hebrew, cyrillic,
  can-break

Zero-width assertions: these all match the empty string in specific places.
 line-start         At the beginning of a line.  Alias: bol.
 line-end           At the end of a line.  Alias: eol.
 string-start       At the start of the string or buffer.
                     Alias: buffer-start, bos, bot.
 string-end         At the end of the string or buffer.
                     Alias: buffer-end, eos, eot.
 point              At point.
 word-start         At the beginning of a word.  Alias: bow.
 word-end           At the end of a word.  Alias: eow.
 word-boundary      At the beginning or end of a word.
 not-word-boundary  Not at the beginning or end of a word.
 symbol-start       At the beginning of a symbol.
 symbol-end         At the end of a symbol.

(group RX...)  Match RXs and define a capture group.  Alias: submatch.
(group-n N RX...) Match RXs and define capture group N.  Alias: submatch-n.
(backref N)    Match the text that capture group N matched.

(literal EXPR) Match the literal string from evaluating EXPR at run time.
(regexp EXPR)  Match the string regexp from evaluating EXPR at run time.
(eval EXPR)    Match the rx sexp from evaluating EXPR at macro-expansion
                (compile) time.

Additional constructs can be defined using `rx-define' and `rx-let',
which see.

\(fn REGEXPS...)"
  (rx--to-expr (cons 'seq regexps)))