Variable: rx--char-classes

rx--char-classes is a variable defined in rx.el.gz.

Value

((digit . digit) (numeric . digit) (num . digit) (control . cntrl)
 (cntrl . cntrl) (hex-digit . xdigit) (hex . xdigit) (xdigit . xdigit)
 (blank . blank) (graphic . graph) (graph . graph) (printing . print)
 (print . print) (alphanumeric . alnum) (alnum . alnum)
 (letter . alpha) (alphabetic . alpha) (alpha . alpha) (ascii . ascii)
 (nonascii . nonascii) (lower . lower) (lower-case . lower)
 (punctuation . punct) (punct . punct) (space . space)
 (whitespace . space) (white . space) (upper . upper)
 (upper-case . upper) (word . word) (wordchar . word)
 (unibyte . unibyte) (multibyte . multibyte))
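
As a minimal sketch of how the alist above can be read, the snippet below looks up the class for an rx symbol and collects every alias of a given class. It assumes an Emacs version whose rx.el defines `rx--char-classes' with the value listed above; the double dash marks the variable as internal, so it may change between releases. The helper `my-rx-class-aliases' is hypothetical and is not part of rx.el.

```emacs-lisp
(require 'rx)

;; Look up the regexp character class that an rx symbol maps to.
(alist-get 'numeric rx--char-classes)     ; => digit
(alist-get 'whitespace rx--char-classes)  ; => space

;; Hypothetical helper (not part of rx.el): collect every rx symbol
;; that maps to CLASS, in the order the entries appear in the alist.
(defun my-rx-class-aliases (class)
  "Return the rx symbols in `rx--char-classes' that map to CLASS."
  (let (aliases)
    (dolist (entry rx--char-classes (nreverse aliases))
      (when (eq (cdr entry) class)
        (push (car entry) aliases)))))

(my-rx-class-aliases 'space)  ; => (space whitespace white)
```

Because several keys share one value, the lookup goes from rx symbol to class; going the other way needs a scan such as the helper above.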

Documentation

Alist mapping rx symbols to character classes.

Most of the names are from SRE.
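
The effect of the mapping can be seen through the public `rx' macro: symbols that share a class should translate to the same bracket expression. The following is an illustrative sketch using only `rx' and `string-match-p'; the sample strings are made up for the example.

```emacs-lisp
(require 'rx)

;; `digit' and `numeric' both map to the class `digit',
;; so they translate to the same regexp string.
(rx digit)                         ; => "[[:digit:]]"
(equal (rx digit) (rx numeric))    ; => t

;; Class symbols can be used on their own or inside `any'.
(string-match-p (rx bos (+ hex-digit) eos) "1aF")             ; => 0
(string-match-p (rx bos (+ (any alpha digit)) eos) "abc123")  ; => 0
(string-match-p (rx bos (+ hex-digit) eos) "xyz")             ; => nil
```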

Source Code

;; Defined in /usr/src/emacs/lisp/emacs-lisp/rx.el.gz
;;; rx.el --- S-exp notation for regexps           --*- lexical-binding: t -*-

;; Copyright (C) 2001-2025 Free Software Foundation, Inc.

;; This file is part of GNU Emacs.

;; GNU Emacs is free software: you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.

;; GNU Emacs is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.

;;; Commentary:

;; This facility allows writing regexps in a sexp-based language
;; instead of strings.  Regexps in the `rx' notation are easier to
;; read, write and maintain; they can be indented and commented in a
;; natural way, and are easily composed by program code.
;; The translation to string regexp is done by a macro and does not
;; incur any extra processing during run time.  Example:
;;
;;  (rx bos (or (not "^")
;;              (seq "^" (or " *" "["))))
;;
;; => "\\`\\(?:[^^]\\|\\^\\(?: \\*\\|\\[\\)\\)"
;;
;; The notation is much influenced by and retains some compatibility with
;; Olin Shivers's SRE, with concessions to Emacs regexp peculiarities,
;; and the older Emacs package Sregex.

;;; Legacy syntax still accepted by rx:
;;
;; These are constructs from earlier rx and sregex implementations
;; that were mistakes, accidents or just not very good ideas in hindsight.

;; Obsolete: accepted but not documented
;;
;; Obsolete                     Preferred
;; --------------------------------------------------------
;; (not word-boundary)          not-word-boundary
;; (not-syntax X)               (not (syntax X))
;; not-wordchar                 (not wordchar)
;; (not-char ...)               (not (any ...))
;; any                          nonl, not-newline
;; (repeat N FORM)              (= N FORM)
;; (syntax CHARACTER)           (syntax NAME)
;; (syntax CHAR-SYM)      [1]   (syntax NAME)
;; (category chinse-two-byte)   (category chinese-two-byte)
;; unibyte                      ascii
;; multibyte                    nonascii
;; --------------------------------------------------------
;; [1]  where CHAR-SYM is a symbol with single-character name

;; Obsolescent: accepted and documented but discouraged
;;
;; Obsolescent                    Preferred
;; --------------------------------------------------------
;; (and ...)                      (seq ...), (: ...), (sequence ...)
;; anything                       anychar
;; minimal-match, maximal-match   lazy ops: ??, *?, +?

;; FIXME: Prepare a phase-out by emitting compile-time warnings about
;; at least some of the legacy constructs above.

;;; Code:


;; The `rx--translate...' functions below return (REGEXP . PRECEDENCE),
;; where REGEXP is a list of string expressions that will be
;; concatenated into a regexp, and PRECEDENCE is one of
;;
;;  t    -- can be used as argument to postfix operators (eg. "a")
;;  seq  -- can be concatenated in sequence with other seq or higher (eg. "ab")
;;  lseq -- can be concatenated to the left of rseq or higher (eg. "^a")
;;  rseq -- can be concatenated to the right of lseq or higher (eg. "a$")
;;  nil  -- can only be used in alternatives (eg. "a\\|b")
;;
;; They form a lattice:
;;
;;           t          highest precedence
;;           |
;;          seq
;;         /   \
;;      lseq   rseq
;;         \   /
;;          nil         lowest precedence


(defconst rx--char-classes
  '((digit         . digit)
    (numeric       . digit)
    (num           . digit)
    (control       . cntrl)
    (cntrl         . cntrl)
    (hex-digit     . xdigit)
    (hex           . xdigit)
    (xdigit        . xdigit)
    (blank         . blank)
    (graphic       . graph)
    (graph         . graph)
    (printing      . print)
    (print         . print)
    (alphanumeric  . alnum)
    (alnum         . alnum)
    (letter        . alpha)
    (alphabetic    . alpha)
    (alpha         . alpha)
    (ascii         . ascii)
    (nonascii      . nonascii)
    (lower         . lower)
    (lower-case    . lower)
    (punctuation   . punct)
    (punct         . punct)
    (space         . space)
    (whitespace    . space)
    (white         . space)
    (upper         . upper)
    (upper-case    . upper)
    (word          . word)
    (wordchar      . word)
    (unibyte       . unibyte)
    (multibyte     . multibyte))
  "Alist mapping rx symbols to character classes.
Most of the names are from SRE.")