Function: char-fold-to-regexp
char-fold-to-regexp is an autoloaded and byte-compiled function
defined in char-fold.el.gz.
Signature
(char-fold-to-regexp STRING &optional LAX FROM)
Documentation
Return a regexp matching anything that char-folds into STRING.
Any character in STRING that has an entry in
char-fold-table is replaced with that entry (which is a
regexp) and other characters are regexp-quoted.
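A minimal sketch of how the returned regexp might be used, assuming an Emacs version in which the function has this name and an uncustomized `char-fold-table`. The helper `my-char-fold-match-p` is hypothetical and not part of char-fold.el. The exact regexp text varies between Emacs versions, so only match results are shown.

```elisp
(require 'char-fold)

(defun my-char-fold-match-p (needle haystack)
  "Return non-nil if HAYSTACK contains text that char-folds into NEEDLE.
Hypothetical helper for illustration; not part of char-fold.el."
  (let ((case-fold-search nil))
    (string-match-p (char-fold-to-regexp needle) haystack)))

;; The precomposed letter U+00E9 folds into plain "e".
(my-char-fold-match-p "cafe" "caf\u00e9")   ; non-nil
;; "X" does not fold into "e", so there is no match.
(my-char-fold-match-p "cafe" "cafX")        ; nil
```

The folding runs in one direction by default: the plain string typed by the user matches the decorated text, which is what isearch needs.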
When LAX is non-nil, the final character also matches ligatures partially. For instance, the search string "f" will match "ﬁ", so when typing the search string in isearch while the cursor is on a ligature, the search won't immediately advance to the next complete match but will stay on the partially matched ligature.
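The effect of LAX can be sketched as follows, again assuming an uncustomized `char-fold-table`. The function name `my-ligature-demo` is hypothetical. The ligature is written as the escape `\ufb01` (LATIN SMALL LIGATURE FI) so that it cannot be confused with the two letters "f" and "i".

```elisp
(require 'char-fold)

(defun my-ligature-demo ()
  "Return match results for the U+FB01 ligature.
Hypothetical helper for illustration; not part of char-fold.el."
  (let ((ligature "\ufb01"))
    (list
     ;; With LAX non-nil, a trailing "f" partially matches the ligature.
     (string-match-p (char-fold-to-regexp "f" t) ligature)
     ;; The complete string "fi" matches the ligature without LAX.
     (string-match-p (char-fold-to-regexp "fi") ligature))))

(my-ligature-demo)   ; both elements are expected to be non-nil
```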
If the resulting regexp would be too long for Emacs to handle,
just return the result of calling regexp-quote on STRING.
FROM is for internal use. It specifies an index in the STRING from which to start.
Probably introduced at or before Emacs version 25.1.
Source Code
;; Defined in /usr/src/emacs/lisp/char-fold.el.gz
;;;###autoload
(defun char-fold-to-regexp (string &optional lax from)
  "Return a regexp matching anything that char-folds into STRING.
Any character in STRING that has an entry in
`char-fold-table' is replaced with that entry (which is a
regexp) and other characters are `regexp-quote'd.
When LAX is non-nil, then the final character also matches ligatures
partially, for instance, the search string \"f\" will match \"ﬁ\",
so when typing the search string in isearch while the cursor is on
a ligature, the search won't try to immediately advance to the next
complete match, but will stay on the partially matched ligature.
If the resulting regexp would be too long for Emacs to handle,
just return the result of calling `regexp-quote' on STRING.
FROM is for internal use. It specifies an index in the STRING
from which to start."
  (let* ((spaces 0)
         (multi-char-table (char-table-extra-slot char-fold-table 0))
         (i (or from 0))
         (end (length string))
         (out nil))
    ;; When the user types a space, we want to match the table entry
    ;; for ?\s, which is generally a regexp like "[ ...]". However,
    ;; the `search-spaces-regexp' variable doesn't "see" spaces inside
    ;; these regexp constructs, so we need to use "\\( \\|[ ...]\\)"
    ;; instead (to manually expose a space). Furthermore, the lax
    ;; search engine acts on a bunch of spaces, not on individual
    ;; spaces, so if the string contains sequential spaces like "  ", we
    ;; need to keep them grouped together like this: "\\(  \\|[ ...][ ...]\\)".
    (while (< i end)
      (pcase (aref string i)
        (?\s (setq spaces (1+ spaces)))
        ((pred (lambda (c) (and char-fold-symmetric
                                (if isearch-regexp
                                    isearch-regexp-lax-whitespace
                                  isearch-lax-whitespace)
                                (stringp search-whitespace-regexp)
                                (string-match-p search-whitespace-regexp (char-to-string c)))))
         (setq spaces (1+ spaces)))
        (c (when (> spaces 0)
             (push (char-fold--make-space-string spaces) out)
             (setq spaces 0))
           (let ((regexp (or (aref char-fold-table c)
                             (regexp-quote (string c))))
                 ;; Long string. The regexp would probably be too long.
                 (alist (unless (> end 50)
                          (aref multi-char-table c))))
             (push (if (and lax alist (= (1+ i) end))
                       (concat "\\(?:" regexp "\\|"
                               (mapconcat (lambda (entry)
                                            (cdr entry)) alist "\\|") "\\)")
                     (let ((matched-entries nil)
                           (max-length 0))
                       (dolist (entry alist)
                         (let* ((suffix (car entry))
                                (len-suf (length suffix)))
                           (when (eq (compare-strings suffix 0 nil
                                                      string (1+ i) (+ i 1 len-suf)
                                                      nil)
                                     t)
                             (push (cons len-suf (cdr entry)) matched-entries)
                             (setq max-length (max max-length len-suf)))))
                       ;; If no suffixes matched, just go on.
                       (if (not matched-entries)
                           regexp
;;; If N suffixes match, we "branch" out into N+1 executions for the
;;; length of the longest match. This means "fix" will match "ﬁx" but
;;; not "fⅸ", but it's necessary to keep the regexp size from scaling
;;; exponentially. See https://lists.gnu.org/r/emacs-devel/2015-11/msg02562.html
                         (let ((subs (substring string (1+ i) (+ i 1 max-length))))
                           ;; `i' is still going to inc by 1 below.
                           (setq i (+ i max-length))
                           (concat
                            "\\(?:"
                            (mapconcat (lambda (entry)
                                         (let ((length (car entry))
                                               (suffix-regexp (cdr entry)))
                                           (concat suffix-regexp
                                                   (char-fold-to-regexp subs nil length))))
                                       `((0 . ,regexp) . ,matched-entries) "\\|")
                            "\\)")))))
                   out))))
      (setq i (1+ i)))
    (when (> spaces 0)
      (push (char-fold--make-space-string spaces) out))
    (apply #'concat (nreverse out))))
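
The comment in the source about keeping consecutive spaces grouped can be exercised with a short sketch. It assumes an uncustomized `char-fold-table` in which NO-BREAK SPACE (U+00A0) folds into an ordinary space. The function name `my-space-fold-demo` is hypothetical.

```elisp
(require 'char-fold)

(defun my-space-fold-demo ()
  "Return match results for a search string with two consecutive spaces.
Hypothetical helper for illustration; not part of char-fold.el."
  (let ((re (char-fold-to-regexp "a  b")))
    (list
     ;; Two literal spaces match the first alternative of the group.
     (string-match-p re "a  b")
     ;; Two NO-BREAK SPACEs are expected to match the folded alternative.
     (string-match-p re "a\u00a0\u00a0b"))))

(my-space-fold-demo)   ; both elements are expected to be non-nil
```

Outside isearch, `string-match-p` does not consult `search-spaces-regexp`, so this only shows the folding itself, not the lax-whitespace behavior the comment describes.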