Function: puny-highly-restrictive-string-p

puny-highly-restrictive-string-p is a byte-compiled function defined in puny.el.gz.

Signature

(puny-highly-restrictive-string-p STRING)

Documentation

Say whether STRING is "highly restrictive" in the Unicode IDNA sense.

See https://www.unicode.org/reports/tr39/#Restriction_Level_Detection for details. The main idea is that mixing scripts (such as Latin and Cyrillic) can confuse the user through homographs, that is, characters from different scripts that look alike.
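
As a minimal usage sketch, the predicate can be evaluated on a single-script string and on a string that mixes Latin with one Cyrillic letter. The sample strings are illustrative assumptions, not taken from puny.el; the Cyrillic letter is written as the escape \u0430 so that it is visible in the source.

```elisp
(require 'puny)

;; Every letter is Latin; FULL STOP is on the always-allowed list.
(puny-highly-restrictive-string-p "example.org")   ; → t

;; U+0430 CYRILLIC SMALL LETTER A mixed into otherwise Latin text.
(puny-highly-restrictive-string-p "p\u0430ypal")   ; → nil
```

The second call returns nil because the string uses two scripts and no permitted script combination contains `cyrillic`.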

Source Code

;; Defined in /usr/src/emacs/lisp/net/puny.el.gz
;; https://www.unicode.org/reports/tr39/#Restriction_Level_Detection
;; https://www.unicode.org/reports/tr31/#Table_Candidate_Characters_for_Inclusion_in_Identifiers

(defun puny-highly-restrictive-string-p (string)
  "Say whether STRING is \"highly restrictive\" in the Unicode IDNA sense.
See https://www.unicode.org/reports/tr39/#Restriction_Level_Detection
for details.  The main idea is that if you're mixing
scripts (like latin and cyrillic), you may confuse the user by
using homographs."
  (let ((scripts
         (delq
          t
          (seq-uniq
           (seq-map (lambda (char)
                      (if (memq char
                                ;; These characters are always allowed
                                ;; in any string.
                                '(#x0027 ; APOSTROPHE
                                  #x002D ; HYPHEN-MINUS
                                  #x002E ; FULL STOP
                                  #x003A ; COLON
                                  #x00B7 ; MIDDLE DOT
                                  #x058A ; ARMENIAN HYPHEN
                                  #x05F3 ; HEBREW PUNCTUATION GERESH
                                  #x05F4 ; HEBREW PUNCTUATION GERSHAYIM
                                  #x0F0B ; TIBETAN MARK INTERSYLLABIC TSHEG
                                  #x200C ; ZERO WIDTH NON-JOINER*
                                  #x200D ; ZERO WIDTH JOINER*
                                  #x2010 ; HYPHEN
                                  #x2019 ; RIGHT SINGLE QUOTATION MARK
                                  #x2027 ; HYPHENATION POINT
                                  #x30A0 ; KATAKANA-HIRAGANA DOUBLE HYPHEN
                                  #x30FB)) ; KATAKANA MIDDLE DOT
                          t
                        (aref char-script-table char)))
                    string)))))
    (or
     ;; Every character uses the same script.
     (= (length scripts) 1)
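     ;; Otherwise, every script in use must belong to a single one of
     ;; the permitted script combinations listed below.
     (seq-some #'identity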
               (mapcar (lambda (list)
                         (seq-every-p (lambda (script)
                                        (memq script list))
                                      scripts))
                       '((latin han hiragana kana)
                         (latin han bopomofo)
                         (latin han hangul)))))))
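
To see why a given string passes or fails, it can help to inspect the scripts that `char-script-table` assigns to its characters. The helper below is a hypothetical sketch, not part of puny.el: the name `my-string-scripts` is an assumption, and unlike the real function it does not skip the always-allowed punctuation characters.

```elisp
(require 'seq)

(defun my-string-scripts (string)
  "Return the distinct scripts of the characters in STRING.
Hypothetical helper for illustration only; not part of puny.el."
  (seq-uniq
   (seq-map (lambda (char)
              ;; Look up the script symbol for CHAR.
              (aref char-script-table char))
            string)))

;; Latin letters plus U+0430 CYRILLIC SMALL LETTER A.
(my-string-scripts "p\u0430ypal")
```

For this input the result should contain both `latin` and `cyrillic`, which is the two-script situation the predicate rejects.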