Function: encode-coding-char

encode-coding-char is a byte-compiled function defined in mule-cmds.el.gz.

Signature

(encode-coding-char CHAR CODING-SYSTEM &optional CHARSET)

Documentation

Encode CHAR by CODING-SYSTEM and return the resulting string of bytes.

If CODING-SYSTEM can't safely encode CHAR, return nil. The optional third argument CHARSET, if non-nil, is the charset to prefer when encoding.
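
A minimal usage sketch, assuming a stock Emacs in which the utf-8 and iso-8859-1 coding systems are available. The helper my/char-bytes is hypothetical (it is not part of Emacs); it only turns the returned unibyte string into a list of byte values so the result is easier to read.

```elisp
;; Hypothetical helper (not part of Emacs): return the encoded bytes of
;; CHAR as a list of integers, or nil if CODING-SYSTEM cannot encode it.
(defun my/char-bytes (char coding-system)
  (let ((bytes (encode-coding-char char coding-system)))
    ;; `append' with a nil tail converts a string to a list of its elements.
    (and bytes (append bytes nil))))

(my/char-bytes ?a 'utf-8)            ; ASCII character: (97)
(my/char-bytes ?\u00e9 'utf-8)       ; e-acute in UTF-8: (195 169)
(my/char-bytes ?\u00e9 'iso-8859-1)  ; e-acute in Latin-1: (233)
(my/char-bytes ?\u3042 'iso-8859-1)  ; Hiragana A is not in Latin-1: nil
```

Checking for nil before using the result matters, because the function signals "cannot encode" through its return value rather than through an error.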

Source Code

;; Defined in /usr/src/emacs/lisp/international/mule-cmds.el.gz
(defun encode-coding-char (char coding-system &optional charset)
  "Encode CHAR by CODING-SYSTEM and return the resulting string of bytes.
If CODING-SYSTEM can't safely encode CHAR, return nil.
The optional third argument CHARSET, if non-nil, is the charset to
prefer when encoding."
  (let* ((str1 (string char))
	 (str2 (string char char))
	 (found (find-coding-systems-string str1))
         (bom-p (coding-system-get coding-system :bom))
	 enc1 enc2 i0 i1 i2)
    ;; If CHAR is ASCII and CODING-SYSTEM doesn't prepend a BOM, just
    ;; encode CHAR.
    (if (and (eq (car-safe found) 'undecided)
             (null bom-p))
	(encode-coding-string str1 coding-system)
      (when (or (eq (car-safe found) 'undecided)
                (memq (coding-system-base coding-system) found))
	;; We must find the encoded string of CHAR.  But just encoding
	;; CHAR will put extra control sequences (usually to designate
	;; the ASCII charset) at the tail if CODING-SYSTEM is of type
	;; ISO 2022.  To exclude such trailing bytes, we first encode a
	;; one-char string and a two-char string, then check how many
	;; bytes at the tail of both encoded strings are the same.

	(when charset
	  (put-text-property 0 1 'charset charset str1)
	  (put-text-property 0 2 'charset charset str2))
	(setq enc1 (encode-coding-string str1 coding-system)
	      i1 (length enc1)
	      enc2 (encode-coding-string str2 coding-system)
	      i2 (length enc2))
	(while (and (> i1 0) (= (aref enc1 (1- i1)) (aref enc2 (1- i2))))
	  (setq i1 (1- i1) i2 (1- i2)))

	;; Now (substring enc1 i1) and (substring enc2 i2) are the same,
	;; and they are the extra control sequences at the tail to
	;; exclude.

        ;; We also need to exclude the leading 2 or 3 bytes if they
        ;; come from a BOM.
        (setq i0
              (if bom-p
                  (cond
                   ((eq (coding-system-type coding-system) 'utf-8)
                    3)
                   ((eq (coding-system-type coding-system) 'utf-16)
                    2)
                   (t 0))
                0))
	(substring enc2 i0 i2)))))
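
The tail-trimming step in the source above can be observed by comparing the result with a plain call to encode-coding-string. The sketch below assumes the iso-2022-jp coding system is available and that it appends an escape sequence switching back to ASCII after a Japanese character. The helper my/encoding-tail is hypothetical and exists only for illustration.

```elisp
;; Hypothetical helper (not part of Emacs): return the bytes that
;; `encode-coding-string' emits after the bytes `encode-coding-char'
;; returns for CHAR, or nil if CHAR cannot be encoded or the two
;; results do not share a common prefix.
(defun my/encoding-tail (char coding-system)
  (let ((full (encode-coding-string (string char) coding-system))
        (bare (encode-coding-char char coding-system)))
    (and bare
         (string-prefix-p bare full)
         (substring full (length bare)))))

;; For an ISO 2022 coding system the tail is expected to be non-empty
;; (typically the escape sequence that re-designates ASCII).
(my/encoding-tail ?\u3042 'iso-2022-jp)

;; For UTF-8 without a BOM there is nothing to trim, so the tail is "".
(my/encoding-tail ?\u00e9 'utf-8)
```

A non-empty tail is exactly what a caller would not want when asking for the bytes of a single character, which is why the function compares the one-character and two-character encodings instead of returning the one-character encoding directly.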