Function: sgml-html-meta-auto-coding-function

sgml-html-meta-auto-coding-function is a byte-compiled function defined in mule.el.gz.

Signature

(sgml-html-meta-auto-coding-function SIZE)

Documentation

If the buffer has an HTML meta tag, use it to determine encoding.

This function is intended to be added to auto-coding-functions.
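
A minimal sketch of the registration described above, as it might appear in an init file. It assumes nothing beyond the variable and function named in this page; in a stock Emacs the function is normally already in the default value of auto-coding-functions, in which case the form below changes nothing.

```elisp
;; Sketch only: make sure the function is consulted when Emacs decides
;; how to decode a file.  `add-to-list' does nothing if the element is
;; already present in the list.
(add-to-list 'auto-coding-functions
             #'sgml-html-meta-auto-coding-function)
```

Each function in auto-coding-functions is called with one argument, SIZE, with point at the start of the text to examine. It returns a coding system, or nil to let other mechanisms decide.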

Source Code

;; Defined in /usr/src/emacs/lisp/international/mule.el.gz
(defun sgml-html-meta-auto-coding-function (size)
  "If the buffer has an HTML meta tag, use it to determine encoding.
This function is intended to be added to `auto-coding-functions'."
  (let ((case-fold-search t))
    (setq size (min (+ (point) size)
		    (save-excursion
		      ;; Limit the search by the end of the HTML header.
		      (or (search-forward "</head>" (+ (point) size) t)
			  ;; In case of no header, search only 10 lines.
			  (forward-line 10))
		      (point))))
    ;; Make sure that the buffer really contains an HTML document, by
    ;; checking that it starts with a doctype or a <HTML> start tag
    ;; (allowing for whitespace at bob).  Note: 'DOCTYPE NETSCAPE' is
    ;; useful for Mozilla bookmark files.
    (when (and (re-search-forward "\\`[[:space:]\n]*\\(<!doctype[[:space:]\n]+\\(html\\|netscape\\)\\|<html\\)" size t)
	       (re-search-forward "<meta\\s-+\\(http-equiv=[\"']?content-type[\"']?\\s-+content=[\"']text/\\sw+;\\s-*\\)?charset=[\"']?\\(.+?\\)[\"'[:space:]/>]" size t))
      (let* ((match (match-string 2))
	     (sym (intern (downcase match))))
	(if (coding-system-p sym)
            ;; If the encoding tag is UTF-8 and the buffer's
            ;; encoding is one of the variants of UTF-8, use the
            ;; buffer's encoding.  This allows, e.g., saving an
            ;; HTML file as UTF-8 with BOM when the tag says UTF-8.
            (let ((sym-type (coding-system-type sym))
                  (bfcs-type
                   (coding-system-type buffer-file-coding-system)))
              (if (and enable-multibyte-characters
                       ;; 'charset' and 'iso-2022' will signal an error
                       ;; in coding-system-equal, since they aren't
                       ;; coding-systems.  So test that up front.
                       (not (equal sym-type 'charset))
                       (not (equal bfcs-type 'charset))
                       (not (equal sym-type 'iso-2022))
                       (not (equal bfcs-type 'iso-2022))
                       (coding-system-equal 'utf-8 sym-type)
                       (coding-system-equal 'utf-8 bfcs-type))
                  buffer-file-coding-system
		sym))
	  (message "Warning: unknown coding system \"%s\"" match)
	  nil)))))
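
Usage sketch

The following is an illustrative way to call the function by hand on a string, outside of file visiting. The helper my/html-meta-coding is hypothetical and not part of Emacs; it only sets up the state the source above relies on: point at the beginning of the text, and SIZE equal to the number of characters to scan.

```elisp
;; `my/html-meta-coding' is a hypothetical helper, not part of Emacs.
(defun my/html-meta-coding (html)
  "Return the coding system selected by the meta tag in string HTML, or nil."
  (with-temp-buffer
    (insert html)
    ;; The function searches forward from point, so start at the beginning.
    (goto-char (point-min))
    (sgml-html-meta-auto-coding-function (buffer-size))))

;; A document that starts with a doctype and declares a charset.
(my/html-meta-coding
 "<!DOCTYPE html>\n<html><head>\n<meta charset=\"iso-8859-1\">\n</head></html>\n")

;; Text with no doctype and no <html> start tag: the first check fails.
(my/html-meta-coding "<p>charset=utf-8</p>\n")
```

Following the source above, the first call should return the symbol iso-8859-1 and the second should return nil. For a tag that names UTF-8, the result may instead be the value of buffer-file-coding-system when that is itself a UTF-8 variant, so the result in that case depends on the session.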