Function: xml-find-file-coding-system
xml-find-file-coding-system is a byte-compiled function defined in
mule.el.gz.
Signature
(xml-find-file-coding-system ARGS)
Documentation
Determine the coding system of an XML file without a declaration.
Strictly speaking, the file should be utf-8, but mistakes are made, and there are genuine cases where XML fragments are saved, with the encoding properly specified in a master document, or added by processing software.
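For context, a function of this kind is normally reached through `file-coding-system-alist`, whose entries may map a file-name regexp to a function symbol that receives the operation's argument list. A minimal sketch of applying the same detection to saved XML fragments, assuming they carry a hypothetical `.xmlfrag` extension (the extension is invented here for illustration):

```elisp
;; Hypothetical: `.xmlfrag' is an invented extension, not an Emacs default.
;; Files matching the regexp will have their coding system chosen by
;; `xml-find-file-coding-system' when they are read.
(add-to-list 'file-coding-system-alist
             '("\\.xmlfrag\\'" . xml-find-file-coding-system))
```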
Source Code
;; Defined in /usr/src/emacs/lisp/international/mule.el.gz
(defun xml-find-file-coding-system (args)
  "Determine the coding system of an XML file without a declaration.
Strictly speaking, the file should be utf-8, but mistakes are
made, and there are genuine cases where XML fragments are saved,
with the encoding properly specified in a master document, or
added by processing software."
  (if (eq (car args) 'insert-file-contents)
      (let ((detected
             (with-coding-priority '(utf-8)
               (coding-system-base
                (detect-coding-region (point-min) (point-max) t))))
            (bom (list (char-after 1) (char-after 2))))
        (cond
         ((equal bom '(#xFE #xFF))
          'utf-16be-with-signature)
         ((equal bom '(#xFF #xFE))
          'utf-16le-with-signature)
         ;; Pure ASCII always comes back as undecided.
         ((memq detected '(utf-8 undecided))
          'utf-8)
         ((eq detected 'utf-16le-with-signature) 'utf-16le-with-signature)
         ((eq detected 'utf-16be-with-signature) 'utf-16be-with-signature)
         (t
          (warn "File contents detected as %s.
Consider adding an xml declaration with the encoding specified,
or saving as utf-8, as mandated by the xml specification." detected)
          detected)))
    ;; Don't interfere with the user's wishes for saving the buffer.
    ;; We did what we could when the buffer was created to ensure the
    ;; correct encoding was used, or the user was warned, so any
    ;; non-conformity here is deliberate on the part of the user.
    'undecided))
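
The branches above can be exercised by hand. The following is a minimal sketch, not part of Emacs: `my-xml-coding-for-bytes` is a hypothetical helper that places raw bytes in a unibyte temporary buffer and calls the function as if `insert-file-contents` had just read them. The file name "fragment.xml" is likewise invented; the function only inspects the first element of ARGS and the current buffer.

```elisp
;; Hypothetical helper for illustration; not defined anywhere in Emacs.
(defun my-xml-coding-for-bytes (bytes)
  "Return what `xml-find-file-coding-system' picks for BYTES.
BYTES is a list of integers in the range 0..255."
  (with-temp-buffer
    ;; Keep the buffer unibyte so the bytes stay undecoded, as they
    ;; would be while the coding system is still being chosen.
    (set-buffer-multibyte nil)
    (insert (apply #'unibyte-string bytes))
    (xml-find-file-coding-system
     (list 'insert-file-contents "fragment.xml"))))

;; A UTF-16 byte order mark is matched before anything else.
(my-xml-coding-for-bytes '(#xFF #xFE ?< 0 ?a 0 ?/ 0 ?> 0))
;; => utf-16le-with-signature
(my-xml-coding-for-bytes '(#xFE #xFF 0 ?< 0 ?a 0 ?/ 0 ?>))
;; => utf-16be-with-signature

;; Pure ASCII: per the comment in the source, detection reports
;; `undecided', which the function maps to `utf-8'.
(my-xml-coding-for-bytes (string-to-list "<a/>"))

;; Any operation other than `insert-file-contents' falls through to
;; the final branch, leaving the choice to the user's settings.
(xml-find-file-coding-system '(write-region 1 10 "fragment.xml"))
;; => undecided
```

Because the BOM check comes first in the `cond`, a file starting with FE FF or FF FE is treated as UTF-16 regardless of what `detect-coding-region` reports for the rest of its contents.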