Function: filepos-to-bufferpos
filepos-to-bufferpos is an autoloaded and byte-compiled function
defined in mule-util.el.gz.
Signature
(filepos-to-bufferpos BYTE &optional QUALITY CODING-SYSTEM)
Documentation
Try to return the buffer position corresponding to a particular file position.
The file position is given as a (0-based) BYTE count.
The function presumes the file is encoded with CODING-SYSTEM, which defaults
to buffer-file-coding-system.
QUALITY can be:
  approximate, in which case we may cut some corners to avoid
    excessive work.
  exact, in which case we may end up re-(en/de)coding a large
    part of the file/buffer, this can be expensive and slow. (It
    is an error to request the exact method when the buffer's
    EOL format is not yet decided.)
  nil, in which case we may return nil rather than an approximation.
Probably introduced at or before Emacs version 25.1.
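As a rough illustration of how the 0-based BYTE count maps to a buffer position, the sketch below calls the function on a temporary buffer. The helper name `my-demo-filepos` is hypothetical and not part of Emacs. The coding system is passed explicitly so the result does not depend on the temporary buffer's own buffer-file-coding-system.

```elisp
(require 'mule-util)

(defun my-demo-filepos (text byte coding)
  "Hypothetical helper, not part of Emacs.
Insert TEXT into a temporary buffer and return the buffer position
that `filepos-to-bufferpos' reports for file offset BYTE under CODING."
  (with-temp-buffer
    (insert text)
    (filepos-to-bufferpos byte 'approximate coding)))

;; In UTF-8, "h" is 1 byte and "é" is 2 bytes, so file byte 3 is the
;; first "l", which is the third character of the buffer.
(my-demo-filepos "héllo\n" 3 'utf-8-unix)        ; => 3

;; In Latin-1 every character is 1 byte, so file byte 3 is the fourth
;; character (the second "l").
(my-demo-filepos "héllo\n" 3 'iso-latin-1-unix)  ; => 4
```

Both calls use Unix line endings, so no CRLF adjustment is involved. With a DOS EOL coding system the function additionally has to account for the CR bytes that are present in the file but not in the buffer.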
Source Code
;; Defined in /usr/src/emacs/lisp/international/mule-util.el.gz
;;;###autoload
(defun filepos-to-bufferpos (byte &optional quality coding-system)
  "Try to return the buffer position corresponding to a particular file position.
The file position is given as a (0-based) BYTE count.
The function presumes the file is encoded with CODING-SYSTEM, which defaults
to `buffer-file-coding-system'.
QUALITY can be:
  `approximate', in which case we may cut some corners to avoid
    excessive work.
  `exact', in which case we may end up re-(en/de)coding a large
    part of the file/buffer, this can be expensive and slow. (It
    is an error to request the `exact' method when the buffer's
    EOL format is not yet decided.)
  nil, in which case we may return nil rather than an approximation."
  (unless coding-system (setq coding-system buffer-file-coding-system))
  (let ((eol (coding-system-eol-type coding-system))
        (type (coding-system-type coding-system))
        (base (coding-system-base coding-system))
        (pm (save-restriction (widen) (point-min))))
    ;; Handle EOL edge cases.
    (unless (numberp eol)
      (if (eq quality 'exact)
          (error "Unknown EOL format in coding system: %s" coding-system)
        (setq eol 0)))
    (and (eq type 'utf-8)
         ;; Any post-read/pre-write conversions mean it's not really UTF-8.
         (not (null (coding-system-get coding-system :post-read-conversion)))
         (setq type 'not-utf-8))
    (and (memq type '(charset raw-text undecided))
         ;; The following are all of type 'charset', but they are
         ;; actually variable-width encodings.
         (not (memq base '(chinese-gbk chinese-gb18030 euc-tw euc-jis-2004
                           korean-iso-8bit chinese-iso-8bit
                           japanese-iso-8bit chinese-big5-hkscs
                           japanese-cp932 korean-cp949)))
         (setq type 'single-byte))
    (pcase type
      ('utf-8
       (when (coding-system-get coding-system :bom)
         (setq byte (max 0 (- byte 3))))
       (if (= eol 1)
           (filepos-to-bufferpos--dos (+ pm byte) #'byte-to-position)
         (byte-to-position (+ pm byte))))
      ('single-byte
       (if (= eol 1)
           (filepos-to-bufferpos--dos (+ pm byte) #'identity)
         (+ pm byte)))
      ((and 'utf-16
            ;; FIXME: For utf-16, we could use the same approach as used for
            ;; dos EOLs (counting the number of non-BMP chars instead of the
            ;; number of lines).
            (guard (not (eq quality 'exact))))
       ;; Account for BOM, which is always 2 bytes in UTF-16.
       (when (coding-system-get coding-system :bom)
         (setq byte (max 0 (- byte 2))))
       ;; In approximate mode, assume all characters are within the
       ;; BMP, i.e. take up 2 bytes.
       (setq byte (/ byte 2))
       (if (= eol 1)
           (filepos-to-bufferpos--dos (+ pm byte) #'identity)
         (+ pm byte)))
      (_
       (pcase quality
         ('approximate (byte-to-position (+ pm byte)))
         ('exact
          ;; Rather than assume that the file exists and still holds the right
          ;; data, we reconstruct it based on the buffer's content.
          (let ((buf (current-buffer)))
            (with-temp-buffer
              (set-buffer-multibyte nil)
              (let ((tmp-buf (current-buffer)))
                (with-current-buffer buf
                  (save-restriction
                    (widen)
                    ;; Since encoding should always return more bytes than
                    ;; there were chars, encoding all chars up to (+ byte pm)
                    ;; guarantees the encoded result has at least `byte' bytes.
                    (encode-coding-region pm (min (point-max) (+ pm byte))
                                          coding-system tmp-buf)))
                (+ pm (length
                       (decode-coding-region (point-min)
                                             (min (point-max) (+ pm byte))
                                             coding-system t))))))))))))
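The arithmetic in the utf-16 branch above can be isolated as a small pure function. This is only a sketch of that one branch, assuming Unix line endings. The name `my-utf16-approx-bufpos` and its HAS-BOM and PM parameters are hypothetical stand-ins for the `:bom` property lookup and the widened `point-min` used in the real code.

```elisp
(defun my-utf16-approx-bufpos (byte has-bom pm)
  "Hypothetical sketch of the approximate UTF-16 mapping, not part of Emacs.
BYTE is a 0-based file offset, HAS-BOM is non-nil if the file starts
with a 2-byte byte order mark, and PM is the buffer's first position."
  ;; Skip the BOM, which occupies the first 2 bytes of the file.
  (when has-bom
    (setq byte (max 0 (- byte 2))))
  ;; Assume every character is in the BMP and so takes exactly 2 bytes.
  (+ pm (/ byte 2)))

;; File offset 10 with a BOM: (10 - 2) / 2 = 4 characters precede it.
(my-utf16-approx-bufpos 10 t 1)    ; => 5

;; An offset inside the BOM is clamped to the start of the buffer.
(my-utf16-approx-bufpos 1 t 1)     ; => 1
```

The approximation overshoots when the text before the offset contains characters outside the BMP, because each of those takes 4 bytes in UTF-16 but is a single buffer character. For a BOM-less file containing "a😀b", the letter "b" starts at file byte 6 and the sketch returns 4, while "b" is actually at buffer position 3. This is the gap that the FIXME comment in the source refers to.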