Function: bufferpos-to-filepos

bufferpos-to-filepos is an autoloaded and byte-compiled function defined in mule-util.el.gz.

Signature

(bufferpos-to-filepos POSITION &optional QUALITY CODING-SYSTEM)

Documentation

Try to return the file byte corresponding to a particular buffer POSITION.

Value is the file position given as a (0-based) byte count. The function presumes the file is encoded with CODING-SYSTEM, which defaults to buffer-file-coding-system.

QUALITY can be:
  approximate, in which case we may cut some corners to avoid
    excessive work.
  exact, in which case we may end up re-(en/de)coding a large
    part of the file/buffer, this can be expensive and slow. (It
    is an error to request the exact method when the buffer's
    EOL format is not yet decided.)
  nil, in which case we may return nil rather than an approximation.
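
As an illustration of the arguments described above, here is a minimal sketch that asks for the file offset of one buffer position under LF and CRLF line endings. The helper name my/filepos-demo and the sample text are hypothetical and are not part of mule-util.el; the coding systems utf-8-unix and utf-8-dos are passed explicitly so that the EOL format is already decided.

```elisp
(require 'mule-util)

(defun my/filepos-demo ()
  "Hypothetical demo: return file offsets for one buffer position.
The buffer text and the name of this function are illustrative only."
  (with-temp-buffer
    ;; "h\u00e9llo" is h, e-acute, l, l, o; e-acute takes 2 bytes in UTF-8.
    (insert "h\u00e9llo\nw\u00f6rld\n")
    (let ((pos 7))                      ; start of the second line
      (list
       ;; LF line endings: the 7 bytes of the first line precede POS.
       (bufferpos-to-filepos pos 'approximate 'utf-8-unix)
       ;; CRLF line endings: one extra CR byte for the preceding line.
       (bufferpos-to-filepos pos 'approximate 'utf-8-dos)
       ;; Explicit `exact' request; for plain UTF-8 the same arithmetic
       ;; is used regardless of QUALITY.
       (bufferpos-to-filepos pos 'exact 'utf-8-dos)))))

(my/filepos-demo)  ; with the source listed on this page: (7 8 8)
```

Passing a coding system whose EOL type is undecided (for example plain utf-8) together with exact would signal the error mentioned in the docstring, which is why the sketch uses the -unix and -dos variants.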


Probably introduced at or before Emacs version 25.1.

Source Code

;; Defined in /usr/src/emacs/lisp/international/mule-util.el.gz
;;;###autoload
(defun bufferpos-to-filepos (position &optional quality coding-system)
  "Try to return the file byte corresponding to a particular buffer POSITION.
Value is the file position given as a (0-based) byte count.
The function presumes the file is encoded with CODING-SYSTEM, which defaults
to `buffer-file-coding-system'.
QUALITY can be:
  `approximate', in which case we may cut some corners to avoid
    excessive work.
  `exact', in which case we may end up re-(en/de)coding a large
    part of the file/buffer, this can be expensive and slow.  (It
    is an error to request the `exact' method when the buffer's
    EOL format is not yet decided.)
  nil, in which case we may return nil rather than an approximation."
  (unless coding-system (setq coding-system buffer-file-coding-system))
  (let* ((eol (coding-system-eol-type coding-system))
         (type (coding-system-type coding-system))
         (base (coding-system-base coding-system))
         (point-min 1)                  ;Clarify what the `1' means.
         lineno)
    ;; Handle EOL edge cases.
    (unless (numberp eol)
      (if (eq quality 'exact)
          (error "Unknown EOL format in coding system: %s" coding-system)
        (setq eol 0)))
    (setq lineno (if (= eol 1)
                     (1- (line-number-at-pos position))
                   0))
    (and (eq type 'utf-8)
         ;; Any post-read/pre-write conversions mean it's not really UTF-8.
         (not (null (coding-system-get coding-system :pre-write-conversion)))
         (setq type 'not-utf-8))
    (and (memq type '(charset raw-text undecided))
         ;; The following are all of type 'charset', but they are
         ;; actually variable-width encodings.
         (not (memq base '(chinese-gbk chinese-gb18030 euc-tw euc-jis-2004
                                       korean-iso-8bit chinese-iso-8bit
                                       japanese-iso-8bit chinese-big5-hkscs
                                       japanese-cp932 korean-cp949)))
         (setq type 'single-byte))
    (pcase type
      ('utf-8
       (+ (or (position-bytes position)
              (if (<= position 0)
                  point-min
                (position-bytes (point-max))))
          ;; Account for BOM, if any.
          (if (coding-system-get coding-system :bom) 3 0)
          ;; Account for CR in CRLF pairs.
          lineno
          (- point-min)))
      ('single-byte
       (+ position (- point-min) lineno))
      ((and 'utf-16
            ;; FIXME: For utf-16, we could use the same approach as used for
            ;; dos EOLs (counting the number of non-BMP chars instead of the
            ;; number of lines).
            (guard (not (eq quality 'exact))))
       ;; In approximate mode, assume all characters are within the
       ;; BMP, i.e. each one takes up 2 bytes.
       (+ (* (- position point-min) 2)
          ;; Account for BOM, if any.
          (if (coding-system-get coding-system :bom) 2 0)
          ;; Account for CR in CRLF pairs.
          lineno))
      (_
       (pcase quality
         ('approximate (+ (position-bytes position) (- point-min) lineno))
         ('exact
          ;; Rather than assume that the file exists and still holds the right
          ;; data, we reconstruct its relevant portion.
          (let ((buf (current-buffer)))
            (with-temp-buffer
              (set-buffer-multibyte nil)
              (let ((tmp-buf (current-buffer)))
                (with-current-buffer buf
                  (save-restriction
                    (widen)
                    (encode-coding-region (point-min) (min (point-max) position)
                                          coding-system tmp-buf)))
                (buffer-size))))))))))
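
The difference between the approximate and exact branches for UTF-16 can be seen with a character outside the BMP. The following is a sketch only: my/filepos-by-encoding and my/utf16-drift-demo are hypothetical helpers, not part of Emacs. The first one mimics the idea of the exact branch (encode the text before POSITION and count the bytes) using encode-coding-string, which is a different call from the encode-coding-region used in the source above.

```elisp
(require 'mule-util)

(defun my/filepos-by-encoding (position coding-system)
  "Hypothetical cross-check: bytes produced by encoding text before POSITION."
  (save-restriction
    (widen)
    ;; `encode-coding-string' returns a unibyte string, so `length'
    ;; counts bytes here.
    (length (encode-coding-string
             (buffer-substring-no-properties
              (point-min) (min (point-max) position))
             coding-system))))

(defun my/utf16-drift-demo ()
  "Hypothetical demo: compare approximate and exact offsets in UTF-16."
  (with-temp-buffer
    ;; U+1F600 lies outside the BMP, so UTF-16 stores it as a surrogate
    ;; pair (4 bytes) rather than the 2 bytes the approximation assumes.
    (insert "a\U0001F600b")
    (let ((pos 3))                      ; just after the non-BMP character
      (list (bufferpos-to-filepos pos 'approximate 'utf-16le-unix)
            (bufferpos-to-filepos pos 'exact 'utf-16le-unix)
            (my/filepos-by-encoding pos 'utf-16le-unix)))))

(my/utf16-drift-demo)  ; with the source listed on this page: (4 6 6)
```

The approximate result undercounts by 2 bytes for each non-BMP character before the position, which is the limitation the FIXME comment in the utf-16 branch refers to.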