Function: smerge--refine-chopup-region

smerge--refine-chopup-region is a byte-compiled function defined in smerge-mode.el.gz.

Signature

(smerge--refine-chopup-region BEG END FILE &optional PREPROC)

Documentation

Chopup the region from BEG to END into small elements, one per line.

Save the result into FILE. If non-nil, PREPROC is called with no argument in a buffer that contains a copy of the text, just before chopping it up. It can be used to replace chars to try and eliminate some spurious differences.

Source Code

;; Defined in /usr/src/emacs/lisp/vc/smerge-mode.el.gz
(defun smerge--refine-chopup-region (beg end file &optional preproc)
  "Chopup the region from BEG to END into small elements, one per line.
Save the result into FILE.
If non-nil, PREPROC is called with no argument in a buffer that contains
a copy of the text, just before chopping it up.  It can be used to replace
chars to try and eliminate some spurious differences."
  ;; We used to chop up char-by-char rather than word-by-word like ediff
  ;; does.  It had the benefit of simplicity and very fine results, but it
  ;; often suffered from problem that diff would find correlations where
  ;; there aren't any, so the resulting "change" didn't make much sense.
  ;; You can still get this behavior by setting
  ;; `smerge-refine-forward-function' to `forward-char'.
  (with-temp-buffer
    (insert-buffer-substring (marker-buffer beg) beg end)
    (when preproc (goto-char (point-min)) (funcall preproc))
    (when smerge-refine-ignore-whitespace
      ;; It doesn't make much of a difference for diff-fine-highlight
      ;; because we still have the _/+/</>/! prefix anyway.  Can still be
      ;; useful in other circumstances.
      (subst-char-in-region (point-min) (point-max) ?\n ?\s))
    (goto-char (point-min))
    (while (not (eobp))
      (cl-assert (bolp))
      (let ((start (point)))
        (funcall smerge-refine-forward-function 1)
        (let ((len (- (point) start)))
          (cl-assert (>= len 1))
          ;; We add \n after each chunk except after \n, so we get
          ;; one line per text chunk, where each line contains
          ;; just one chunk, except for \n chars which are
          ;; represented by the empty line.
          (unless (bolp) (insert ?\n))
          (when (and smerge-refine-weight-hack (> len 1))
            (let ((s (buffer-substring-no-properties start (point))))
              ;; The weight-hack inserts N copies of words of size N,
              ;; so it naturally suffers from an O(N²) blow up.
              ;; To circumvent this, we map each long word
              ;; to a shorter (but still unique) replacement.
              ;; Another option would be to change smerge--refine-forward
              ;; so it chops up long words into smaller ones.
              (when (> len 8)
                (let ((short (gethash s smerge--refine-long-words)))
                  (unless short
                    ;; To avoid accidental conflicts with ≤8 words,
                    ;; we make sure the replacement is >8 chars.  Overall,
                    ;; this should bound the blowup factor to ~10x,
                    ;; tho if those chars end up encoded as multiple bytes
                    ;; each, it could probably still reach ~30x in
                    ;; pathological cases.
                    (setq short
                          (concat (substring s 0 7)
                                  " "
                                  (string
                                   (+ ?0
                                      (hash-table-count
                                       smerge--refine-long-words)))
                                  "\n"))
                    (puthash s short smerge--refine-long-words))
                  (delete-region start (point))
                  (insert short)
                  (setq s short)))
              (dotimes (_i (1- len)) (insert s)))))))
    (unless (bolp) (error "Smerge refine internal error"))
    (let ((coding-system-for-write 'utf-8-emacs-unix))
      (write-region (point-min) (point-max) file nil 'nomessage))))