Function: string-limit
string-limit is an autoloaded and byte-compiled function defined in
subr-x.el.gz.
Signature
(string-limit STRING LENGTH &optional END CODING-SYSTEM)
Documentation
Return a substring of STRING that is (up to) LENGTH characters long.
If STRING is shorter than or equal to LENGTH characters, return the entire string unchanged.
If STRING is longer than LENGTH characters, return a substring consisting of the first LENGTH characters of STRING. If END is non-nil, return the last LENGTH characters instead.
If CODING-SYSTEM is non-nil, STRING will be encoded before
limiting, and LENGTH is interpreted as the number of bytes to
limit the string to. The result will be a unibyte string that is
no longer than LENGTH bytes, and it will not contain "partial"
characters (or glyphs), even if CODING-SYSTEM encodes characters
with several bytes per character. If the coding system specifies
a prefix, such as the byte order mark (aka "BOM") or a shift-in sequence,
the bytes of that prefix are normally counted as part of LENGTH. This is
the case, for instance, with utf-16. If this isn't desired, use a
coding system that doesn't specify a BOM, like utf-16le or utf-16be.
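As a rough illustration of the byte-counting rule, the sketch below limits the same string to 4 bytes under utf-16 and under utf-16le. It assumes, as the documentation states, that utf-16 emits a 2-byte BOM while utf-16le does not; the variable names with-bom and without-bom are made up for this example.

```elisp
(require 'subr-x)

;; With `utf-16' the BOM uses 2 of the 4 bytes, so only one 2-byte
;; character fits.  With `utf-16le' all 4 bytes are available.
(let ((with-bom (string-limit "ab" 4 nil 'utf-16))
      (without-bom (string-limit "ab" 4 nil 'utf-16le)))
  (list (length with-bom)
        (decode-coding-string with-bom 'utf-16)
        (length without-bom)
        (decode-coding-string without-bom 'utf-16le)))
```

Under those assumptions the form should evaluate to (4 "a" 4 "ab"): both results are 4 bytes long, but the utf-16 result holds one character fewer because of the BOM.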
When shortening strings for display purposes,
truncate-string-to-width is almost always a better alternative
than this function.
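To see why, here is a small comparison, offered only as a sketch. It assumes a typical setup in which each CJK character occupies two display columns; the sample string is arbitrary.

```elisp
(require 'subr-x)

(let ((s "好好好"))
  (list
   ;; Counts characters: the string has only 3, so it is returned whole.
   (string-limit s 4)
   ;; Counts display columns: the result is cut to fit within 4 columns.
   (truncate-string-to-width s 4)
   ;; Display width of the truncated result.
   (string-width (truncate-string-to-width s 4))))
```

string-limit returns all three characters, which may occupy more than 4 columns on screen, whereas truncate-string-to-width keeps the result within the requested width.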
Other relevant functions are documented in the string group.
Probably introduced at or before Emacs version 28.1.
Shortdoc
;; string
(string-limit "foobar" 3)
=> "foo"
(string-limit "foobar" 3 t)
=> "bar"
(string-limit "foobar" 10)
=> "foobar"
(string-limit "fo好" 3 nil 'utf-8)
=> "fo"
Source Code
;; Defined in /usr/src/emacs/lisp/emacs-lisp/subr-x.el.gz
;;;###autoload
(defun string-limit (string length &optional end coding-system)
"Return a substring of STRING that is (up to) LENGTH characters long.
If STRING is shorter than or equal to LENGTH characters, return the
entire string unchanged.
If STRING is longer than LENGTH characters, return a substring
consisting of the first LENGTH characters of STRING. If END is
non-nil, return the last LENGTH characters instead.
If CODING-SYSTEM is non-nil, STRING will be encoded before
limiting, and LENGTH is interpreted as the number of bytes to
limit the string to. The result will be a unibyte string that is
no longer than LENGTH bytes, and it will not contain \"partial\"
characters (or glyphs), even if CODING-SYSTEM encodes characters
with several bytes per character. If the coding system specifies
a prefix, such as the byte order mark (aka \"BOM\") or a shift-in sequence,
the bytes of that prefix are normally counted as part of LENGTH. This is
the case, for instance, with `utf-16'. If this isn't desired, use a
coding system that doesn't specify a BOM, like `utf-16le' or `utf-16be'.
When shortening strings for display purposes,
`truncate-string-to-width' is almost always a better alternative
than this function."
  (declare (important-return-value t))
  (unless (natnump length)
    (signal 'wrong-type-argument (list 'natnump length)))
  (if coding-system
      ;; The previous implementation here tried to encode char by
      ;; char, and then adding up the length of the encoded octets,
      ;; but that's not reliable in the presence of BOM marks and
      ;; ISO-2022-CN which may add charset designations at the
      ;; start/end of each encoded char (which we don't want). So
      ;; iterate (with a binary search) instead to find the desired
      ;; length.
      (let* ((glyphs (string-glyph-split string))
             (nglyphs (length glyphs))
             (too-long (1+ nglyphs))
             (stop (max (/ nglyphs 2) 1))
             (gap stop)
             candidate encoded found candidate-stop)
        ;; We're returning the end of the string.
        (when end
          (setq glyphs (nreverse glyphs)))
        (while (and (not found)
                    (< stop too-long))
          (setq encoded
                (encode-coding-string (string-join (seq-take glyphs stop))
                                      coding-system))
          (cond
           ((= (length encoded) length)
            (setq found encoded
                  candidate-stop stop))
           ;; Too long; try shortening.
           ((> (length encoded) length)
            (setq too-long stop
                  stop (max (- stop gap) 1)))
           ;; Too short; try lengthening.
           (t
            (setq candidate encoded
                  candidate-stop stop)
            (setq stop
                  (if (>= stop nglyphs)
                      too-long
                    (min (+ stop gap) nglyphs)))))
          (setq gap (max (/ gap 2) 1)))
        (cond
         ((not (or found candidate))
          "")
         ;; We're returning the end, so redo the encoding.
         (end
          (encode-coding-string
           (string-join (nreverse (seq-take glyphs candidate-stop)))
           coding-system))
         (t
          (or found candidate))))
    ;; Char-based version.
    (cond
     ((<= (length string) length) string)
     (end (substring string (- (length string) length)))
     (t (substring string 0 length)))))
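
The comment in the coding-system branch explains that whole prefixes are encoded and measured, rather than summing per-character lengths. A minimal sketch of that idea follows, using a plain linear scan in place of the binary search and handling only the start of the string. The function my-limit-bytes-linear is hypothetical and not part of Emacs; it assumes string-glyph-split and seq-take are available, as they are in the source above.

```elisp
(require 'subr-x)
(require 'seq)

(defun my-limit-bytes-linear (string nbytes coding-system)
  "Hypothetical helper: encode the longest prefix of STRING that fits NBYTES.
Each candidate prefix is encoded as a whole with CODING-SYSTEM, so
any BOM or designation sequence is counted exactly once."
  (let* ((glyphs (string-glyph-split string))
         (n (length glyphs))
         result)
    ;; Try the longest prefix first and drop one glyph at a time.
    (while (and (not result) (> n 0))
      (let ((encoded (encode-coding-string
                      (string-join (seq-take glyphs n))
                      coding-system)))
        (if (<= (length encoded) nbytes)
            (setq result encoded)
          (setq n (1- n)))))
    ;; Nothing fits: return the empty string, as `string-limit' does.
    (or result "")))

(my-limit-bytes-linear "fo好" 3 'utf-8)
```

The linear scan encodes up to one prefix per glyph, which is why the real function narrows the range with a binary search instead. For the input above, the sketch returns the same 2-byte result as the shortdoc example.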