Function: `split-string`

split-string is a byte-compiled function defined in subr.el.gz.

Signature

(split-string STRING &optional SEPARATORS OMIT-EMPTY TRIM)

Documentation

Split STRING into substrings bounded by matches for SEPARATORS.

The beginning and end of STRING, and each match for SEPARATORS, are splitting points. The substrings matching SEPARATORS are removed, and the substrings between the splitting points are collected as a list, which is returned.

If SEPARATORS is non-nil, it should be a regular expression matching text that separates, but is not part of, the substrings. If omitted or nil, it defaults to split-string-default-separators, whose value is normally "[ \f\t\n\r\v]+", and OMIT-EMPTY is then forced to t. SEPARATORS should never be a regexp that matches the empty string.
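
As a rough illustration of the defaulting rule above, the following sketch contrasts the default separators with explicit ones. The sample strings are invented for the example. Because SEPARATORS is a regular expression, a literal "." is passed through regexp-quote here.

```elisp
;; SEPARATORS omitted: split on runs of whitespace, drop empty strings.
(split-string "  two words ")
;; => ("two" "words")

;; SEPARATORS nil: OMIT-EMPTY is treated as t even when passed as nil.
(split-string " a  b " nil nil)
;; => ("a" "b")

;; An explicit regexp: a comma followed by optional spaces.
(split-string "a, b,c" ", *")
;; => ("a" "b" "c")

;; A literal separator that is a regexp special character.
(split-string "a.b.c" (regexp-quote "."))
;; => ("a" "b" "c")
```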

If OMIT-EMPTY is t, zero-length substrings are omitted from the list (so that for the default value of SEPARATORS leading and trailing whitespace are effectively trimmed). If nil, all zero-length substrings are retained, which correctly parses CSV format, for example.
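
A small sketch of the difference, using a made-up comma-separated record. Note that this only covers simple records: quoted fields that themselves contain commas would need more than a plain split.

```elisp
;; OMIT-EMPTY nil: empty fields are kept, so field positions survive.
(split-string "a,,b," ",")
;; => ("a" "" "b" "")

;; OMIT-EMPTY t: the empty fields are dropped.
(split-string "a,,b," "," t)
;; => ("a" "b")
```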

If TRIM is non-nil, it should be a regular expression to match text to trim from the beginning and end of each substring. If trimming makes the substring empty and OMIT-EMPTY is t, it is dropped from the result.
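
The interaction between TRIM and OMIT-EMPTY might look like this in practice. The input strings and the trim regexp "[ \t]+" are assumptions chosen for the example.

```elisp
;; Trim spaces and tabs from both ends of each field.
(split-string " one , two ,, three " "," t "[ \t]+")
;; => ("one" "two" "three")

;; With OMIT-EMPTY nil, the empty field between the two commas is kept.
(split-string " one , two ,, three " "," nil "[ \t]+")
;; => ("one" "two" "" "three")

;; A field that becomes empty only after trimming is dropped
;; when OMIT-EMPTY is t.
(split-string "a, ,b" "," t "[ ]+")
;; => ("a" "b")
```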

Note that the effect of (split-string STRING) is the same as (split-string STRING split-string-default-separators t). In the rare case that you wish to retain zero-length substrings when splitting on whitespace, use (split-string STRING split-string-default-separators).
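
For instance, with an illustrative string that has leading and trailing whitespace, the two forms differ only in the empty strings at the ends:

```elisp
;; Default call: leading and trailing whitespace yield no elements.
(split-string "  two words ")
;; => ("two" "words")

;; Passing the default separators explicitly retains the empty strings.
(split-string "  two words " split-string-default-separators)
;; => ("" "two" "words" "")
```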

Modifies the match data; use save-match-data if necessary.
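
A minimal sketch of why this matters, assuming a caller that still needs its own match data after splitting. The string "key=alpha beta" and the variable names are hypothetical.

```elisp
(let ((text "key=alpha beta"))
  (when (string-match "=\\(.*\\)" text)
    ;; split-string runs its own searches, so protect the outer match.
    (let ((words (save-match-data
                   (split-string (match-string 1 text)))))
      ;; match-beginning still refers to the "=" match made above.
      (list (match-beginning 0) words))))
;; => (3 ("alpha" "beta"))
```

Without save-match-data, the call to match-beginning would see whatever match data split-string left behind rather than the position of the "=".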

Other relevant functions are documented in the string group.

Probably introduced at or before Emacs version 20.1.

Shortdoc

;; string
(split-string "foo bar")
    => ("foo" "bar")
(split-string "|foo|bar|" "|")
    => ("" "foo" "bar" "")
(split-string "|foo|bar|" "|" t)
    => ("foo" "bar")

Source Code

;; Defined in /usr/src/emacs/lisp/subr.el.gz
(defun split-string (string &optional separators omit-empty trim)
  "Split STRING into substrings bounded by matches for SEPARATORS.

The beginning and end of STRING, and each match for SEPARATORS, are
splitting points.  The substrings matching SEPARATORS are removed, and
the substrings between the splitting points are collected as a list,
which is returned.

If SEPARATORS is non-nil, it should be a regular expression matching text
that separates, but is not part of, the substrings.  If omitted or nil,
it defaults to `split-string-default-separators', whose value is
normally \"[ \\f\\t\\n\\r\\v]+\", and OMIT-EMPTY is then forced to t.
SEPARATORS should never be a regexp that matches the empty string.

If OMIT-EMPTY is t, zero-length substrings are omitted from the list (so
that for the default value of SEPARATORS leading and trailing whitespace
are effectively trimmed).  If nil, all zero-length substrings are retained,
which correctly parses CSV format, for example.

If TRIM is non-nil, it should be a regular expression to match
text to trim from the beginning and end of each substring.  If trimming
makes the substring empty and OMIT-EMPTY is t, it is dropped from the result.

Note that the effect of `(split-string STRING)' is the same as
`(split-string STRING split-string-default-separators t)'.  In the rare
case that you wish to retain zero-length substrings when splitting on
whitespace, use `(split-string STRING split-string-default-separators)'.

Modifies the match data; use `save-match-data' if necessary."
  (declare (important-return-value t))
  (let* ((keep-empty (and separators (not omit-empty)))
         (len (length string))
         (trim-left-re (and trim (concat "\\`\\(?:" trim "\\)")))
         (trim-right-re (and trim (concat "\\(?:" trim "\\)\\'")))
         (sep-re (or separators split-string-default-separators))
         (acc nil)
         (next 0)
         (start 0))
    (while
        ;; TODO: The semantics for empty matches are just a copy of
        ;; the original code and make no sense at all. It's just a
        ;; consequence of the original implementation, no thought behind it.
        ;; We should probably error on empty matches, except when
        ;; sep is "" (which is in use by some code) but in that case
        ;; we could provide a faster implementation.
        (let ((sep (string-match sep-re string next)))
          (and sep
               (let ((sep-end (match-end 0)))
                 (when (or keep-empty (< start sep))
                   ;; TODO: Ideally we'd be able to trim in the
                   ;; original string and only make a substring after
                   ;; doing so, but there is no way to bound a regexp
                   ;; search before a certain offset, nor to anchor it
                   ;; at the search boundaries.
                   (let ((item (substring string start sep)))
                     (if trim
                         (let* ((item-beg
                                 (if (string-match trim-left-re item 0)
                                     (match-end 0)
                                   0))
                                (item-len (length item))
                                (item-end
                                 (or (string-match-p trim-right-re
                                                     item item-beg)
                                     item-len)))
                           (when (or (> item-beg 0) (< item-end item-len))
                             (setq item (substring item item-beg item-end)))
                           (when (or keep-empty (< item-beg item-end))
                             (push item acc)))
                       (push item acc))))
                 ;; This ensures progress in case the match was empty.
                 (setq next (max (1+ next) sep-end))
                 (setq start sep-end)
                 (< start len)))))
    ;; field after last separator, if any
    (let ((item (if (= start 0)
                    string    ; optimisation when there is no separator
                  (substring string start))))
      (when trim
        (let* ((item-beg (if (string-match trim-left-re item 0)
                             (match-end 0)
                           0))
               (item-len (length item))
               (item-end (or (string-match-p trim-right-re item item-beg)
                             item-len)))
          (when (or (> item-beg 0) (< item-end item-len))
            (setq item (substring item item-beg item-end)))))
      (when (or keep-empty (not (equal item "")))
        (push item acc)))
    (nreverse acc)))