Variable: word-combining-categories

word-combining-categories is a variable defined in category.c.

Value

((nil . 94) (94) (67 . 72) (67 . 75))
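The value above is printed with raw character codes. A minimal sketch for reading it, assuming only that each code is a category mnemonic character (the helper name my-decode-category-pairs is hypothetical, not part of Emacs):

```elisp
(defun my-decode-category-pairs (pairs)
  "Return PAIRS with each category code replaced by a one-character string.
Hypothetical helper; a nil car or cdr is kept as nil."
  (mapcar (lambda (pair)
            (cons (and (car pair) (char-to-string (car pair)))
                  (and (cdr pair) (char-to-string (cdr pair)))))
          pairs))

(my-decode-category-pairs '((nil . 94) (94) (67 . 72) (67 . 75)))
;; => ((nil . "^") ("^") ("C" . "H") ("C" . "K"))
```

M-x describe-categories lists what each mnemonic stands for in the current category table.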

Documentation

List of pairs (cons cells) of categories to determine word boundary.

Emacs treats a sequence of word constituent characters as a single word (i.e. finds no word boundary between them) only if they belong to the same script (see char-script-table). Exceptions to this rule are allowed in the following cases.

(1) The case where characters are in different scripts is controlled by the variable word-combining-categories.

Emacs finds no word boundary between characters of different scripts if they have categories matching some element of this list.

More precisely, if an element of this list is a cons of category CAT1 and CAT2, and a multibyte character C1 which has CAT1 (but not CAT2) is followed by C2 which has CAT2 (but not CAT1), there's no word boundary between C1 and C2.

For instance, to specify that Han characters followed by Hiragana characters can form a single word, the element (?C . ?H) should be in this list.

If CAT1 or CAT2 is nil, any character will do in that position, provided it doesn't have the other element of the cons in its category set. For instance, to specify that ASCII characters can form a single word with non-ASCII characters, this list should have the elements (nil . ?a) and (?a), the latter being the same as (?a . nil).
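The matching rule for a single element can be sketched in Lisp as follows. This is an illustration of the documented rule only, not the C implementation: it skips the script and syntax checks Emacs performs first, and the names my-category-pair-matches-p and my-word-combining-p are hypothetical. It relies on category sets being bool-vectors indexed by category character.

```elisp
(require 'seq)

(defun my-category-pair-matches-p (pair c1 c2)
  "Return non-nil if PAIR (CAT1 . CAT2) matches character C1 followed by C2.
Hypothetical helper that follows the documented rule only."
  (let ((cat1 (car pair))
        (cat2 (cdr pair))
        (set1 (char-category-set c1))
        (set2 (char-category-set c2)))
    (and
     ;; C1 must have CAT1 and C2 must not, unless CAT1 is nil.
     (or (null cat1)
         (and (aref set1 cat1) (not (aref set2 cat1))))
     ;; C2 must have CAT2 and C1 must not, unless CAT2 is nil.
     (or (null cat2)
         (and (aref set2 cat2) (not (aref set1 cat2)))))))

(defun my-word-combining-p (c1 c2)
  "Return non-nil if some element of `word-combining-categories' matches C1, C2."
  (seq-some (lambda (pair) (my-category-pair-matches-p pair c1 c2))
            word-combining-categories))

;; A Han character followed by a Hiragana character.
(my-category-pair-matches-p '(?C . ?H) ?漢 ?あ)
```

With the default category table, the last form is expected to return non-nil, since the Han character carries category C and the Hiragana character carries category H. To inspect the categories of a character directly, evaluate (category-set-mnemonics (char-category-set ?漢)).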

(2) The case where characters are in the same script is controlled by the variable word-separating-categories.

Emacs finds a word boundary between characters of the same script if they have categories matching some element of word-separating-categories.

More precisely, if an element of word-separating-categories is a cons of category CAT1 and CAT2, and a multibyte character C1 which has CAT1 but not CAT2 is followed by C2 which has CAT2 but not CAT1, there's a word boundary between C1 and C2.

For instance, to specify that there's a word boundary between Hiragana and Katakana (both are in the same script kana), the element (?H . ?K) should be in word-separating-categories.
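One way to observe the effect of word-separating-categories is to move by one word over mixed kana with and without the (?H . ?K) element. The sketch below assumes the default category and syntax tables; my-first-word-end is a hypothetical helper, not an Emacs function.

```elisp
(defun my-first-word-end (text)
  "Insert TEXT in a temporary buffer and return point after one `forward-word'.
Hypothetical helper for experimenting with word boundaries."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (forward-word 1)
    (point)))

;; Four Hiragana characters followed by four Katakana characters.
(let ((word-separating-categories '((?H . ?K))))
  (my-first-word-end "ひらがなカタカナ"))

;; The same text with no separating rule in effect.
(let ((word-separating-categories nil))
  (my-first-word-end "ひらがなカタカナ"))
```

If the rule applies as documented, the first form should stop between the Hiragana and Katakana runs, while the second should carry on to the end of the text.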

Source Code

// Defined in /usr/src/emacs/src/category.c
  DEFVAR_LISP ("word-combining-categories", Vword_combining_categories,
	       doc: /* List of pairs (cons cells) of categories to determine word boundary.

Emacs treats a sequence of word constituent characters as a single
word (i.e. finds no word boundary between them) only if they belong to
the same script (see `char-script-table').  Exceptions to this rule
are allowed in the following cases.

\(1)