General Escape Syntax
In addition to the specific escape sequences for certain important control characters, Emacs provides several types of escape syntax that you can use to specify non-ASCII text characters.
- You can specify characters by their Unicode names, if any. ‘?\N{NAME}’ represents the Unicode character named NAME. Thus, ‘?\N{LATIN SMALL LETTER A WITH GRAVE}’ is equivalent to ‘?à’ and denotes the Unicode character U+00E0. To simplify entering multi-line strings, you can replace spaces in the names by non-empty sequences of whitespace (e.g., newlines).
- You can specify characters by their Unicode values. ‘?\N{U+X}’ represents a character with Unicode code point X, where X is a hexadecimal number. Also, ‘?\uxxxx’ and ‘?\Uxxxxxxxx’ represent code points xxxx and xxxxxxxx, respectively, where each x is a single hexadecimal digit. For example, ‘?\N{U+E0}’, ‘?\u00e0’ and ‘?\U000000E0’ are all equivalent to ‘?à’ and to ‘?\N{LATIN SMALL LETTER A WITH GRAVE}’. The Unicode Standard defines code points only up to ‘U+10ffff’, so if you specify a code point higher than that, Emacs signals an error.
- You can specify characters by their hexadecimal character codes. A hexadecimal escape sequence consists of a backslash, ‘x’, and the hexadecimal character code. Thus, ‘?\x41’ is the character A, ‘?\x1’ is the character C-a, and ‘?\xe0’ is the character à (a with grave accent). You can use one or more hex digits after ‘x’, so you can represent any character code in this way.
- You can specify characters by their character code in octal. An octal escape sequence consists of a backslash followed by up to three octal digits; thus, ‘?\101’ represents the character A, ‘?\001’ the character C-a, and ‘?\002’ the character C-b. Only characters up to octal code 777 can be specified this way.
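As a quick check of the forms listed above, the following sketch evaluates each kind of escape and shows the resulting character codes. It assumes an Emacs recent enough to read ‘\N{...}’ escapes; the symbol ‘invalid’ returned by the last form is only an illustrative marker, not something Emacs itself produces.

```elisp
;; Unicode name and code-point forms: all four denote U+00E0 (decimal 224).
(list ?\N{LATIN SMALL LETTER A WITH GRAVE}
      ?\N{U+E0}
      ?\u00e0
      ?\U000000E0)
;; ⇒ (224 224 224 224)

;; Hexadecimal escapes: A, C-a, and a with grave accent.
(list ?\x41 ?\x1 ?\xe0)
;; ⇒ (65 1 224)

;; Octal escapes: A, C-a, and C-b.
(list ?\101 ?\001 ?\002)
;; ⇒ (65 1 2)

;; A code point above U+10FFFF is rejected when the text is read.
;; Reading from a string keeps the error from aborting this file;
;; `invalid' is a hypothetical marker chosen for this example.
(condition-case nil
    (read "?\\U00110000")
  (error 'invalid))
;; ⇒ invalid
```

Evaluating each form with ‘C-x C-e’ or in ‘M-x ielm’ displays the values shown in the comments.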
These escape sequences may also be used in strings. See Non-ASCII Characters in Strings.
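A minimal sketch of the same escapes inside string constants follows. It uses only the Unicode-style forms, and the comparison strings are built with ‘string’ so that the example does not depend on the encoding of the file it is saved in.

```elisp
;; The name form and the \u form produce the same one-character string.
(string= "\N{LATIN SMALL LETTER A WITH GRAVE}" "\u00e0")
;; ⇒ t

;; \u consumes exactly four hex digits, so it can sit in the middle of a word.
(string= "caf\u00e9" (string ?c ?a ?f #xE9))
;; ⇒ t

;; A single escape contributes a single character to the string.
(length "\N{U+E0}")
;; ⇒ 1
```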