External Notation

The basic form of a string-notated bytevector is #u8"CONTENT". The Scheme reader accepts this notation when bytestrings are enabled via (read-enable 'bytestrings), and the Scheme writer emits it when bytestrings are enabled via (print-enable 'bytestrings).

To avoid character encoding issues, only printable ASCII characters (that is, Unicode code points in the range U+0020 to U+007E inclusive) may appear within the CONTENT of a string-notated bytevector. All other characters must be expressed through mnemonic or inline hex escapes, and " and \ must also be escaped, as in normal Scheme strings.

Within the CONTENT of a string-notated bytevector:

  • \a ⇒ 7
  • \b ⇒ 8
  • \t ⇒ 9
  • \n ⇒ 10
  • \r ⇒ 13
  • \" ⇒ 34
  • \\ ⇒ 92
  • \| ⇒ 124
  • the sequence \x followed by zero or more 0 characters, followed by one or two hexadecimal digits, followed by ; represents the integer specified by the hexadecimal digits;
  • the sequence \ followed by zero or more intraline whitespace characters, followed by a newline, followed by zero or more further intraline whitespace characters, is ignored and corresponds to no entry in the resulting bytevector;
  • any other printable ASCII character represents its own code point in the ASCII/Unicode code chart (for example, A ⇒ 65); and
  • it is an error to use any other character or sequence beginning with \ within a string-notated bytevector.
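The rules above can be sketched as a small decoder. This is a minimal illustration in Python, not a reference implementation: the name decode_content is hypothetical, intraline whitespace is assumed to mean space or tab, and a newline is assumed to be a single line feed character. It takes only the CONTENT (without the surrounding #u8" and ") and raises ValueError wherever the text says "it is an error".

```python
import re

# Mnemonic escapes from the list above, mapped to their byte values.
MNEMONIC = {"a": 7, "b": 8, "t": 9, "n": 10, "r": 13,
            '"': 34, "\\": 92, "|": 124}

# After the backslash: x, optional leading zeros, one or two hex digits, then ;
HEX_ESCAPE = re.compile(r"x(0*[0-9A-Fa-f]{1,2});")

# After the backslash: intraline whitespace (assumed: space or tab),
# a newline (assumed: "\n"), then more intraline whitespace.
LINE_CONTINUATION = re.compile(r"[ \t]*\n[ \t]*")


def decode_content(content: str) -> bytes:
    """Decode the CONTENT of a string-notated bytevector into bytes."""
    out = bytearray()
    i = 0
    while i < len(content):
        ch = content[i]
        if ch == "\\":
            after = i + 1
            hex_match = HEX_ESCAPE.match(content, after)
            cont_match = LINE_CONTINUATION.match(content, after)
            if hex_match:
                # At most two significant hex digits, so the value fits a byte.
                out.append(int(hex_match.group(1), 16))
                i = hex_match.end()
            elif cont_match:
                # Line continuation: contributes no byte at all.
                i = cont_match.end()
            elif after < len(content) and content[after] in MNEMONIC:
                out.append(MNEMONIC[content[after]])
                i = after + 1
            else:
                raise ValueError(f"invalid escape at index {i}")
        elif ch == '"':
            raise ValueError(f"unescaped double quote at index {i}")
        elif " " <= ch <= "~":
            # Printable ASCII (U+0020 to U+007E) stands for its own code point.
            out.append(ord(ch))
            i += 1
        else:
            raise ValueError(f"character not allowed at index {i}")
    return bytes(out)
```

Under these assumptions, decode_content(r"A\n\x7f;") yields the three bytes 65, 10, and 127, while both "ι" and r"\xE000;" are rejected with ValueError.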

Note: The \| sequence is provided so that string parsing, symbol parsing, and string-notated bytevector parsing can all use the same sequences. However, we give a complete definition of the valid lexical syntax in this SRFI rather than inheriting the native syntax of strings, so that it is clear that #u8"ι" (a character outside printable ASCII) and #u8"\xE000;" (a hex escape whose value does not fit in a single byte) are invalid.

When the Scheme reader encounters a string-notated bytevector, it produces a datum as if that bytevector had been written out in full. That is, #u8"A" is exactly equivalent to #u8(65).
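To make the equivalence concrete, the following Python sketch renders a sequence of bytes in the full #u8(...) form. The helper name full_form is hypothetical and is used here only for illustration; it is not part of any Scheme implementation.

```python
def full_form(data: bytes) -> str:
    """Render bytes as a bytevector written out in full, e.g. #u8(65)."""
    # Each byte is written as a decimal integer, separated by single spaces.
    return "#u8(" + " ".join(str(b) for b in data) + ")"
```

With this helper, the single byte that #u8"A" denotes renders as #u8(65), matching the equivalence stated above.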