title | TARGET DECK | FILE TAGS | tags | |
---|---|---|---|---|
Escape Sequences | Obsidian::STEM | c |
|
Overview
C defines a standard set of escape sequences. Many languages designed with C in mind parse these escape sequences in a similar way.
`\ooo`: Consists of one to three octal digits.
- bash supports this sequence as `$'\ooo'`.
- gawk supports this sequence directly.
- Lua does not support this kind of escape sequence. Instead, it has a decimal escape sequence `\ddd`.
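As a rough illustration of the one-to-three digit rule, the following C sketch checks a few octal escapes at compile time. The helper name `octal_demo` is hypothetical, and the `"Hi"` reading of the bytes assumes an ASCII-compatible execution character set.

```c
#include <assert.h>
#include <string.h>

/* '\101' is octal 101, i.e. the value 65 ('A' in ASCII). */
static_assert('\101' == 65, "octal escape yields the value 65");

/* At most three octal digits are consumed, so "\1011" is the character
 * '\101' followed by the ordinary digit '1', plus the terminating NUL. */
static_assert(sizeof("\1011") == 3, "two characters plus NUL");

/* Hypothetical helper: a string spelled entirely with octal escapes.
 * 0110 = 72 ('H') and 0151 = 105 ('i'), assuming ASCII. */
const char *octal_demo(void) {
    return "\110\151";
}
```

Calling `octal_demo()` from a program on an ASCII-based system should yield a string that compares equal to `"Hi"`.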
%%ANKI
Basic
How are C escape sequences for octal digits denoted?
Back: As `\ooo`.
Reference: Brian W. Kernighan and Dennis M. Ritchie, The C Programming Language, 2nd ed (Englewood Cliffs, N.J: Prentice Hall, 1988).
END%%
%%ANKI
Basic
In C, `\ooo` allows specifying how many octal digits?
Back: One to three.
Reference: Brian W. Kernighan and Dennis M. Ritchie, The C Programming Language, 2nd ed (Englewood Cliffs, N.J: Prentice Hall, 1988).
END%%
%%ANKI
Basic
What alternative does Lua provide to C's `\ooo` sequence?
Back: `\ddd`, a decimal escape sequence.
Reference: Roberto Ierusalimschy, Programming in Lua, Fourth edition (Rio de Janeiro: Lua.org, 2016).
Tags: lua
END%%
%%ANKI
Basic
How are C escape sequences exposed in bash?
Back: Using ANSI-C quoting, i.e. `$'string'`.
Reference: Mendel Cooper, “Advanced Bash-Scripting Guide,” n.d., 916.
Tags: bash
END%%
`\xhh`: Consists of one or more hexadecimal digits. The `x` prefix is required to distinguish it from octal escape sequences.
- bash supports this sequence as `$'\xhh'`. One or two digits are supported.
- gawk limits processing to two digits.
  - Robbins states that using more than two hexadecimal digits can produce undefined results.
- Lua requires exactly two digits in its hex escape sequence.
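Because `\x` in C keeps consuming hexadecimal digits, a following character such as `B` would be absorbed into the escape. The sketch below shows one common way to end the escape early, by splitting the string literal. The helper name `hex_demo` is hypothetical, and the `"Hi"` reading assumes an ASCII-compatible execution character set.

```c
#include <assert.h>
#include <string.h>

/* '\x41' has the value 0x41 = 65 ('A' in ASCII). */
static_assert('\x41' == 65, "hex escape yields the value 65");

/* Writing "\x41BC" would treat 41BC as one (out-of-range) escape.
 * Adjacent string literals are concatenated only after escapes are
 * processed, so splitting the literal terminates the escape. */
static_assert(sizeof("\x41" "BC") == 4, "'A', 'B', 'C', plus NUL");

/* Hypothetical helper: a string spelled entirely with hex escapes.
 * 0x48 = 72 ('H') and 0x69 = 105 ('i'), assuming ASCII. */
const char *hex_demo(void) {
    return "\x48\x69";
}
```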
%%ANKI
Basic
How are C escape sequences for hexadecimal digits denoted?
Back: As `\xhh`.
Reference: Brian W. Kernighan and Dennis M. Ritchie, The C Programming Language, 2nd ed (Englewood Cliffs, N.J: Prentice Hall, 1988).
END%%
%%ANKI
Basic
In C, `\x` allows specifying how many hexadecimal digits?
Back: One or more.
Reference: Brian W. Kernighan and Dennis M. Ritchie, The C Programming Language, 2nd ed (Englewood Cliffs, N.J: Prentice Hall, 1988).
END%%
%%ANKI
Basic
What footgun does C's `\x` sequence expose?
Back: Using more than two hexadecimal digits can produce undefined results.
Reference: Arnold D. Robbins, “GAWK: Effective AWK Programming,” October 2023, https://www.gnu.org/software/gawk/manual/gawk.pdf.
END%%
`\uhhhh`: Introduced in C11 to represent Unicode code points. Must have exactly four hexadecimal digits, padded with leading `0`s if necessary.
- bash supports this sequence as `$'\uhhhh'`. One to four hex digits are supported.
- gawk consolidates C's `\u` and `\U` sequence markers into just `\u`, capable of handling one to eight digits. Furthermore, `gawk` uses `\u` to designate a character in the current locale's character set, not Unicode directly. Oftentimes this is some Unicode-based locale, though.
- Lua consolidates C's `\u` and `\U` sequence markers into `\u{h...h}`, capable of handling one or more hexadecimal digits. The curly braces are required.
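A minimal C sketch of the four-digit form follows. It uses `U'...'` and `u8"..."` literals so the results do not depend on the narrow execution character set. The helper name `euro_utf8_len` is hypothetical. Note that C11 does not permit universal character names for most code points below U+00A0, so `A` cannot be spelled `\u0041`.

```c
#include <assert.h>
#include <stddef.h>

/* U'\u00E9' is a char32_t constant holding code point U+00E9. */
static_assert(U'\u00E9' == 0xE9, "four-digit form names U+00E9");

/* In a UTF-8 string literal, U+00E9 encodes as two bytes (0xC3 0xA9). */
static_assert(sizeof(u8"\u00E9") == 3, "two UTF-8 bytes plus NUL");

/* Hypothetical helper: number of UTF-8 bytes needed for U+20AC
 * (EURO SIGN), excluding the terminating NUL. */
size_t euro_utf8_len(void) {
    return sizeof(u8"\u20AC") - 1;
}
```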
%%ANKI
Basic
In what two ways are C escape sequences for Unicode denoted?
Back: As `\uhhhh` or `\Uhhhhhhhh`.
Reference: Jens Gustedt, Modern C (Shelter Island, NY: Manning Publications Co, 2020).
Tags: unicode
END%%
%%ANKI
Basic
In C, `\u` allows specifying how many hexadecimal digits?
Back: Exactly four.
Reference: Jens Gustedt, Modern C (Shelter Island, NY: Manning Publications Co, 2020).
Tags: unicode
END%%
%%ANKI
Basic
In what standard were C's `\u` and `\U` escape sequences introduced?
Back: C11.
Reference: Jens Gustedt, Modern C (Shelter Island, NY: Manning Publications Co, 2020).
Tags: unicode
END%%
%%ANKI
Cloze
`\u` in C designates a character in {Unicode}. In `gawk` it designates a character in {the current locale's character set}.
Reference: Arnold D. Robbins, “GAWK: Effective AWK Programming,” October 2023, https://www.gnu.org/software/gawk/manual/gawk.pdf.
Tags: unicode gawk
END%%
`\Uhhhhhhhh`: Introduced in C11 to represent larger Unicode code points. Must have exactly eight hexadecimal digits, padded with leading `0`s if necessary.
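The eight-digit form can be sketched in C the same way. The example below assumes a compiler that supports C11 `U'...'` and `u8"..."` literals. The helper name `grinning_utf8_len` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* U+1F600 does not fit in four hex digits, so \U is required. */
static_assert(U'\U0001F600' == 0x1F600, "eight-digit form names U+1F600");

/* A code point that fits in four digits can use either form; the
 * eight-digit spelling is padded with leading zeros. */
static_assert(U'\u00E9' == U'\U000000E9', "both forms name U+00E9");

/* Hypothetical helper: number of UTF-8 bytes needed for U+1F600,
 * excluding the terminating NUL. */
size_t grinning_utf8_len(void) {
    return sizeof(u8"\U0001F600") - 1;
}
```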
%%ANKI
Basic
In C, `\U` allows specifying how many hexadecimal digits?
Back: Exactly eight.
Reference: Jens Gustedt, Modern C (Shelter Island, NY: Manning Publications Co, 2020).
Tags: unicode
END%%
%%ANKI
Basic
Why does C have both `\u` and `\U`?
Back: `\U` accommodates larger code point values.
Reference: Jens Gustedt, Modern C (Shelter Island, NY: Manning Publications Co, 2020).
Tags: unicode
END%%
References
- Arnold D. Robbins, “GAWK: Effective AWK Programming,” October 2023, https://www.gnu.org/software/gawk/manual/gawk.pdf.
- Brian W. Kernighan and Dennis M. Ritchie, The C Programming Language, 2nd ed (Englewood Cliffs, N.J: Prentice Hall, 1988).
- Jens Gustedt, Modern C (Shelter Island, NY: Manning Publications Co, 2020).
- Mendel Cooper, “Advanced Bash-Scripting Guide,” n.d., 916.
- Roberto Ierusalimschy, Programming in Lua, Fourth edition (Rio de Janeiro: Lua.org, 2016).