Back: Using more than two hexadecimal digits can produce undefined results.
Reference: Arnold D. Robbins, “GAWK: Effective AWK Programming,” October 2023, [https://www.gnu.org/software/gawk/manual/gawk.pdf](https://www.gnu.org/software/gawk/manual/gawk.pdf).
<!--ID: 1706975891828-->
END%%
* `\uhhhh`: Introduced in C99 to represent Unicode code points. *Must* have exactly four hexadecimal digits, padded with leading `0`s if necessary.
* [[bash/index|Bash]] supports this sequence as `$'\uhhhh'`. One to four hex digits are supported.
* [[gawk/index|gawk]] consolidates C's `\u` and `\U` sequence markers into just `\u`, capable of handling one to eight digits. Furthermore, `gawk` uses `\u` to designate a character in the current locale's character set, *not* Unicode directly. Often this is a Unicode-based locale, though.
* [[lua/index|Lua]] consolidates C's `\u` and `\U` sequence markers into `\u{h...h}`, capable of handling one or more hexadecimal digits. The curly braces are required.
%%ANKI
Basic
In what two ways are C escape sequences for Unicode denoted?
`\u` in C designates a character in {Unicode}. In `gawk` it designates a character in {the current locale's character set}.
Reference: Arnold D. Robbins, “GAWK: Effective AWK Programming,” October 2023, [https://www.gnu.org/software/gawk/manual/gawk.pdf](https://www.gnu.org/software/gawk/manual/gawk.pdf).
Tags: unicode gawk
<!--ID: 1706976541399-->
END%%
* `\Uhhhhhhhh`: Introduced in C99 to represent larger Unicode code points. *Must* have exactly eight hexadecimal digits, padded with leading `0`s if necessary.
%%ANKI
Basic
In C, `\U` allows specifying how many hexadecimal digits?
* Arnold D. Robbins, “GAWK: Effective AWK Programming,” October 2023, [https://www.gnu.org/software/gawk/manual/gawk.pdf](https://www.gnu.org/software/gawk/manual/gawk.pdf).