Better organization. Escape sequences. More gawk.

Joshua Potter 2024-02-03 09:23:18 -07:00
parent 53171cc3d0
commit 52df11cd84
10 changed files with 255 additions and 43 deletions

View File

@ -64,23 +64,28 @@
"daily/2024-01-31.md": "72e343cef8d56e169cac7b360a88fcf0", "daily/2024-01-31.md": "72e343cef8d56e169cac7b360a88fcf0",
"posix/index.md": "f7b1ae55f8f5e8f50f89738b1aca9111", "posix/index.md": "f7b1ae55f8f5e8f50f89738b1aca9111",
"posix/signals.md": "2120ddd933fc0d57abb93c33f639afd8", "posix/signals.md": "2120ddd933fc0d57abb93c33f639afd8",
"gawk.md": "c79a09e1772da1d4f2cd296b7370ecab", "gawk.md": "20ed75ec33b5e08933c81aac01eb64b5",
"bash/index.md": "3b5296277f095acdf16655adcdf524af", "bash/index.md": "3b5296277f095acdf16655adcdf524af",
"bash/shebang.md": "9006547710f9a079a3666169fbeda7aa", "bash/shebang.md": "9006547710f9a079a3666169fbeda7aa",
"bash/robustness.md": "de97cd77aae047b5eea27440b43c9c42", "bash/robustness.md": "a1d0d334939b54cca4bdfd2fd8ca27f0",
"journal/2024-02-01.md": "3aa232387d2dc662384976fd116888eb", "journal/2024-02-01.md": "3aa232387d2dc662384976fd116888eb",
"journal/2024-01-31.md": "7c7fbfccabc316f9e676826bf8dfe970", "journal/2024-01-31.md": "7c7fbfccabc316f9e676826bf8dfe970",
"bash/quoting.md": "b1d8869a91001f8b22f0cdc54d806f61", "bash/quoting.md": "b1d8869a91001f8b22f0cdc54d806f61",
"nix/callPackage.md": "5ef6bc5d1a549c55d43ebb4d48c64427", "nix/callPackage.md": "59796c480e2856fa7491f62ceb7e3c9c",
"nix/index.md": "dd5ddd19e95d9bdbe020c68974d77a33", "nix/index.md": "dd5ddd19e95d9bdbe020c68974d77a33",
"logic/propositional.md": "2e79b758dd161cc377f18cd1fa845285", "logic/propositional.md": "2e79b758dd161cc377f18cd1fa845285",
"logic/index.md": "d41d8cd98f00b204e9800998ecf8427e", "logic/index.md": "d41d8cd98f00b204e9800998ecf8427e",
"journal/2024-02-03.md": "1359532c63e14251da04181fb0873d66", "journal/2024-02-03.md": "1359532c63e14251da04181fb0873d66",
"journal/2024-02-02.md": "e2acbe75752d9c39875553223e34fb0d", "bash/prompts.md": "61cb877e68da040a15b85af76b1f68ba",
"bash/prompts.md": "64bd3cd3c2feb9edb68ad8dc5ba65a35",
"algorithms/sorting/index.md": "cd189e1a2cf32b5656b16aaf9f488874", "algorithms/sorting/index.md": "cd189e1a2cf32b5656b16aaf9f488874",
"algorithms/sorting/insertion-sort.md": "c78c9983f87cdc4198f82803d418967f", "algorithms/sorting/insertion-sort.md": "c78c9983f87cdc4198f82803d418967f",
"algorithms/index.md": "1583c07edea4736db27c38fe2b6c4c31" "algorithms/index.md": "1583c07edea4736db27c38fe2b6c4c31",
"c/escape-sequences.md": "07f0811b0fff14f54f78abc33f2e6606",
"journal/2024-02-03.md": "fca2323a3284ca8cd0a1aea9e354bb80",
"lua/index.md": "d41d8cd98f00b204e9800998ecf8427e",
"c/index.md": "d41d8cd98f00b204e9800998ecf8427e",
"gawk/variables.md": "4482c297e7f4f5987f42f1926a880ca7",
"gawk/index.md": "0263448c8ae1ecfc0eacc4788f8402e9"
}, },
"fields_dict": { "fields_dict": {
"Basic": [ "Basic": [

View File

@ -8,7 +8,7 @@ tags:
## Overview ## Overview
According to Robbins a POSIX-compliant shell (like Bash) generally has the primary and secondary prompts denoted with `$` and `>` respectively. According to Robbins a POSIX-compliant shell (like Bash) generally has the primary and secondary prompts denoted with `$` and `>` respectively. Adjust these values using environment variables `$PS1` and `PS2` respectively.
%%ANKI %%ANKI
Basic Basic
@ -18,6 +18,14 @@ Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 20
<!--ID: 1706882670149--> <!--ID: 1706882670149-->
END%% END%%
%%ANKI
Basic
What environment variable controls Bash's primary prompt?
Back: `$$PS1`
Reference: Cooper, Mendel. “Advanced Bash-Scripting Guide,” n.d., 916.
<!--ID: 1706973587222-->
END%%
%%ANKI %%ANKI
Basic Basic
What symbol is usually used to denote the secondary prompt? What symbol is usually used to denote the secondary prompt?
@ -26,6 +34,14 @@ Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 20
<!--ID: 1706882670158--> <!--ID: 1706882670158-->
END%% END%%
%%ANKI
Basic
What environment variable controls Bash's secondary prompt?
Back: `$$PS2`
Reference: Cooper, Mendel. “Advanced Bash-Scripting Guide,” n.d., 916.
<!--ID: 1706973587232-->
END%%
Paths supplied to commands are typically "sanitized" by prefixing the path name with `./`. This is mentioned in a few different places: Paths supplied to commands are typically "sanitized" by prefixing the path name with `./`. This is mentioned in a few different places:
* `find -execdir` performs this prefixing automatically on all found files. * `find -execdir` performs this prefixing automatically on all found files.

View File

@ -9,7 +9,7 @@ tags:
## Overview ## Overview
An interesting point Robbins discusses in his introduction to [[gawk]] is this idea of command robustness. He states that: An interesting point Robbins discusses in his introduction to [[gawk/index|gawk]] is this idea of command robustness. He states that:
> A self-contained shell script is more reliable because there are no other files to misplace. > A self-contained shell script is more reliable because there are no other files to misplace.
@ -56,6 +56,12 @@ It's interesting to think what else can be used as a measure of a command's robu
* What happens if I run the command twice in a row? * What happens if I run the command twice in a row?
* Whether a program acts atomically * Whether a program acts atomically
* Is it possible intermediate files are left that affect subsequent runs? * Is it possible intermediate files are left that affect subsequent runs?
* The presence of timeouts
* Perhaps a program waits a specified amount of time before input is available. The command's success is now externally determined.
* Locale-aware functionality
* Consider for instance [[gawk/index|gawk]]'s `\u` [[escape-sequences|sequence]] which targets characters in the current locale's character set as opposed to specifically Unicode.
The above scenarios are what makes something like [[nix/index|nix]] so compelling.
## References ## References

148
notes/c/escape-sequences.md Normal file
View File

@ -0,0 +1,148 @@
---
title: Escape Sequences
TARGET DECK: Obsidian::STEM
FILE TAGS: c
tags:
- c
---
## Overview
C has a standard for processing different escape sequences. Many languages built with C in mind parse these escape sequences in a similar way.
* `\ooo`: Consists of one to three octal digits.
* [[bash/index|Bash]] supports this sequence as `$'\ooo'`.
* [[gawk/index|gawk]] supports this sequence directly.
* [[lua/index|Lua]] does not support this kind of escape sequence. Instead, it has a *decimal* escape sequence `\ddd`.
%%ANKI
Basic
How are C escape sequences for octal digits denoted?
Back: As `\ooo`.
Reference: Brian W. Kernighan and Dennis M. Ritchie, _The C Programming Language_, 2nd ed (Englewood Cliffs, N.J: Prentice Hall, 1988).
<!--ID: 1706975891805-->
END%%
%%ANKI
Basic
In C, `\ooo` allows specifying how many octal digits?
Back: One to three.
Reference: Brian W. Kernighan and Dennis M. Ritchie, _The C Programming Language_, 2nd ed (Englewood Cliffs, N.J: Prentice Hall, 1988).
<!--ID: 1706975891810-->
END%%
%%ANKI
Basic
What alternative does Lua provide to C's `\ooo` sequence?
Back: `\ddd`, a *decimal* escape sequence.
Reference: Roberto Ierusalimschy, _Programming in Lua_, Fourth edition (Rio de Janeiro: Lua.org, 2016).
Tags: lua
<!--ID: 1706975891813-->
END%%
%%ANKI
Basic
How are C escape sequences exposed in bash?
Back: Using ANSI-C quoting, i.e. `$$'string'`.
Reference: Mendel Cooper, “Advanced Bash-Scripting Guide,” n.d., 916.
Tags: bash
<!--ID: 1706975891817-->
END%%
* `\xhh`: Consists of one or more hexadecimal digits. The `x` prefix is required to distinguish from octal escape sequences.
* [[bash/index|Bash]] supports this sequence as `$'\xhh'`. One or two digits is supported.
* [[gawk/index|gawk]] limits processing to two digits.
* Robbins states that using more than two hexadecimal digits can produce undefined results.
* [[Lua/index|Lua]] requires *exactly* two digits in its hex escape sequence.
%%ANKI
Basic
How are C escape sequences for hexadecimal digits denoted?
Back: As `\xhh`.
Reference: Brian W. Kernighan and Dennis M. Ritchie, _The C Programming Language_, 2nd ed (Englewood Cliffs, N.J: Prentice Hall, 1988).
<!--ID: 1706975891820-->
END%%
%%ANKI
Basic
In C, `\x` allows specifying how many hexadecimal digits?
Back: One or more.
Reference: Brian W. Kernighan and Dennis M. Ritchie, _The C Programming Language_, 2nd ed (Englewood Cliffs, N.J: Prentice Hall, 1988).
<!--ID: 1706975891824-->
END%%
%%ANKI
Basic
What footgun does C's `\x` sequence expose?
Back: Using more than two hexadecimal digits can produce undefined results.
Reference: Arnold D. Robbins, “GAWK: Effective AWK Programming,” October 2023, [https://www.gnu.org/software/gawk/manual/gawk.pdf](https://www.gnu.org/software/gawk/manual/gawk.pdf).
<!--ID: 1706975891828-->
END%%
* `\uhhhh`: Introduced in C11 to represent Unicode code points. *Must* have exactly four hexadecimal characters specified with `0` leading padding if necessary.
* [[bash/index|Bash]] supports this sequence as `$'uhhhh'`. One to four hex digits is supported.
* [[gawk/index|gawk]] consolidates C's `\u` and `\U` sequence marker into just `\u`, capable of handling one to eight digits. Furthermore, `gawk` uses `\u` to designate the current locale's character set, *not* Unicode directly. Often times this is some Unicode-based locale though.
* [[lua/index|Lua]] consolidates C's `\u` and `\U` sequence markers into `\u{h...h}`, capable of handling one or more hexadecimal digits. The curly braces are required.
%%ANKI
Basic
What two ways are C escape sequences for unicode denoted?
Back: As `\uhhhh` or `\Uhhhhhhhh`.
Reference: Jens Gustedt, _Modern C_ (Shelter Island, NY: Manning Publications Co, 2020).
Tags: unicode
<!--ID: 1706975891832-->
END%%
%%ANKI
Basic
In C, `\u` allows specifying how many hexadecimal digits?
Back: Exactly four.
Reference: Jens Gustedt, _Modern C_ (Shelter Island, NY: Manning Publications Co, 2020).
Tags: unicode
<!--ID: 1706975891835-->
END%%
%%ANKI
Basic
In what standard were C's `\u` and `\U` escape sequences introduced?
Back: C11.
Reference: Jens Gustedt, _Modern C_ (Shelter Island, NY: Manning Publications Co, 2020).
Tags: unicode
<!--ID: 1706975891839-->
END%%
%%ANKI
Cloze
`\u` in C designates a character in {Unicode}. In `gawk` it designates a character in {the current locale's character set}.
Reference: Arnold D. Robbins, “GAWK: Effective AWK Programming,” October 2023, [https://www.gnu.org/software/gawk/manual/gawk.pdf](https://www.gnu.org/software/gawk/manual/gawk.pdf).
Tags: unicode gawk
<!--ID: 1706976541399-->
END%%
* `\Uhhhhhhhh`: Introduced in C11 to represent larger unicode code points. *Must* have exactly eight hexadecimal characters specified with `0` leading padding if necessary.
%%ANKI
Basic
In C, `\U` allows specifying how many hexadecimal digits?
Back: Exactly eight.
Reference: Jens Gustedt, _Modern C_ (Shelter Island, NY: Manning Publications Co, 2020).
Tags: unicode
<!--ID: 1706975891843-->
END%%
%%ANKI
Basic
Why does C have both `\u` and `\U`?
Back: `\U` accommodates for larger code point values.
Reference: Jens Gustedt, _Modern C_ (Shelter Island, NY: Manning Publications Co, 2020).
Tags: unicode
<!--ID: 1706976705750-->
END%%
## References
* Arnold D. Robbins, “GAWK: Effective AWK Programming,” October 2023, [https://www.gnu.org/software/gawk/manual/gawk.pdf](https://www.gnu.org/software/gawk/manual/gawk.pdf).
* Brian W. Kernighan and Dennis M. Ritchie, _The C Programming Language_, 2nd ed (Englewood Cliffs, N.J: Prentice Hall, 1988).
* Jens Gustedt, _Modern C_ (Shelter Island, NY: Manning Publications Co, 2020).
* Mendel Cooper, “Advanced Bash-Scripting Guide,” n.d., 916.
* Roberto Ierusalimschy, _Programming in Lua_, Fourth edition (Rio de Janeiro: Lua.org, 2016).

0
notes/c/index.md Normal file
View File

View File

@ -3,8 +3,6 @@ title: GAWK
TARGET DECK: Obsidian::STEM TARGET DECK: Obsidian::STEM
FILE TAGS: linux::cli gawk FILE TAGS: linux::cli gawk
tags: tags:
- linux
- cli
- gawk - gawk
--- ---
@ -198,39 +196,12 @@ Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 202
<!--ID: 1706883732944--> <!--ID: 1706883732944-->
END%% END%%
## Variables ## Exit Status
Variables are defined like `var=val`. They can be specified in two different places: On success, `gawk` exits with status code `EXIT_SUCCESS`. On failure, with status code `EXIT_FAILURE`. On fatal error, `gawk` exists with status code `2`. #c
1. Via the `-v` command line flag. Using this allows accessing the variable value from within a `BEGIN` rule. You can specify a custom exit status by using the `exit` statement from within the `awk` program.
2. In the file list. Using this allows accessing the variable value in all subsequent file processing.
%%ANKI
Basic
Where in an `awk` invocation can variables be assigned?
Back: As a `-v` argument or in the file list.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. [https://www.gnu.org/software/gawk/manual/gawk.pdf](https://www.gnu.org/software/gawk/manual/gawk.pdf)
<!--ID: 1706885111450-->
END%%
%%ANKI
Basic
The `-v` flag was introduced to accommodate what functionality?
Back: Accessing variables from a `BEGIN` rule.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. [https://www.gnu.org/software/gawk/manual/gawk.pdf](https://www.gnu.org/software/gawk/manual/gawk.pdf)
<!--ID: 1706885111454-->
END%%
%%ANKI
Basic
Describe what the following command does in in a single sentence:
```bash
$ awk 'program' pass=1 data pass=2 data
```
Back: Evaluates `'program'` against the `data` file twice with a different value of `pass` on each run.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. [https://www.gnu.org/software/gawk/manual/gawk.pdf](https://www.gnu.org/software/gawk/manual/gawk.pdf)
<!--ID: 1706885111457-->
END%%
## References ## References
* Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. [https://www.gnu.org/software/gawk/manual/gawk.pdf](https://www.gnu.org/software/gawk/manual/gawk.pdf) * Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. [https://www.gnu.org/software/gawk/manual/gawk.pdf](https://www.gnu.org/software/gawk/manual/gawk.pdf)

53
notes/gawk/variables.md Normal file
View File

@ -0,0 +1,53 @@
---
title: GAWK
TARGET DECK: Obsidian::STEM
FILE TAGS: linux::cli gawk
tags:
- gawk
---
## Variables
Variables are defined like `var=val`. They can be specified in two different places:
1. Via the `-v` command line flag. Using this allows accessing the variable value from within a `BEGIN` rule.
2. In the file list. Using this allows accessing the variable value in all subsequent file processing.
%%ANKI
Basic
Where in an `awk` invocation can variables be assigned?
Back: As a `-v` argument or in the file list.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. [https://www.gnu.org/software/gawk/manual/gawk.pdf](https://www.gnu.org/software/gawk/manual/gawk.pdf)
<!--ID: 1706885111450-->
END%%
%%ANKI
Basic
The `-v` flag was introduced to accommodate what functionality?
Back: Accessing variables from a `BEGIN` rule.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. [https://www.gnu.org/software/gawk/manual/gawk.pdf](https://www.gnu.org/software/gawk/manual/gawk.pdf)
<!--ID: 1706885111454-->
END%%
%%ANKI
Basic
Describe what the following command does in in a single sentence:
```bash
$ awk 'program' pass=1 data pass=2 data
```
Back: Evaluates `'program'` against the `data` file twice with a different value of `pass` on each run.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. [https://www.gnu.org/software/gawk/manual/gawk.pdf](https://www.gnu.org/software/gawk/manual/gawk.pdf)
<!--ID: 1706885111457-->
END%%
%%ANKI
Basic
How is `stdin` specified in `awk`'s file list?
Back: `-`
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. [https://www.gnu.org/software/gawk/manual/gawk.pdf](https://www.gnu.org/software/gawk/manual/gawk.pdf)
<!--ID: 1706973587236-->
END%%
## References
* Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. [https://www.gnu.org/software/gawk/manual/gawk.pdf](https://www.gnu.org/software/gawk/manual/gawk.pdf)

View File

@ -0,0 +1,13 @@
---
title: "2024-02-03"
---
- [x] Anki Flashcards
- [x] KoL
- [ ] Sheet Music (10 min.)
- [ ] OGS (1 Life & Death Problem)
- [ ] Korean (Read 1 Story)
- [ ] Interview Prep (1 Practice Problem)
- [ ] Log Work Hours (Max 3 hours)
* Spent time consolidating how different escape sequences behave across languages (`awk`, Bash, C, and Lua).

0
notes/lua/index.md Normal file
View File

View File

@ -90,7 +90,7 @@ callPackage = callPackageWith pkgs;
%%ANKI %%ANKI
Basic Basic
What two functions is `callPackage` implemented on top of? What two functions is `callPackage` implemented on top of?
Back: `callPackageWith` and `lib.makeOverridable`. Back: `callPackageWith` and `makeOverridable`.
Reference: Yin, Ryan. “NixOS and Flakes Book.” Nix, February 1, 2024. [https://github.com/ryan4yin/nixos-and-flakes-book](https://github.com/ryan4yin/nixos-and-flakes-book) Reference: Yin, Ryan. “NixOS and Flakes Book.” Nix, February 1, 2024. [https://github.com/ryan4yin/nixos-and-flakes-book](https://github.com/ryan4yin/nixos-and-flakes-book)
<!--ID: 1706828138592--> <!--ID: 1706828138592-->
END%% END%%
@ -98,7 +98,7 @@ END%%
%%ANKI %%ANKI
Basic Basic
What is the purpose of `callPackage`? What is the purpose of `callPackage`?
Back: It calls package functions with arguments automatic supplied if not overridden. Back: It calls package functions with arguments automatically supplied if not overridden.
Reference: Yin, Ryan. “NixOS and Flakes Book.” Nix, February 1, 2024. [https://github.com/ryan4yin/nixos-and-flakes-book](https://github.com/ryan4yin/nixos-and-flakes-book) Reference: Yin, Ryan. “NixOS and Flakes Book.” Nix, February 1, 2024. [https://github.com/ryan4yin/nixos-and-flakes-book](https://github.com/ryan4yin/nixos-and-flakes-book)
<!--ID: 1706828138594--> <!--ID: 1706828138594-->
END%% END%%