---
title: GAWK
TARGET DECK: Obsidian::STEM
FILE TAGS: linux::cli gawk
tags:
  - gawk
---

Overview

%%ANKI Basic How was the name awk derived? Back: By taking the surname initials of its three original creators: Aho, Weinberger, and Kernighan.

END%%

%%ANKI Basic What does the term awk refer to? Back: Both the awk program and the awk language. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

Dark corners are basically fractal - no matter how much you illuminate, there's always a smaller but darker one. #quote

The above quote is attributed to Brian Kernighan, one of awk's original authors and a coauthor of the K&R C book.

%%ANKI Cloze Dark corners are basically {1:fractal} - {1:no matter how much you illuminate, there's always a smaller but darker one.} - Brian Kernighan Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

Setup

Robbins suggests running set +H at bash startup to disable C shell-style history expansion, which treats ! as a special character.
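A minimal sketch of what that startup line could look like. The choice of startup file (for example ~/.bashrc) and the BASH_VERSION guard are assumptions, not something the manual prescribes; the guard only keeps the snippet harmless if a non-bash shell happens to read it.

```shell
# Candidate snippet for a bash startup file (which file is an assumption).
# Only run the bash-specific option when the shell really is bash.
if [ -n "${BASH_VERSION:-}" ]; then
    set +H   # turn off history expansion, so "!" is no longer special
fi
```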

Structure

awk applies actions to lines matching specified patterns. In this way awk is said to be data-driven - we specify the lines awk should act on and awk is responsible for finding and acting on them. Instructions are provided via a program.

%%ANKI Basic What is the basic function of awk? Back: To apply actions to lines matching specified patterns. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

An awk program consists of rules, each made up of a pattern and an action. For example:

BEGIN { print "hello world" }
pattern { action }
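As a rough illustration of how rules combine, the sketch below runs three rules over a small made-up input (the names and numbers are hypothetical, as is the variable name rules_output). It assumes any POSIX awk is available as awk.

```shell
# Hypothetical sample input, invented for illustration.
data='Amelia 555-5553
Broderick 555-0542
Julie 555-6699'

# Three rules: BEGIN runs before any input is read, the /li/ rule runs
# for each matching line, and END runs after all input has been read.
rules_output=$(printf '%s\n' "$data" | awk '
BEGIN { print "matches:" }
/li/  { print $1 }
END   { print NR " lines read" }
')

printf '%s\n' "$rules_output"
# matches:
# Amelia
# Julie
# 3 lines read
```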

%%ANKI Basic An awk program consists of a series of what? Back: Rules. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic A rule found in an awk program consists of what two parts? Back: A pattern and an action. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic A standalone awk program usually has what shebang? Back: #!/bin/awk -f Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic Write the awk command that searches file mail-list for string li. Back:

$ awk '/li/' mail-list

Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic How is an awk rule without a pattern interpreted? Back: As applying the specified action for every input line. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic How is an awk rule without an action interpreted? Back: As printing every line matching the specified pattern. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic Describe what the following command does in a single sentence:

$ awk 'length($0) > 80' data

Back: Prints every line of data that is longer than 80 characters. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

awk reads in files in units called records. Each record is automatically broken up into chunks called fields. By default, a record corresponds to a single line. $0 would then refer to the entire line and $1 would refer to the first field of this line.
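A small sketch of records and fields in practice, using a made-up two-line input with irregular whitespace (the input and the variable name fields_output are assumptions for illustration). Each line is one record; NF is the number of fields in it.

```shell
# Two records: the second has leading spaces and a tab between its fields.
fields_output=$(printf 'alpha beta  gamma\n  one\ttwo\n' | awk '
{ print NR ": NF=" NF ", first=" $1 ", last=" $NF }
')

printf '%s\n' "$fields_output"
# 1: NF=3, first=alpha, last=gamma
# 2: NF=2, first=one, last=two
```

Runs of spaces and tabs count as a single separator, and leading whitespace does not create an empty first field.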

%%ANKI Basic In awk, what does a "record" refer to? Back: The unit in which input is read. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic What is the default record separator? Back: The newline character. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic In awk, what does a "field" refer to? Back: A specific part of a record. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic By default, fields are separated by what? Back: A sequence of spaces, tabs, and newlines. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic How does awk define whitespace? Back: As only spaces, tabs, and newlines. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic How are fields referenced? Back: Via the $ sign followed by the field number. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic What is $0 a placeholder for? Back: An entire record. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic What is $1 a placeholder for? Back: The first field of a record. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic How can you remove trailing fields of $0? Back: Assign a smaller value to NF. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic How do you typically recompute the value of $0? Back: $1 = $1 Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic Why does the following not output what you want?

$ ls -l | awk '{ OFS=":"; print $0 }'

Back: Changing OFS does not rebuild $0; until a field is assigned, $0 keeps the original input text and its whitespace separators. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic How can you update the following to behave correctly?

$ ls -l | awk '{ OFS=":"; print $0 }'

Back:

$ ls -l | awk '{ OFS=":"; $1 = $1; print $0 }'

Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%
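The rebuild behavior can be checked with a short sketch like the one below. The input string and variable names are made up; the third command reflects gawk's documented behavior of rebuilding $0 when NF is lowered, which other awk implementations may not share.

```shell
# Without a field assignment, $0 keeps its original text and OFS is ignored.
unchanged=$(echo 'a b c d' | awk '{ OFS = ":"; print $0 }')

# Assigning to a field rebuilds $0 with OFS between the fields.
rebuilt=$(echo 'a b c d' | awk '{ OFS = ":"; $1 = $1; print $0 }')

# Assigning a smaller value to NF drops trailing fields and rebuilds $0.
truncated=$(echo 'a b c d' | awk '{ OFS = ":"; NF = 2; print $0 }')

printf '%s\n' "$unchanged" "$rebuilt" "$truncated"
# a b c d
# a:b:c:d
# a:b
```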

%%ANKI Basic When is the behavior of the field reference operator (i.e. $) undefined? Back: When given a negative number. POSIX leaves this case undefined; gawk treats it as a fatal error. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic How is the BEGIN pattern interpreted? Back: Code associated with it executes before any input is read. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic How is the END pattern interpreted? Back: Code associated with it executes after all input has been read. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic Describe what the following command does in a single sentence:

$ awk 'NF > 0' data

Back: Prints every record in data with at least one field. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic Describe what the following command does in a single sentence:

$ awk 'END { print NR }' data

Back: Prints the number of records in data. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic Describe what the following command does in a single sentence:

$ awk 'NR % 2 == 0' data

Back: Prints every even-numbered record in data. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

awk is said to be a "line-oriented" language. Every rule's action must begin on the same line as its pattern, unless backslash continuation is used.
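A sketch of the two layouts that keep a pattern attached to its action, using a made-up two-line input (variable names are hypothetical). In the first, the opening brace sits on the pattern line; in the second, a backslash continues the pattern line onto the next.

```shell
# The opening brace is on the pattern line; the body may span several lines.
brace_output=$(printf 'keep 1\nskip 2\n' | awk '
/keep/ {
    print $2
}')

# With backslash continuation, the action can start on the following line.
continued_output=$(printf 'keep 1\nskip 2\n' | awk '
/keep/ \
{ print $2 }')

printf '%s\n' "$brace_output" "$continued_output"
# 1
# 1
```

Without the backslash, awk would read /keep/ and { print $2 } as two separate rules: one printing every matching line, and one printing the second field of every line.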

%%ANKI Basic When can a rule's pattern and action exist on different lines? Back: Only when using backslash continuation. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic What is the output of the following?

$ echo ' abc' | awk '{ print }'

Back: abc (with leading whitespace) Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic What is the output of the following?

$ echo ' abc' | awk ' { $1 = $1; print }'

Back: abc (without leading whitespace) Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic How is $0 rebuilt after assignment $1 = $1? Back: By joining the values of $1 through $NF with OFS between them. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

Exit Status

On success, gawk exits with the value of the C constant EXIT_SUCCESS (usually zero). If an error occurs, it exits with EXIT_FAILURE (usually one). On a fatal error, gawk exits with status code 2. #c

You can specify a custom exit status by using the exit statement from within the awk program.
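A minimal sketch of a custom exit status. The input, the status value 3, and the variable names found and status are arbitrary choices for illustration.

```shell
# Exit with status 3 if any line mentions ERROR, otherwise with 0.
# "|| status=$?" captures the code without tripping "set -e".
status=0
printf 'ok\nERROR disk full\n' |
    awk '/ERROR/ { found = 1 } END { exit (found ? 3 : 0) }' || status=$?

echo "custom exit status: $status"
# custom exit status: 3
```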

References