notebook/notes/gawk/variables.md

9.8 KiB

title TARGET DECK FILE TAGS tags
Variables Obsidian::STEM linux::cli gawk
gawk

Overview

Variables are defined like var=val. They can be specified in two different places:

  1. Via the -v command line flag. Using this allows accessing the variable value from within a BEGIN rule.
  2. In the file list. Using this allows accessing the variable value in all subsequent file processing.

%%ANKI Basic Where in an awk invocation can variables be assigned? Back: As a -v argument or in the file list. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic The -v flag was introduced to accommodate what functionality? Back: Accessing variables from a BEGIN rule. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic Describe what the following command does in in a single sentence:

$ awk 'program' pass=1 data pass=2 data

Back: Evaluates 'program' against the data file twice with a different value of pass on each run. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic How is stdin specified in awk's file list? Back: - Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

Predefined Variables

There exists a number of useful predefined variables:

  • NR (Number of Records)
    • The 1-indexed number of records so far read.
    • The count includes the current record.
  • FNR (File Number of Records)
    • The 1-indexed number of records so far read from the current file.
    • The count includes the current record.

%%ANKI Cloze The {NR} variable specifies the {number of read input records}. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Cloze The {FNR} variable specifies the {number of read input records for the current file}. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

  • RS (Record Separator)
    • The separator used to distinguish records from one another.
  • RT (Record Text)
    • The matching separator used to distinguish the currently read record.

%%ANKI Cloze The {RS} variable is used to change the {record separator}. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Cloze If RS is a string with {more than one character}, it is treated as a {regexp}. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Cloze The {RT} variable matches the {input characters that matched RS}. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic Barring the final record, when is RT always equal to RS? Back: When RS is a string containing a single character. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic What value of RS may gawk not process correctly? Back: A regexp with optional trailing part, e.g. AB(XYZ)?. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic What implementation detail inspires avoiding RS = "\0"? Back: Most awk implementations store strings internally as C-style strings. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic What equivalent assignment do most awk implementations interpret RS = "\0" as? Back: RS = "" Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

  • NF (Number of Fields)
    • The 1-indexed number of fields found in the current record.
  • FS (Field Separator)
    • The separator used to distinguish fields from one another.
    • Defaults to " " which is a special character for runs of spaces, tabs, and newlines.

%%ANKI Basic What is the arithmetical value of ${NF + 1}? Back: 0 Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic What is the printed value of ${NF + 1}? Back: "" Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic What value is ${NF + 1} given when we run ${NF + 2} = "test"? Back: "" Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Cloze The {NF} variable specifies the {number of fields in the current record}. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Cloze The {FS} variable is used to change the {field separator}. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Cloze {FS} is to awk as {IFS} is to Bash. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf /END%%

%%ANKI Basic What is the default value of FS? Back: " " Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic What value of FS is specially handled? Back: " " Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic How is FS = " " interpreted? Back: As a contiguous sequence of spaces, tabs, and newlines. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic What distinguishes FS value " " and [ \t\n]+? Back: When set to the former, awk strips leading and trailing whitespace from each record. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Cloze Setting FS to {""} allows examining {each character of a record separately}. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Cloze Setting FS to {"\n"} treats the {record as the single field}. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic What value of FS ensures $1 = $0? Back: "\n" Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

%%ANKI Basic Why does awk support a CSV mode? Back: Because CSV fields may contain commas and newlines. Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf

END%%

References