title | TARGET DECK | FILE TAGS | tags | |
---|---|---|---|---|
Variables | Obsidian::STEM | linux::cli posix::awk |
|
Overview
Variables are defined like `var=val`. They can be specified in two different places:
- Via the `-v` command line flag. Using this allows accessing the variable value from within a `BEGIN` rule.
- In the file list. Using this allows accessing the variable value in all subsequent file processing.
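The difference between the two assignment sites can be sketched as follows. This is a minimal example assuming a POSIX-compatible `awk` on the `PATH`; the temporary file and the variable name `pass` are illustrative only.

```shell
# Hypothetical input file holding a single record.
data=$(mktemp)
printf 'line\n' > "$data"

# -v assigns before the BEGIN rule runs, so BEGIN can see the value.
with_v=$(awk -v pass=1 'BEGIN { print "BEGIN sees [" pass "]" }')

# A file-list assignment takes effect only when awk reaches that argument,
# so BEGIN sees an empty value while the main rule sees the assigned one.
with_list=$(awk 'BEGIN { print "BEGIN sees [" pass "]" }
                 { print "rule sees [" pass "]" }' pass=1 "$data")

printf '%s\n' "$with_v" "$with_list"
rm -f "$data"
```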
%%ANKI
Basic
Where in an `awk` invocation can variables be assigned?
Back: As a `-v` argument or in the file list.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
The `-v` flag was introduced to accommodate what functionality?
Back: Accessing variables from a `BEGIN` rule.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
Describe what the following command does in a single sentence:
$ awk 'program' pass=1 data pass=2 data
Back: Evaluates `'program'` against the `data` file twice with a different value of `pass` on each run.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
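One possible use of this pattern is a two-pass program: the first pass gathers a total and the second pass uses it. The sketch below assumes a POSIX-compatible `awk`; the file contents and the variable name `total` are made up for illustration.

```shell
# Hypothetical two-record input file.
data=$(mktemp)
printf 'a\nb\n' > "$data"

# The same file is listed twice. pass=1 is assigned before the first read
# and pass=2 before the second, so each rule fires on only one pass.
two_pass=$(awk 'pass == 1 { total++ }
                pass == 2 { print $0 " of " total }' pass=1 "$data" pass=2 "$data")

printf '%s\n' "$two_pass"
rm -f "$data"
```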
%%ANKI
Basic
How is `stdin` specified in `awk`'s file list?
Back: `-`
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
Predefined Variables
There exist a number of useful predefined variables:
- `NR` (Number of Records)
  - The number of records read so far.
  - The count includes the current record.
- `FNR` (File Number of Records)
  - The number of records read so far from the current file.
  - The count includes the current record.
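The distinction between the two counters shows up once more than one file is read. This is a small sketch assuming a POSIX-compatible `awk`; both input files are hypothetical.

```shell
# Two hypothetical input files with two records each.
first=$(mktemp)
second=$(mktemp)
printf 'a\nb\n' > "$first"
printf 'c\nd\n' > "$second"

# NR keeps counting across files; FNR restarts at 1 for each new file.
counts=$(awk '{ print $0, NR, FNR }' "$first" "$second")

printf '%s\n' "$counts"
rm -f "$first" "$second"
```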
%%ANKI
Cloze
The {`NR`} variable specifies the {number of read input records}.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Cloze
The {`FNR`} variable specifies the {number of read input records for the current file}.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
- `RT` (Record Terminator)
  - The input text that matched `RS` and terminated the currently read record. (GNU only)
%%ANKI
Cloze
The {`RT`} variable contains the {input characters that matched `RS`}.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
Tags: gnu::awk
END%%
%%ANKI
Basic
Barring the final record, when is `RT` always equal to `RS`?
Back: When `RS` is a string containing a single character.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
Tags: gnu::awk
END%%
%%ANKI
Basic
Why is the `gawk` `RT` variable unnecessary in POSIX `awk`?
Back: Because POSIX `awk` does not permit setting `RS` to a regexp.
Tags: gnu::awk
END%%
- `RS` (Record Separator)
  - The separator used to distinguish records from one another.
  - Defaults to `"\n"`.

`RS == ??` | Description
---|---
`"\n"` | Records are separated by the newline character. This is the default.
any single character | Records are separated by each occurrence of the character. Multiple successive occurrences delimit empty records.
`""` | Records are separated by one or more blank lines. Leading/trailing newlines in a file are ignored. If `FS` is a single character, then `"\n"` also serves as a field separator.
regexp | Records are separated by occurrences of characters that match regexp. Leading/trailing matches delimit empty records. (GNU only)
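Two of the rows above can be illustrated with portable `awk`. This is a minimal sketch assuming a POSIX-compatible implementation; the input strings are invented for the example.

```shell
# Single-character RS: every ";" ends a record, so ";;" yields an empty record.
single=$(printf 'a;b;;c' | awk 'BEGIN { RS = ";" } { print NR ": [" $0 "]" }')

# RS = "": runs of blank lines separate records, and leading/trailing
# newlines in the input are ignored. Newlines inside a record still split
# fields under the default FS.
paragraphs=$(printf '\n\nx y\nz\n\n\n\nw\n\n' |
    awk 'BEGIN { RS = "" } { print "record " NR ": NF=" NF }')

printf '%s\n' "$single" "$paragraphs"
```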
%%ANKI
Cloze
The {`RS`} variable is used to change the {record separator}.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
What is the default value of `RS`?
Back: `"\n"`
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
How is `RS = "ab"` interpreted in POSIX `awk`?
Back: As if we had instead written `RS = "a"`.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
How is `RS = "ab"` interpreted in GNU `awk`?
Back: As a regexp matching the string `"ab"`.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
Tags: gnu::awk
END%%
%%ANKI
Cloze
If `RS` is a string with {more than one character}, it is treated as a {regexp}.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
Tags: gnu::awk
END%%
%%ANKI
Basic
What value of `RS` may `gawk` not process correctly?
Back: A regexp with an optional trailing part, e.g. `AB(XYZ)?`.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
Tags: gnu::awk
END%%
%%ANKI
Basic
What implementation detail inspires avoiding `RS = "\0"`?
Back: Most `awk` implementations store strings internally as C-style strings.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
What equivalent assignment do most `awk` implementations interpret `RS = "\0"` as?
Back: `RS = ""`
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
How is `RS = ""` interpreted?
Back: `""` indicates one or more blank lines should be treated as the record separator.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
What distinguishes the `RS` values `""` and `\n\n+`?
Back: When set to the former, `awk` strips leading/trailing newlines from the file.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
Tags: gnu::awk
END%%
%%ANKI
Basic
What distinguishes the `RS` values `""` and `"\n"`?
Back: The former separates on one or more blank lines, not just a newline character.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
What regexp is closest to mirroring `RS = ""` behavior?
Back: `\n\n+`
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Cloze
If `RS = ""` and `FS` is set to {a single character}, the {newline character} always acts as a field separator.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
- `NF` (Number of Fields)
  - The number of fields found in the current record.
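How `NF` interacts with fields past the end of the record can be sketched as follows, assuming a POSIX-compatible `awk`. `OFS` is set to `-` here only to make the empty field visible when the record is rebuilt.

```shell
fields=$(printf 'a b c\n' | awk 'BEGIN { OFS = "-" } {
    print NF                  # three fields were read
    print "[" $(NF + 1) "]"   # a field past NF reads as the empty string
    print $(NF + 1) + 0       # and as 0 in a numeric context
    $(NF + 2) = "test"        # NF is 3, so this assigns field 5
    print NF                  # NF grows to 5; field 4 is created empty
    print $0                  # the record is rebuilt using OFS
}')

printf '%s\n' "$fields"
```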
%%ANKI
Basic
What is the arithmetical value of `$(NF + 1)`?
Back: `0`
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
What is the printed value of `$(NF + 1)`?
Back: `""`
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
What value is `$(NF + 1)` given when we run `$(NF + 2) = "test"`?
Back: `""`
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Cloze
The {`NF`} variable specifies the {number of fields in the current record}.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
What two things does incrementing `NF` do?
Back: Creates the field and rebuilds the record.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
What two things does decrementing `NF` do?
Back: Throws away fields and rebuilds the record.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
- `FS` (Field Separator)
  - The separator used to distinguish fields from one another.

`FS == ??` | Description
---|---
`" "` | Fields are separated by runs of whitespace. Leading/trailing whitespace is ignored. This is the default.
any other single character | Fields are separated by each occurrence of the character. Multiple successive occurrences delimit empty fields, as do leading/trailing occurrences.
`"\n"` | Specific instance of the above row. It is used to treat the record as a single field (assuming newlines separate records).
regexp | Fields are separated by occurrences of characters that match regexp. Leading/trailing matches delimit empty fields.
`""` | Each individual character in the record becomes a separate field. (GNU only)
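The first, second, and fourth rows of this table can be compared side by side. The sketch assumes a POSIX-compatible `awk`; the sample records are invented.

```shell
# Default FS (" "): runs of whitespace separate fields, and leading/trailing
# whitespace is ignored.
default_fs=$(printf '  a   b  \n' | awk '{ print NF ": [" $1 "]" }')

# Regexp FS: the leading and trailing matches delimit empty fields.
regexp_fs=$(printf '  a   b  \n' | awk -F'[ ]+' '{ print NF ": [" $1 "]" }')

# Single-character FS: every occurrence separates, so adjacent, leading,
# and trailing commas all produce empty fields.
comma_fs=$(printf ',a,,b,\n' | awk -F, '{ print NF ": [" $2 "] [" $3 "]" }')

printf '%s\n' "$default_fs" "$regexp_fs" "$comma_fs"
```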
%%ANKI
Cloze
The {`FS`} variable is used to change the {field separator}.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Cloze
{`FS`} is to `awk` as {`IFS`} is to Bash.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
What is the default value of `FS`?
Back: `" "`
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
What value of `FS` is specially handled?
Back: `" "`
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
How is `FS = " "` interpreted?
Back: As a contiguous sequence of spaces, tabs, and newlines.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
What distinguishes the `FS` values `" "` and `[ \t\n]+`?
Back: When set to the former, `awk` strips leading/trailing whitespace from each record.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Cloze
Setting `FS` to {`""`} allows examining {each character of a record separately}.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
Tags: gnu::awk
END%%
%%ANKI
Basic
How is `FS = ""` interpreted in POSIX `awk`?
Back: The behavior is unspecified. Most implementations treat the entire record as a single field.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
How is `FS = ""` interpreted in GNU `awk`?
Back: Each individual character in the record becomes a separate field.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
Tags: gnu::awk
END%%
%%ANKI
Cloze
If `RS` has its default value, setting `FS` to {`"\n"`} treats the {record as a single field}.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
What value of `FS` ensures `$1 == $0`?
Back: `RS`
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
Why does `awk` support a CSV mode?
Back: Because CSV fields may contain commas and newlines.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
- `OFS` (Output Field Separator)
  - Specifies the field separator used on printing.
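A short sketch of where `OFS` takes effect, assuming a POSIX-compatible `awk`: it separates comma-delimited `print` arguments, and it is also used when the record is rebuilt after a field assignment. The `$1 = $1` assignment is a common idiom for forcing that rebuild.

```shell
rebuilt=$(printf 'a b c\n' | awk 'BEGIN { OFS = "-" } {
    print $1, $2   # the comma in print emits OFS between arguments
    $1 = $1        # assigning any field rebuilds the record with OFS
    print $0
}')

printf '%s\n' "$rebuilt"
```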
%%ANKI
Cloze
The {`OFS`} variable is used to change the {output field separator}.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
References
- Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf