title | TARGET DECK | FILE TAGS | tags |
---|---|---|---|
Regular Expressions | Obsidian::STEM | linux::cli posix::awk regexp | |
Overview
Most `awk` patterns are regular expressions delimited with `/`. We can use `~` and `!~` to perform more complicated regexp filtering:
# Matches any line with `li` somewhere.
$ awk '/li/' data
$ awk '$0 ~ /li/' data
# Matches any line with `li` somewhere in the first field.
$ awk '$1 ~ /li/' data
`awk`'s regexp implementation is a superset of POSIX regular expressions.
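The overview mentions `!~` without showing it, so here is a minimal sketch of both operators side by side. The `sample` function and its three lines are hypothetical stand-ins for the `data` file, whose contents are not shown in this note.

```shell
# Hypothetical input standing in for `data`.
sample() { printf '%s\n' 'alice 555-1234' 'bob lima' 'carol 555-0000'; }

# `li` anywhere on the line: prints the alice and bob lines.
sample | awk '$0 ~ /li/'

# `li` in the first field only: prints the alice line.
sample | awk '$1 ~ /li/'

# First field does NOT contain `li`: prints the bob and carol lines.
sample | awk '$1 !~ /li/'
```

Note that `bob lima` matches the whole-line pattern but not the first-field pattern, because the `li` sits in the second field.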
%%ANKI
Basic
What is the result of the following?
$ echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
Back: <A>bcd
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
How is the following equivalently written using `~`?
$ awk '/li/' data
Back:
$ awk '$0 ~ /li/' data
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
What operator is used for regexp matching?
Back: ~
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
What operator is used for regexp non-matching?
Back: !~
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
How do we write a pattern where the second field matches regexp `/li/`?
Back:
$ awk '$2 ~ /li/' {...}
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Cloze
In `awk`, `/.../` is to a {regexp} constant whereas `"..."` is to a {string} constant.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
How are string constants processed differently from regexp constants?
Back: The string constant is scanned twice: once when `awk` reads the program and again when the string is used as a regexp.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
What term describes a regexp that isn't a regexp constant?
Back: A dynamic regexp.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
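As a rough illustration of a dynamic regexp, the right-hand side of `~` below is an ordinary string value rather than a `/.../` constant. The `sample` input and the variable name `re` are hypothetical and chosen only for this sketch.

```shell
# Hypothetical input for illustration.
sample() { printf '%s\n' 'alice 555-1234' 'bob lima' 'carol 555-0000'; }

# The regexp is supplied through a variable: prints the carol line.
sample | awk -v re='^c' '$1 ~ re'

# The regexp is built at run time by concatenation: prints the bob line.
sample | awk 'BEGIN { re = "^" "b" } $0 ~ re'
```

Because the pattern is just a string, it can come from the command line, from input, or from any expression.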
%%ANKI
Basic
How is `*` escaped in a regexp constant?
Back: /\*/
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
%%ANKI
Basic
How is `*` escaped in a string constant (dynamic regexp)?
Back: "\\*"
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
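The two escaping rules can be checked side by side. This is a small sketch with made-up input lines (`a*b` and `aab`). Both commands should print only the line containing a literal `*`.

```shell
# Regexp constant: a single backslash escapes `*`.
printf '%s\n' 'a*b' 'aab' | awk '/a\*b/'

# String constant used as a dynamic regexp: the backslash is doubled.
# The first scan turns "\\" into "\", and the regexp then sees `a\*b`.
printf '%s\n' 'a*b' 'aab' | awk '$0 ~ "a\\*b"'
```

The shell's single quotes pass the backslashes through unchanged, so the counts shown are exactly what `awk` receives.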
%%ANKI
Basic
Why is it recommended to avoid using `^` and `$` in `RS`?
Back: These anchors match the beginning and end of a string, not of a line.
Reference: Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf
END%%
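The string-versus-line behavior of the anchors can be seen without touching `RS` at all. The sketch below matches against a hypothetical two-line string `s`. It only demonstrates how the anchors behave and is not a record-splitting example.

```shell
awk 'BEGIN {
    s = "one\ntwo"          # a single string containing an embedded newline

    print (s ~ /^one/)      # 1: "one" starts the string
    print (s ~ /^two/)      # 0: "two" starts a line, but not the string
    print (s ~ /two$/)      # 1: "two" ends the string
    print (s ~ /one$/)      # 0: "one" ends a line, but not the string
}'
```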
References
- Robbins, Arnold D. “GAWK: Effective AWK Programming,” October 2023. https://www.gnu.org/software/gawk/manual/gawk.pdf