awk Cheatsheet
Built-in Variables
- $1, $2, ...: awk reads each line of input, splits it into fields on whitespace by default, and assigns the fields to $1, $2, and so on.
- NF: Number of fields in the current record.
- NR: Number of records read so far.
- FNR: Number of records read so far from the current input file.
- FS: Field separator. Any single character or regular expression, used instead of the default whitespace.
- OFS: Output field separator. A `,` between print arguments in the script is replaced with OFS in the output.
- RS: Record separator.
- ORS: Output record separator.
- FILENAME: Name of the current input file.
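As a small illustration of how NR, FNR and FILENAME differ, the sketch below runs awk over two throwaway files. The file names one.txt and two.txt and the temporary directory are assumptions made up for this example.

```shell
# Create two small sample files in a temporary directory (hypothetical names).
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"
printf 'c\n' > "$tmpdir/two.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
nr_fnr=$(cd "$tmpdir" && awk '{ print FILENAME, NR, FNR }' one.txt two.txt)
printf '%s\n' "$nr_fnr"
# one.txt 1 1
# one.txt 2 2
# two.txt 3 1

rm -rf "$tmpdir"
```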
Count Columns
If the delimiter is `,`:
$ cat foo.txt | awk -F, '{print NF}'
or
$ awk 'BEGIN {FS=","} {print NF}' foo.txt
If the delimiter is \u0007 (the BEL character, typed as ctrl-v ctrl-g):
$ cat foo.txt | awk -F'^G' '{print NF}'
Get Column Number
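Replace <pattern> with the column name or a pattern that matches it. Setting RS to the delimiter (here `|`) turns each header column into its own record, so NR is the column number:
$ head -n 1 foo.csv | awk -v RS="|" '/<pattern>/{print NR;}'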
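An alternative sketch keeps the default record separator and loops over the fields of the header line instead. One reason to consider it: with RS set to the delimiter, the last record still carries the trailing newline, which matters for exact comparisons. The header contents and the `|` delimiter below are assumptions for illustration.

```shell
# Hypothetical header line; "|" is assumed to be the delimiter.
header='id|name|price'

# Walk the fields of the header and print the index of each matching one.
col=$(printf '%s\n' "$header" |
  awk -F'|' '{ for (i = 1; i <= NF; i++) if ($i ~ /price/) print i }')
printf '%s\n' "$col"
# 3
```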
Print Rows by Number
Print the second row:
$ awk 'NR==2' filename
Print lines 2 through 10:
$ awk 'NR==2,NR==10' filename
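The comma form is a range pattern: it starts matching at the first condition and stops after the second. For plain line numbers, a single comparison expression selects the same rows, as this sketch on made-up input suggests.

```shell
# Five sample lines (a..e), one per line.
range=$(printf '%s\n' a b c d e | awk 'NR==2,NR==4')
compare=$(printf '%s\n' a b c d e | awk 'NR >= 2 && NR <= 4')

printf '%s\n' "$range"
# b
# c
# d
```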
FS
FS can be set in either of these ways:
- Using the -F command line option:
awk -F 'FS' 'commands' inputfilename
- Assigning FS like a normal variable, usually in a BEGIN block so that it takes effect before the first line is read:
awk 'BEGIN{FS="FS";}'
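Since FS may be a regular expression, one separator setting can cover several delimiters. The sketch below splits on either `,` or `;`, once with -F and once with a -v assignment. The input line is invented for the example.

```shell
# Hypothetical input line that mixes two delimiters.
line='a,b;c'

# A bracket expression as FS splits on either character.
with_flag=$(printf '%s\n' "$line" | awk -F'[,;]' '{ print NF, $3 }')
with_var=$(printf '%s\n' "$line" | awk -v 'FS=[,;]' '{ print NF, $3 }')

printf '%s\n' "$with_flag"
# 3 c
```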
Example that reads the /etc/passwd file, which uses : as the field delimiter:
$ cat etc_passwd.awk
BEGIN{
FS=":";
print "Name\tUserID\tGroupID\tHomeDirectory";
}
{
print $1"\t"$3"\t"$4"\t"$6;
}
END {
print NR,"Records Processed";
}
Then run it:
$ awk -f etc_passwd.awk /etc/passwd
Name UserID GroupID HomeDirectory
gnats 41 41 /var/lib/gnats
libuuid 100 101 /var/lib/libuuid
syslog 101 102 /home/syslog
hplip 103 7 /var/run/hplip
avahi 105 111 /var/run/avahi-daemon
saned 110 116 /home/saned
pulse 111 117 /var/run/pulse
gdm 112 119 /var/lib/gdm
8 Records Processed
OFS
Similar to FS, but for output.
# without setting OFS, fields are separated by a single space (the default)
$ awk -F':' '{print $3,$4;}' /etc/passwd
41 41
100 101
101 102
# with OFS set to `=`
$ awk -F':' 'BEGIN{OFS="=";} {print $3,$4;}' /etc/passwd
41=41
100=101
101=102
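One detail worth a sketch: OFS is applied when awk joins print arguments or rebuilds the record, not when $0 is printed untouched. Assigning a field to itself (`$1 = $1`) forces the rebuild. The sample line below is made up.

```shell
# Hypothetical colon-separated line.
line='gnats:41:41'

# $0 is printed as read, so the original ":" separators remain.
unchanged=$(printf '%s\n' "$line" | awk -F':' 'BEGIN { OFS = "=" } { print $0 }')

# Touching a field makes awk rebuild $0 using OFS.
rebuilt=$(printf '%s\n' "$line" | awk -F':' 'BEGIN { OFS = "=" } { $1 = $1; print $0 }')

printf '%s\n%s\n' "$unchanged" "$rebuilt"
# gnats:41:41
# gnats=41=41
```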
RS
awk reads each line as a "record" by default. To split the text into records differently, set RS. For example, if input.txt looks like this:
A
1
101

B
2
202

C
3
303
Set RS to two newline characters (\n\n) so the input text is split into 3 records. A multi-character RS is supported by gawk and mawk; POSIX leaves its behavior unspecified.
$ cat script.awk
BEGIN {
RS="\n\n";
FS="\n";
}
{
print $1,$2;
}
$ awk -f script.awk input.txt
A 1
B 2
C 3
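For blank-line-separated input, a portable variant is to set RS to the empty string, which puts awk into paragraph mode. The sketch below feeds the same three groups through a pipe instead of a file, assuming they are separated by blank lines.

```shell
# RS="" treats runs of blank lines as the record separator (paragraph mode).
records=$(printf 'A\n1\n101\n\nB\n2\n202\n\nC\n3\n303\n' |
  awk 'BEGIN { RS = ""; FS = "\n" } { print $1, $2 }')

printf '%s\n' "$records"
# A 1
# B 2
# C 3
```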
ORS
The output equivalent of RS. Records are printed with ORS as the separator instead of the default newline.
$ printf 'A 1\nB 2\nC 3\n' | awk 'BEGIN{ORS="="}{print;}'
A 1=B 2=C 3=
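Because ORS is written after every record, including the last one, the output ends with a trailing separator. If that is unwanted, one possible workaround (a different technique, using printf rather than ORS) is to print the separator before every record except the first:

```shell
# Emit "," before each record except the first, then a final newline.
joined=$(printf 'A 1\nB 2\nC 3\n' |
  awk '{ printf "%s%s", (NR > 1 ? "," : ""), $0 } END { print "" }')

printf '%s\n' "$joined"
# A 1,B 2,C 3
```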
Extract a Column
$ cat file | awk '{print $2}'
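Use awk in an Alias
To alias awk '{print $1}', wrap the awk program in double quotes and escape the $ so the shell does not expand $1 when the alias runs:
alias aprint='awk "{print \$1}"'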
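A shell function is a possible alternative that avoids the nested quoting and also works in scripts, where aliases are usually not expanded. The sketch reuses the name aprint; passing "$@" through so that file names can be given is an assumption about how it would be used.

```shell
# Function equivalent of the alias; extra arguments (file names) go to awk.
aprint() {
  awk '{ print $1 }' "$@"
}

first_cols=$(printf 'a b\nc d\n' | aprint)
printf '%s\n' "$first_cols"
# a
# c
```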
Pattern Matching
# Find lines matching the pattern.
... | awk '/pattern/'
# Only print the first column.
... | awk '/pattern/ {print $1}'
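A bare /pattern/ is tested against the whole line. To restrict the match to one column, the `~` operator can be applied to a field, as in this sketch on invented input:

```shell
# Print the first column of rows whose second column starts with "10".
matches=$(printf 'gnats 41\nsyslog 101\nhplip 103\n' |
  awk '$2 ~ /^10/ { print $1 }')

printf '%s\n' "$matches"
# syslog
# hplip
```

Using `!~` instead of `~` selects the rows that do not match.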