awk Cheatsheet

Built-in

  • $1: awk splits each input record (a line, by default) into fields on whitespace and stores them in $1, $2, and so on. $0 holds the whole record.
  • NF: Number of Fields.
  • NR: Number of Records read so far, across all input files.
  • FNR: Number of Records relative to the current input file.
  • FS: Field Separator. Set it to a single character or a regular expression to replace the default whitespace splitting.
  • OFS: Output Field Separator. A , between print arguments is replaced with OFS in the output.
  • RS: Record Separator.
  • ORS: Output Record Separator.
  • FILENAME: Name of the current input file.
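
The variables above can be seen side by side in a small sketch. The file names one.txt and two.txt and their contents are made up for illustration.

```shell
# Two small sample files (hypothetical names) in a temporary directory.
tmp=$(mktemp -d)
printf 'a b c\nd e\n' > "$tmp/one.txt"
printf 'f\n' > "$tmp/two.txt"

# For every record print: file name, NR (running total),
# FNR (count within the current file), NF (fields in this record).
builtin_out=$(cd "$tmp" && awk '{print FILENAME, NR, FNR, NF}' one.txt two.txt)
printf '%s\n' "$builtin_out"
rm -rf "$tmp"
```

NR keeps counting when awk moves on to two.txt, while FNR restarts at 1.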

Count Columns

If the delimiter is ,

$ cat foo.txt | awk -F, '{print NF}'

or

$ awk 'BEGIN {FS=","} {print NF}' foo.txt

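If the delimiter is \u0007 (BEL). Type it as ctrl-v ctrl-g; the ^G shown below is that single control character, not a caret followed by G.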

$ cat foo.txt | awk -F'^G' '{print NF}'
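
If typing the control character is awkward, one possible alternative is to build it with printf and pass it through -v. This is only a sketch; the variable names bel and bel_count and the sample line are assumptions.

```shell
# Produce the BEL character (octal 007) without typing it literally.
bel=$(printf '\007')

# Sample line with three BEL-separated fields; count them with NF.
bel_count=$(printf 'a\007b\007c\n' | awk -v FS="$bel" '{print NF}')
echo "$bel_count"
```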

Get Column Number

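Replace <pattern> with the column name or a pattern that matches it. The header row is split on | here, so set RS to whatever delimiter the file actually uses.

head -n 1 foo.csv | awk -v RS="|" '/<pattern>/{print NR;}'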
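
A worked sketch of the same idea, using a made-up header row instead of foo.csv (the column names and the variable names header and col are assumptions):

```shell
# Hypothetical header row of a pipe-delimited file.
header='id|name|email|city'

# With RS="|" each column name becomes its own record,
# so NR is the position of the matching column.
col=$(printf '%s\n' "$header" | awk -v RS='|' '/^email$/ {print NR}')
echo "$col"
```

The last column's record still carries the trailing newline of the header line, so an anchored pattern such as /^city$/ would not match it; an unanchored /city/ would.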

Print Rows by Number

Print the second row:

$ awk 'NR==2' filename

Print lines 2 to 10:

$ awk 'NR==2,NR==10' filename
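
A small sketch of the range form on made-up input, next to an equivalent comparison on NR (the sample rows and variable names are assumptions):

```shell
# Range pattern: start at record 2, stop after record 4.
rows=$(printf 'r1\nr2\nr3\nr4\nr5\n' | awk 'NR==2,NR==4')

# The same selection written as a plain condition on NR.
same=$(printf 'r1\nr2\nr3\nr4\nr5\n' | awk 'NR>=2 && NR<=4')

printf '%s\n' "$rows"
```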

FS

FS can be set in either of these ways:

  • Using the -F command line option: awk -F 'FS' 'commands' inputfilename
  • Assigning FS like a normal variable, usually in a BEGIN block: awk 'BEGIN{FS="FS";}'
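
Because FS may also be a regular expression, one separator setting can cover several delimiters. A minimal sketch, assuming a made-up line whose fields are separated by commas or semicolons with optional spaces around them:

```shell
# FS is a regex here: optional spaces, then , or ;, then optional spaces.
fs_out=$(printf 'a, b ;c\n' | awk -F' *[,;] *' '{print NF ": " $2}')
echo "$fs_out"
```

This prints the field count and the second field with the surrounding spaces already stripped by the separator.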

Example that reads /etc/passwd, which uses : as the field delimiter:

$ cat etc_passwd.awk
BEGIN{
    FS=":";
    print "Name\tUserID\tGroupID\tHomeDirectory";
}
{
    print $1"\t"$3"\t"$4"\t"$6;
}
END {
    print NR,"Records Processed";
}

Then

$ awk -f etc_passwd.awk /etc/passwd
Name UserID GroupID HomeDirectory
gnats 41 41 /var/lib/gnats
libuuid 100 101 /var/lib/libuuid
syslog 101 102 /home/syslog
hplip 103 7 /var/run/hplip
avahi 105 111 /var/run/avahi-daemon
saned 110 116 /home/saned
pulse 111 117 /var/run/pulse
gdm 112 119 /var/lib/gdm
8 Records Processed

OFS

The output counterpart of FS: the string printed between fields.

# without OFS set, fields are joined with a single space
$ awk -F':' '{print $3,$4;}' /etc/passwd
41 41
100 101
101 102

# with OFS set to `=`
$ awk -F':' 'BEGIN{OFS="=";} {print $3,$4;}' /etc/passwd
41=41
100=101
101=102
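
OFS only comes into play in certain places, which the following sketch tries to show on a made-up passwd-style line (the line and the variable names are assumptions):

```shell
line='root:x:0:0'

# Comma-separated print arguments are joined with OFS.
with_comma=$(echo "$line" | awk -F: 'BEGIN{OFS="="} {print $3, $4}')

# Adjacent arguments without a comma are concatenated; OFS is not used.
no_comma=$(echo "$line" | awk -F: 'BEGIN{OFS="="} {print $3 $4}')

# Assigning to any field rebuilds $0 with OFS between all fields.
rebuilt=$(echo "$line" | awk -F: 'BEGIN{OFS="="} {$1=$1; print}')

printf '%s\n' "$with_comma" "$no_comma" "$rebuilt"
```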

RS

awk reads a line as a "record" by default. To split the text into records differently, set RS. E.g. if input.txt is like this:

A
1
101

B
2
202

C
3
303

Set RS to two consecutive newline characters (\n\n) so the input text is split into 3 records. A multi-character RS works in gawk and mawk, but POSIX leaves the behavior unspecified:

$ cat script.awk
BEGIN {
    RS="\n\n";
    FS="\n";
}
{
    print $1,$2;
}

$ awk -f script.awk input.txt
A 1
B 2
C 3
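
A portable variant of the same split is paragraph mode: setting RS to the empty string makes awk treat runs of blank lines as the record separator. The sketch below feeds the same sample data through a pipe instead of input.txt; the variable name para_out is an assumption.

```shell
# RS="" enables paragraph mode: blank lines separate records.
# FS="\n" makes each line of a paragraph its own field.
para_out=$(printf 'A\n1\n101\n\nB\n2\n202\n\nC\n3\n303\n' |
  awk 'BEGIN { RS=""; FS="\n" } { print $1, $2 }')
printf '%s\n' "$para_out"
```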

ORS

The output equivalent of RS. Each record is printed followed by ORS instead of the default newline, so the output below does not end with a newline.

$ printf 'A 1\nB 2\nC 3\n' | awk 'BEGIN{ORS="="}{print;}'
A 1=B 2=C 3=

Extract a Column

$ cat file | awk '{print $2}'

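Use awk in an Alias

To alias awk '{print $1}', escape the $ inside the double quotes so the shell does not expand $1 when the alias runs:

alias aprint='awk "{print \$1}"'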

Pattern Matching

# Find lines matching the pattern.
... | awk '/pattern/'

# Only print the first column.
... | awk '/pattern/ {print $1}'
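
A bare /pattern/ is tested against the whole record. To restrict the match to one field, the ~ operator can be used instead. A small sketch on made-up input (the sample lines and variable names are assumptions):

```shell
input='alice admin
bob user
admin guest'

# /admin/ matches anywhere in the line: first and third lines.
any_field=$(printf '%s\n' "$input" | awk '/admin/ {print $1}')

# $2 ~ /admin/ matches only when the second field contains "admin".
second_field=$(printf '%s\n' "$input" | awk '$2 ~ /admin/ {print $1}')

printf '%s\n' "$any_field" "$second_field"
```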