awk
awk / nawk / gawk
Syntax
awk '/Pattern1/ {Actions}
/Pattern2/ {Actions}' File1 File2
Pattern is usually a regular expression enclosed in /, but it can be any expression (for example NR % 2 == 0).
Actions are the statement(s) to be performed.
Several patterns and actions are allowed in awk.
File is the Input file. Usually you would use one input file but you can specify multiple files.
Single quotes are used around the program to prevent the shell from interpreting any special characters.
Awk reads the input files one line at a time.
For each line, awk tests the patterns in the order given and performs the corresponding action for every pattern that matches.
If no pattern matches, no action is performed.
In the above syntax, either the pattern or the action may be omitted, but not both.
If the pattern is omitted, awk performs the action for every line of the input.
If the action is omitted, awk prints every line that matches the pattern, which is the default action.
Empty braces with no statements do nothing; the default print is not performed.
Statements within Actions are separated by semicolons or newlines.
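The three cases above (pattern only, action only, pattern with empty braces) can be tried with a small sketch. The three-line input and the variable names are made up for illustration...

```shell
# Hypothetical three-line input used only for illustration.
input=$(printf 'apple 1\nbanana 2\ncherry 3\n')

# Pattern only: the default action prints each matching line.
pattern_only=$(printf '%s\n' "$input" | awk '/an/')

# Action only: the action runs for every line.
action_only=$(printf '%s\n' "$input" | awk '{ print $2 }')

# Pattern with empty braces: nothing is printed, not even the matching line.
empty_braces=$(printf '%s\n' "$input" | awk '/an/ {}')

echo "$pattern_only"     # banana 2
echo "$action_only"      # 1, 2 and 3 on separate lines
echo "[$empty_braces]"   # []
```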
You can also save your awk program to a file, which can make the syntax more readable. Run it like this...
awk -f program.awk File
Examples
Field Separators and substitutions
To set APXPRT to the first Port the current listener is listening on...
APXPRT=$(lsnrctl stat $LISTENER_NAME | grep PORT | head -1 | awk -F"PORT=" '{sub(/ .*/,"",$2);print $2}' | sed 's/)//g')
Explanation...
lsnrctl stat $LISTENER_NAME | grep PORT | head -1
yields
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=myhost)(PORT=1523)))
The subsequent awk command uses -F to set the field separator to PORT=, uses sub to remove everything from the first space onward in the second field, and prints that field. sed then removes the closing parentheses, giving the result...
1523
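The same pipeline can be tried without a running listener by feeding it the sample line directly. This is a sketch only: the second input line, with trailing text after a space, is a made-up case to show what sub removes...

```shell
# Sample line taken from the lsnrctl output shown in the explanation.
line='(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=myhost)(PORT=1523)))'
port=$(printf '%s\n' "$line" |
  awk -F'PORT=' '{ sub(/ .*/, "", $2); print $2 }' | sed 's/)//g')
echo "$port"    # 1523

# Hypothetical line with trailing text: sub() drops " trailing text".
line2='(ADDRESS=(PORT=1521)) trailing text'
port2=$(printf '%s\n' "$line2" |
  awk -F'PORT=' '{ sub(/ .*/, "", $2); print $2 }' | sed 's/)//g')
echo "$port2"   # 1521
```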
Print field using a field separator
Line in file to process...
one-two-three-four
Command we run...
cat myfile | awk -F '-' '{print $3}'
Result...
three
Print field using multiple field separators
Line in file to process...
<one><two><three><four>
Command we run...
cat myfile | awk -F '[>=<]' -v OFS='>' '{print $4}'
Result...
two
Real world example...
Line in file to process...
<property name="hibernate.connection.password">password</property>
Command we run...
cat confluence.cfg.xml | grep hibernate.connection.password | awk -F '[>=<]' -v OFS='>' '{print $4}'
Result...
password
Selecting all lines between two patterns
opatch lsinventory -all | awk '/List of Oracle Homes:/,/Installed Top-level Products/'
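A range pattern (two patterns separated by a comma) prints from the first matching line to the second, inclusive of both. A minimal sketch, with START and END standing in for the two marker lines of a made-up input...

```shell
# Hypothetical input; only the lines from START to END are printed.
between=$(printf 'a\nSTART\nb\nc\nEND\nd\n' | awk '/START/,/END/')
echo "$between"   # START, b, c and END on separate lines
```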
if.. then.. else
To compare two numbers and print something based on the result...
echo ${MYFSSZ} ${fre} | awk '{ if ($1 > $2) print "*** VG too small!"; else print "*** OK"; }'
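The same test can be run with literal values in place of the shell variables. The numbers here are made up; they were chosen so that a numeric comparison and a string comparison would give different answers, which shows that awk compares numeric-looking fields as numbers...

```shell
# 100 > 20 numerically (as strings, "100" would sort before "20").
big=$(echo 100 20 | awk '{ if ($1 > $2) print "*** VG too small!"; else print "*** OK"; }')
# 9 < 20 numerically (as strings, "9" would sort after "20").
small=$(echo 9 20 | awk '{ if ($1 > $2) print "*** VG too small!"; else print "*** OK"; }')
echo "$big"     # *** VG too small!
echo "$small"   # *** OK
```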
For more complex examples see:
https://www.thegeekstuff.com/2010/02/awk-conditional-statements/
While
To print a line containing 50 x's...
awk 'BEGIN { while (count++<50) string=string "x"; print string }'
https://www.thegeekstuff.com/2010/02/unix-awk-do-while-for-loops/
Do.. While
awk 'BEGIN{count=1;do print "This gets printed at least once";while(count!=1)}'
https://www.thegeekstuff.com/2010/02/unix-awk-do-while-for-loops/
For...
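A minimal sketch of a for loop, assuming the same task as the While example but with 5 x's. The variable names i, string and xs are illustrative only...

```shell
# for (initialisation; condition; increment) statement
xs=$(awk 'BEGIN { for (i = 1; i <= 5; i++) string = string "x"; print string }')
echo "$xs"   # xxxxx
```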
Break
This script prints "Iteration" ten times then stops...
awk 'BEGIN{x=1;while(1){print "Iteration";if ( x==10 ) break;x++;}}'
https://www.thegeekstuff.com/2010/02/unix-awk-do-while-for-loops/
Continue
This script counts from 1 to 10 but skips 5, so nine lines are printed...
awk 'BEGIN{x=1;while(x<=10){if(x==5){x++;continue;}print "Value of x",x;x++;}}'
https://www.thegeekstuff.com/2010/02/unix-awk-do-while-for-loops/
Exit
This script starts to print 10 lines but exits when x reaches 5, so only four lines are printed. Note that exit ends the whole awk program, not just the loop...
awk 'BEGIN{x=1;while(x<=10){if(x==5){exit;}print "Value of x",x;x++;}}'
https://www.thegeekstuff.com/2010/02/unix-awk-do-while-for-loops/
Pruning
Shutdown...
adrci exec=show alert -p "originating_timestamp > systimestamp-1 and message_group like '%ddl%'" -term | \
nawk 'c-->0;$0~s{if(b)for(c=b+1;c>1;c--)print r[(NR-c+1)%b];print;c=a}b{r[NR%b]=$0}' b=1 a=0 s="^ALTER DATABASE CLOSE"
Startup...
adrci exec=show alert -p "originating_timestamp > systimestamp-1 and message_group like '%ddl%'" -term | \
nawk 'c-->0;$0~s{if(b)for(c=b+1;c>1;c--)print r[(NR-c+1)%b];print;c=a}b{r[NR%b]=$0}' b=1 a=0 s="^Completed: ALTER DATABASE OPEN"
Explanation
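The program consists of three pattern/action pairs. None of the patterns is enclosed in / because they are expressions rather than regular expression literals.
c-->0 has no action, so the default action applies: the line is printed while c is greater than 0. c is decremented after the test.
$0~s runs its action when the line matches the regular expression held in variable s. The action prints the b saved lines before the match, prints the matching line, and sets c to a.
b (true when non-zero) saves the current line in the array r so that it can be printed later as a "before" line.
Variables b, a, and s are set outside the program, as arguments on the command line.
b is a variable to hold the number of lines to print before the matched line (s)
a is a variable to hold the number of lines to print after the matched line (s)
c-- is a post-decrement: the current value of c is used in the comparison, then c=c-1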
NR=ordinal number of the current record
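A small self-contained run of the same one-liner may make the behaviour easier to see. This sketch assumes a made-up five-line input, b=1 and a=1 (one line of context on either side), and plain awk in place of nawk...

```shell
# Hypothetical input: the line starting with MATCH plays the role of s.
ctx=$(printf 'one\ntwo\nMATCH\nfour\nfive\n' |
  awk 'c-->0;$0~s{if(b)for(c=b+1;c>1;c--)print r[(NR-c+1)%b];print;c=a}b{r[NR%b]=$0}' b=1 a=1 s='^MATCH')
echo "$ctx"   # two, MATCH and four on separate lines
```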
Counting fields in a file
awk -F ':' '{ total += NF }; END { print total }' /etc/passwd
Explanation
awk steps through an input file line by line
-F defines the field separator
total is a user variable to hold the count
NF is the number of fields in the current line
END is a special pattern whose action runs after all the input has been read
Counting lines where last field matches a pattern
awk -F ':' '$NF ~ /\/bin\/sh/ { n++ }; END { print n }' /etc/passwd
Explanation
awk steps through an input file line by line
-F defines the field separator
$NF is the field at the end of a line
~ is the regular expression "match" operator
Regular expressions should be enclosed in / (which means any other / need to be escaped by \)
n is a user variable to hold the count
END is a special pattern whose action runs after all the input has been read
Find line with the highest value for a field
awk -F ':' '$3 > maxuid { maxuid=$3; maxline=$0 }; END { print maxuid, maxline }' /etc/passwd
Explanation
awk steps through an input file line by line
-F defines the field separator
$3 is the third field on the current line (this is the field we want to find highest value for)
$0 is the entire current line
maxuid is a user variable to hold the highest value found so far
maxline is a user variable to hold the entire line containing the highest value found so far
if $3 is greater than maxuid then update maxuid and maxline with the values from the current line
END is a special pattern whose action runs after all the input has been read
Print even numbered lines
awk 'NR % 2 == 0' /etc/passwd
Explanation
awk steps through an input file line by line and prints the lines where the pattern is matched
NR is the line number
if the remainder after dividing NR by 2 equals 0 the pattern is matched otherwise the line is ignored
Find lines where fields match
awk -F ':' '$3==$4' passwd.txt
Explanation
awk steps through an input file line by line and prints the lines where the pattern is matched
-F defines the field separator
where field 3 equals field 4 the pattern is matched otherwise the line is ignored
Find lines where two conditions match
awk -F ':' '$3>=100 && $NF ~ /\/bin\/sh/' passwd.txt
Explanation
-F defines the field separator
where field 3 is greater than or equal to 100 and the last field on the line ($NF) matches (~) /bin/sh (/ escaped with \ as necessary), the line is printed
Find lines with empty fields
awk -F ':' '$5 == "" ' passwd.txt
Explanation
-F defines the field separator
where field 5 is the empty string ("") the line is printed
Remove blank lines
awk 'NF' filename
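This works because NF is 0 for a line with no fields, which awk treats as false, so only lines with at least one field are printed. With the default field separator, lines containing only spaces are dropped as well. A short sketch on made-up input...

```shell
# Line 2 is empty and line 3 holds only spaces; both have NF equal to 0.
nonblank=$(printf 'first\n\n   \nsecond\n' | awk 'NF')
echo "$nonblank"   # first and second on separate lines
```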
Number of lines in a file as a percentage of the number of lines in another file
echo "$(cat file1|wc -l)|$(cat file2|wc -l)" | awk 'BEGIN { FS = "|" } { $3 = $1 / $2 * 100 "%" } { print $3 }'
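To try this without existing files, a sketch using two temporary files. The directory, file contents and variable names are made up for illustration, and wc -l reads from redirected input rather than from cat...

```shell
# Create two throwaway files: 2 lines and 4 lines.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/file1"
printf 'a\nb\nc\nd\n' > "$tmpdir/file2"

# Same awk program as above: $1 and $2 are the two line counts.
pct=$(echo "$(wc -l < "$tmpdir/file1")|$(wc -l < "$tmpdir/file2")" |
  awk 'BEGIN { FS = "|" } { $3 = $1 / $2 * 100 "%" } { print $3 }')
echo "$pct"   # 50%

rm -rf "$tmpdir"
```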
Using a program file
Create a file called etc_passwd.awk...
BEGIN{
FS=":";
print "Name\tUserID\tGroupID\tHomeDirectory";
}
{
print $1"\t"$3"\t"$4"\t"$6;
}
END {
print NR,"Records Processed";
}
Run it using...
awk -f etc_passwd.awk /etc/passwd
Explanation
The field separator has been set within the script, i.e. FS=":"; is equivalent to using -F ":" on the command line.
\t prints a tab between the column headings in the first print command and between the fields in the second
The BEGIN action block is performed once before the file is processed (to give headings in this case)
The END action block is performed once after the file is processed (to give a record count in this case, using the NR Built-In variable).