AWK quick reference
January 16, 1998 (Update 05/22/98)
:: Abbreviations:
cmd = command (shell)
expr(s) = expression(s)
fmt = format
param(s) = parameter(s)
patt(s) = pattern(s)
stat(s) = statement(s)
var(s) = variable(s)
:: Command line:
awk [-Fs] <'prog'|-f progfile> [var=value] [file list]
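A minimal sketch of the command-line form above. The sample data and the variable name limit are made up for illustration; "-" as the file list means standard input.

```shell
# -F: sets the field separator to ":"; limit=150 is assigned before the input is read.
out=$(printf 'root:0\nalice:1000\nbob:100\n' |
  awk -F: '$2 >= limit { print $1 }' limit=150 -)
echo "$out"
```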
:: Programs:
patt { action }
function name(params) { stats }
:: Patterns:
BEGIN
END
expr
/regex/
patt && patt
patt || patt
!patt
(patt)
patt, patt <-- a range
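As a rough illustration of BEGIN, END and a range pattern, assuming made-up input lines:

```shell
out=$(printf 'a\nstart\nb\nc\nstop\nd\n' | awk '
  BEGIN           { print "begin" }        # runs before any input is read
  /start/, /stop/ { print NR ": " $0 }     # range: from a "start" line through the next "stop" line
  END             { print "lines: " NR }   # runs after all input
')
echo "$out"
```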
:: Actions:
break
continue
delete array[expr]
do stat while (expr)
exit [expr]
expr
if (expr) stat
[else stat]
input-output stat
for (expr; expr; expr) stat
for (var in array) stat
next
return [expr]
while (expr) stat
{ stats }
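A small sketch combining a few of the statements above (if, next, do ... while). The input numbers and variable names are arbitrary.

```shell
out=$(printf '3\n-1\n4\n-5\n' | awk '
  {
    if ($1 < 0) { neg++; next }   # next: skip the remaining statements for this record
    sum += $1
  }
  END {
    i = 0
    do { i++ } while (i < 3)      # the body of do runs at least once
    print sum, neg, i
  }
')
echo "$out"
```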
:: Input-output:
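close(expr) close pipe/file expr
getline sets $0,NF,NR,FNR
getline <file sets $0,NF
getline var sets var,NR,FNR
getline var <file sets var
print [exprs] print $0, or exprs separated by OFS
print exprs >file print to file
printf fmt,exprs format and print
printf fmt,exprs >file (idem) to file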
system(cmd) exec cmd, return status
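The following is a hedged sketch of writing a file, reading it back with getline, and calling system. It assumes the mktemp utility is available and uses the POSIX -v option (not listed in this card) to pass the file name into BEGIN. The names f, line, n and rc are illustrative.

```shell
tmp=$(mktemp)
out=$(awk -v f="$tmp" 'BEGIN {
  print "one" > f; print "two" > f     # ">" truncates on first open, later prints append
  close(f)                             # close before reading the file back
  while ((getline line < f) > 0) n++   # this form of getline sets only the variable line
  close(f)
  rc = system("exit 3")                # system returns the exit status of the command
  printf "%d lines, status %d\n", n, rc
}')
rm -f "$tmp"
echo "$out"
```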
:: Print format conversions:
%c ASCII char
%d decimal
%e [-]d.d*E[+-]dd
%f [-]d*.d*
%g e or f whichever is shorter
%o unsigned octal
%s string
%x unsigned hexadecimal
%% print a %, no argument converted
additional parameters:
- left justify expr
width pad field to this width; leading 0: pad w/zeros
.prec max string width or digits after decimal point
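A short sketch of the conversions and modifiers above; the brackets only make the padding visible.

```shell
out=$(awk 'BEGIN {
  printf "[%5d] [%-5d] [%05d]\n", 42, 42, 42                  # width, left justify, zero pad
  printf "[%.2f] [%e] [%.3s]\n", 3.14159, 1234.5, "abcdef"    # .prec on numbers and strings
  printf "[%o] [%x] [%c] [100%%]\n", 8, 255, 65               # octal, hex, char, literal %
}')
echo "$out"
```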
:: Built-in variables:
ARGC # of comm-line args
ARGV array of comm-line args [0..ARGC-1]
FILENAME name of curr input file
FNR number of the input rec in curr file
FS input field separator (blank)
NF number of fields in input rec
NR number of input recs read since the beginning
OFMT output fmt for numbers (%.6g)
OFS output field separator (blank)
ORS output rec separator (\n)
RLENGTH length of string matched by regex in match
RS input rec separator (\n)
RSTART beginning position of string matched by match (also is the value returned by match)
SUBSEP separator for array subscripts of form [i,j,...] (\034)
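As an illustration of NR, NF and OFS, a minimal sketch on made-up input:

```shell
out=$(printf 'a b c\nd e\n' | awk '
  BEGIN { OFS = "-" }
  { $1 = $1; print NR, NF, $0 }   # assigning to a field rebuilds $0 using OFS
')
echo "$out"
```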
:: Built-in string functions
In the following: s,t are strings, r a regex and i,n are integers.
An "&" in the replacement string s in sub and gsub is replaced by the matched string.
gsub(r,s,t) replace every match of r in t by s, return # of subst. If no t, default $0
index(s,t) position of t in s, 0 if not found
length(s) length of s
match(s,r) index of where s matches r, 0 if no match. RSTART and RLENGTH are set
split(s,a,fs) split s into array a on fs, return # of fields. If no fs then use FS
sprintf(fmt,exprs) format exprs
sub(r,s,t) like gsub, but only once
substr(s,i,n) return the n-char substring of s starting at i. If no n, return suffix of s starting at i
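A brief sketch exercising the string functions above on an arbitrary sample string; the variable names are illustrative.

```shell
out=$(awk 'BEGIN {
  s = "hello world"
  print index(s, "wor")                          # position of "wor" in s
  print match(s, /o+/), RSTART, RLENGTH          # match also sets RSTART and RLENGTH
  print substr(s, 7), substr(s, 1, 5)
  n = split("a:b:c", parts, ":"); print n, parts[2]
  t = s; c = gsub(/o/, "[&]", t); print c, t     # & stands for the matched text
  t = s; sub(/o/, "0", t); print t               # sub replaces only the first match
  print length(s), sprintf("%03d", 7)
}')
echo "$out"
```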
:: Built-in arithmetic functions
atan2(y,x) arctan of y/x in radians
cos(x) cos (angle in radians)
exp(x) e^x
int(x) truncate to integer
log(x) natural logarithm
rand() pseudo-random number in [0,1)
sin(x) sine (angle in radians)
sqrt(x) square root
srand(x) new seed for rand, time of day used if no x
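A minimal sketch of the arithmetic functions. Since the value of rand depends on the implementation, the example only checks that it falls in [0,1).

```shell
out=$(awk 'BEGIN {
  print int(3.9), int(-3.9)        # int truncates toward zero
  print sqrt(16), exp(0), log(1)
  printf "%.4f\n", atan2(0, -1)    # pi
  srand(1); r = rand()
  print (r >= 0 && r < 1)          # 1 when r is within range
}')
echo "$out"
```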
:: Expression operators (in increasing precedence)
= += -= *= /= %= ^= assignment
? : conditional operation
|| logical OR
&& logical AND
in array membership
~ !~ regex match, negated match
< <= > >= != == relationals
string concat, no explicit operator
+ - add, subtract
* / % multiply, divide, mod
+ - ! unary plus/minus, logical NOT
^ exponentiation
++ -- increment/decrement (pre/postfix)
$ field
All operators are left associative, except assignment, ?: and ^, which are right associative. Use parentheses to group and change evaluation order.
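A few expressions that sketch how the precedence and associativity rules above play out; the values are arbitrary.

```shell
out=$(awk 'BEGIN {
  print 2 ^ 3 ^ 2          # ^ is right associative: 2^(3^2)
  print -2 ^ 2             # ^ binds tighter than unary minus: -(2^2)
  print 1 + 2 " " 3 * 4    # arithmetic binds tighter than concatenation
  print 10 - 4 - 3         # left associative: (10-4)-3
  x = 5; print (x > 3 ? "big" : "small")
}')
echo "$out"
```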
:: Regular expressions:
The regex metacharacters are
\ ^ $ . [ ] | ( ) * + ?
summary of metacharacters and matching
c matches the nonmetacharacter c
\c matches the escape sequence or literal character c
^ beginning of a string
$ end of string
. a single character
[abc...] char class
[^abc...] negated char class
r1|r2 alternation: any r1 or r2
(r1)(r2) concatenation
(r)* zero or more of r
(r)+ one or more of r
(r)? zero or one of r
(r) grouping
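As a rough illustration of anchors, alternation, optional items, escaping and negated classes, on made-up input lines:

```shell
out=$(printf 'cat\ncart\ncoat\ndog\na.b\n' | awk '
  /^c(a|oa)t$/ { print "alt: " $0 }   # anchors, grouping, alternation
  /^ca?r?t$/   { print "opt: " $0 }   # ? is zero or one
  /\./         { print "dot: " $0 }   # \. matches a literal period
  $0 ~ /^[^c]/ { print "neg: " $0 }   # negated char class
')
echo "$out"
```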
:: Escape sequences:
\b backspace
\f formfeed
\n newline
\r carriage return
\t tab
\ddd octal value ddd (1-3 digits)
\c any other char literally e.g. \"
:: Limits: (for the "one true awk")
100 fields
3000 chars per input record
3000 chars per output record
1024 chars per field
3000 chars per printf string
400 chars max literal string
400 chars in char class
15 open files
1 pipe
double-precision floating point
:: Initialization, comparison & type coercion:
*Variables can potentially be a string or a number, or both at any time.
*Assignment sets its type: var = expr
*In comparisons, if both operands are numeric, the comparison is made numerically. Otherwise the operands are coerced to string, and the comparison is made on strings.
*Numeric coercion: expr + 0
*String coercion: expr ""
*Uninitialized vars have the numeric value 0 and the string value "", so if x is uninitialized:
(x) and (x=="0") are false
(!x), (x==0) and (x=="") are true
*The type of a field is determined by context when possible: $1++ coerces $1 to numeric, and $3 = $1 "," $2 coerces $1 and $2 to string
*Null fields: string ""
*Null array elements: string ""
*Mentioning variables causes them to exist with the values 0 and "", e.g.
if (arr[i] == "") is true for a new subscript i, because the reference creates arr[i]
if (i in arr) determines if arr[i] exists without creating it
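The coercion rules above can be sketched as follows. The names x, arr and the subscript "k" are illustrative and deliberately left uninitialized at first.

```shell
out=$(awk 'BEGIN {
  if (!x && x == 0 && x == "") print "uninitialized: 0 and empty"
  a = ("10" < "9"); b = (10 < 9)    # string comparison vs numeric comparison
  print a, b
  n = "3x" + 0; s = 3 ""            # force a number, force a string
  print n, (s == "3")
  before = ("k" in arr)             # membership test does not create arr["k"]
  if (arr["k"] == "") print "referenced"
  after = ("k" in arr)              # the reference above created the element
  print before, after
}')
echo "$out"
```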