Using awk

Introduction

awk is a small programming language for scanning text for patterns and processing the lines that match. It is named after its creators, Alfred Aho, Peter Weinberger, and Brian Kernighan. It is most often used to extract data from text and produce reports, usually in combination with other command-line utilities on Unix-like operating systems.

Basic Syntax

The basic syntax of an awk command is:

awk 'pattern { action }' input-file

Here, pattern selects which lines of the input file the rule applies to, and action specifies what to do with each selected line. If the pattern is omitted, the action is applied to every line. If the action is omitted, each matching line is printed.
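Both halves of a rule are optional, which the following sketch illustrates. The sample lines and the helper name sample are invented for this example, and printf stands in for an input file.

```shell
# Invented sample input; printf stands in for a real file.
sample() { printf 'alpha 1\nbeta 2\ngamma 3\n'; }

# Pattern only: the default action prints the whole matching line.
pattern_only=$(sample | awk '/beta/')

# Action only: with no pattern, the action runs for every line.
action_only=$(sample | awk '{ print $1 }')

echo "$pattern_only"
echo "$action_only"
```

The first command prints only the line "beta 2"; the second prints the first field of all three lines.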

Printing Lines

One of the simplest tasks you can perform with awk is printing lines from a file. The print statement with no arguments prints the current line.

awk '{ print }' file.txt

This command prints all lines from file.txt.

Pattern Matching

awk can match patterns using regular expressions. For example, to print only lines that contain the string "error", you can use:

awk '/error/ { print }' file.txt

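This command prints every line of file.txt that contains the string "error" anywhere in it. The match is case-sensitive and is not limited to whole words, so a line containing "terror" also matches.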
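A regular expression pattern is a plain substring search unless you narrow it. The sketch below compares three variants; the log lines and the helper name log are made up for illustration.

```shell
# Hypothetical log lines, invented for this example.
log() { printf 'INFO start\nerror disk full\nWARN terror alert\nERROR timeout\n'; }

# /error/ is a case-sensitive substring match, so "terror" matches too.
substring=$(log | awk '/error/')

# Lowercasing the line before matching makes the test case-insensitive.
any_case=$(log | awk 'tolower($0) ~ /error/')

# Anchoring the regex and applying it to $1 matches only lines
# whose first field is exactly "error".
first_field=$(log | awk '$1 ~ /^error$/')

echo "$substring"
echo "$any_case"
echo "$first_field"
```

The ~ operator applies a regular expression to a specific field or expression instead of the whole line.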

Field Processing

awk splits each line of input into fields. By default, fields are separated by runs of spaces or tabs. You can refer to the fields as $1, $2, and so on; $0 is the entire line.

For example, to print the first and third fields of each line, you can use:

awk '{ print $1, $3 }' file.txt
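To see default field splitting in action, here is a small sketch. The rows and the helper name rows are invented; one row uses several spaces and the other uses tabs.

```shell
# Invented sample rows: one separated by multiple spaces, one by tabs.
rows() { printf 'alice  eng   100\nbob\tops\t80\n'; }

# Runs of spaces or tabs count as a single separator by default.
first_third=$(rows | awk '{ print $1, $3 }')

# $NF refers to the last field, whatever the field count is.
last_field=$(rows | awk '{ print $NF }')

echo "$first_third"
echo "$last_field"
```

Both rows split into three fields, so the first command prints "alice 100" and "bob 80".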

Specifying Field Separators

You can specify a different field separator using the -F option. For example, to use a comma as the field separator:

awk -F ',' '{ print $1, $2 }' file.csv

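This command prints the first and second fields of each line from file.csv, splitting each line on commas. The two fields are printed with a space between them, because the output separator is a space by default. Splitting on a bare comma does not handle quoted fields that themselves contain commas.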
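The input separator and the output separator are independent. A minimal sketch, assuming simple CSV rows with no quoted fields (the data and the helper name csv are invented):

```shell
# Invented CSV rows with no quoted or embedded commas.
csv() { printf 'id,name,city\n1,Ada,London\n'; }

# The commas between print arguments are replaced by OFS, a space by default.
default_ofs=$(csv | awk -F ',' '{ print $1, $2 }')

# Setting OFS with -v keeps the output comma-separated.
comma_ofs=$(csv | awk -F ',' -v OFS=',' '{ print $1, $3 }')

echo "$default_ofs"
echo "$comma_ofs"
```

The first command prints "id name" and "1 Ada"; the second prints "id,city" and "1,London".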

Awk Variables

awk provides several built-in variables that can be useful:

  • NR: Number of the current record (line), counted from 1.
  • NF: Number of fields in the current record.
  • FS: Input field separator (whitespace by default).
  • OFS: Output field separator (a single space by default).

For example, to print the line number along with each line, you can use:

awk '{ print NR, $0 }' file.txt
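NR and NF can be combined in the same rule, and NR keeps its final value in an END rule. A short sketch with invented input (the helper name lines is hypothetical):

```shell
# Invented input: a three-field line followed by a two-field line.
lines() { printf 'a b c\nd e\n'; }

# NR is the running line number; NF is the field count of that line.
counts=$(lines | awk '{ print NR ": " NF " fields" }')

# In an END rule, NR holds the total number of lines read.
total=$(lines | awk 'END { print NR }')

echo "$counts"
echo "$total"
```

Placing strings and variables side by side, as in NR ": " NF, concatenates them without inserting OFS.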

Conditional Statements

awk supports conditional statements for more complex logic:

awk '{ if ($1 > 10) print $0 }' file.txt

This command prints every line whose first field is greater than 10. The comparison is numeric when the field looks like a number.
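A condition can also be used directly as the pattern, and if can take an else branch. The sketch below shows both forms on invented data (the helper name scores is hypothetical):

```shell
# Invented sample data: a number followed by a label.
scores() { printf '5 low\n12 mid\n30 high\n'; }

# Shorthand: a bare condition as the pattern; the default action prints the line.
over_ten=$(scores | awk '$1 > 10')

# if/else inside an action labels every line instead of filtering.
labelled=$(scores | awk '{
  if ($1 > 10)
    print $2, "big"
  else
    print $2, "small"
}')

echo "$over_ten"
echo "$labelled"
```

The pattern form is the more common idiom for simple filters; the if form is useful when each line needs a different action.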

Loops

awk also supports for, while, and do-while loops. For example, to print each field of every line on its own line, you can use:

awk '{ for (i = 1; i <= NF; i++) print $i }' file.txt
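Loops over 1 through NF are the usual way to process every field of a line. A sketch with invented numeric input (the helper name nums is hypothetical), showing a for loop and a while loop:

```shell
# Invented numeric input.
nums() { printf '1 2 3\n10 20\n'; }

# for loop: add up every field on the line.
sums=$(nums | awk '{
  total = 0
  for (i = 1; i <= NF; i++)
    total += $i
  print total
}')

# while loop: rebuild the line with its fields in reverse order.
reversed=$(nums | awk '{
  out = ""
  i = NF
  while (i >= 1) {
    out = (out == "" ? $i : out " " $i)
    i--
  }
  print out
}')

echo "$sums"
echo "$reversed"
```

The variables total and out are reset at the start of each action because awk variables persist from one line to the next.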

Built-in Functions

awk provides many built-in functions for string and numeric operations, such as length(), substr(), index(), and split(). For example, to print the length of each line:

awk '{ print length($0) }' file.txt
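The other functions named above work as follows. This is a small sketch on made-up strings; note that character positions in awk start at 1.

```shell
# length: number of characters in the line.
len=$(printf 'hello world\n' | awk '{ print length($0) }')

# substr(s, start, count): take 5 characters starting at position 1.
head5=$(printf 'hello world\n' | awk '{ print substr($0, 1, 5) }')

# index(s, t): 1-based position of t in s, or 0 if it is absent.
pos=$(printf 'hello world\n' | awk '{ print index($0, "world") }')

# split(s, arr, sep): fill arr with the pieces and return how many there are.
parts=$(printf '2024-01-15\n' | awk '{ n = split($0, d, "-"); print n, d[1], d[3] }')

echo "$len" "$head5" "$pos"
echo "$parts"
```

Arrays filled by split are also indexed from 1, so d[1] is the first piece.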

Writing Scripts

You can write awk scripts in a file and run them using the -f option. For example, create a file script.awk with the following content:

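# Runs once, before any input is read: split fields on commas.
BEGIN { FS = "," }
# Runs for every input line: print the first two fields.
{ print $1, $2 }
# Runs once, after the last line has been processed.
END { print "Done" }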

Then run the script with:

awk -f script.awk file.csv
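The whole workflow can be tried end to end as sketched below. It assumes the mktemp utility is available and uses temporary files in place of script.awk and file.csv; the CSV rows are invented.

```shell
# Temporary files stand in for script.awk and file.csv (assumes mktemp exists).
script=$(mktemp)
data=$(mktemp)

# Write the awk program to the script file.
cat > "$script" <<'EOF'
BEGIN { FS = "," }
{ print $1, $2 }
END { print "Done" }
EOF

# Invented CSV input.
printf 'a,b,c\nd,e,f\n' > "$data"

# Run the script against the data and capture its output.
script_out=$(awk -f "$script" "$data")
echo "$script_out"

# Clean up the temporary files.
rm -f "$script" "$data"
```

The output is "a b" and "d e", one per input line, followed by "Done" from the END rule.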

Conclusion

This tutorial covered the basics of using awk for text processing: pattern matching, field processing, built-in variables, conditionals, loops, built-in functions, and script files. These features are enough to handle many everyday text processing tasks on Unix-like systems.