Sometimes it is useful to compare two files. To do this in awk, the trick is to first load the data from the first file into an array.
Let's say for instance that we have a list of first names in file1, one per line:
John
Mary
and a file2 with complete names:
John Smith
Mark Fo
Mary Bar
We want to find the names in file2 whose first name appears in file1. This can be done in a compact manner like this:
awk 'FNR==NR {arr[$0];next} $1 in arr' file1 file2
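To try the command, the sample data can be recreated first. The following is only a sketch: the scratch directory from mktemp and the variable names dir and result are assumptions made for illustration, not part of the original example.

```shell
# Recreate the sample files in a scratch directory (path chosen by mktemp).
dir=$(mktemp -d)
printf 'John\nMary\n' > "$dir/file1"
printf 'John Smith\nMark Fo\nMary Bar\n' > "$dir/file2"

# Same one-liner as above; the output is captured so it can be inspected.
result=$(awk 'FNR==NR {arr[$0];next} $1 in arr' "$dir/file1" "$dir/file2")
printf '%s\n' "$result"
# prints:
# John Smith
# Mary Bar
```

"Mark Fo" is not printed because Mark is not listed in file1.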
Some explanations:
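- FNR == NR: NR is the total number of records read so far across all files, while FNR is the number of records read so far in the current file. The two are equal only while awk reads the first file; for the second file, NR is equal to the number of lines of file1 + FNR. (If file1 is empty, the test is also true while reading file2.)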
- arr[$0]: this is a classic technique. Merely referencing an array element creates it, so this adds an element indexed by the whole line, with an empty value. Once file1 has been read, the array has one element per first name.
- next: this skips to the next record, so that no more processing is done on the records of file1
- $1 in arr: because of the next, this is only reached for the records of file2. It tests whether $1 is an index of arr, i.e. a first name from file1. The pattern has no action, so when it is true the default action is executed and the line is printed.
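To see how the FNR == NR test separates the two files, it can help to print both counters for every record. This is a minimal sketch on the same sample data; the scratch directory and the variable names dir and trace are assumptions made for illustration.

```shell
# Recreate the sample files in a scratch directory (path chosen by mktemp).
dir=$(mktemp -d)
printf 'John\nMary\n' > "$dir/file1"
printf 'John Smith\nMark Fo\nMary Bar\n' > "$dir/file2"

# For each record print: file name, NR, FNR, and the value of (FNR == NR),
# which awk renders as 1 when true and 0 when false.
# The subshell cd keeps FILENAME short without changing the caller's directory.
trace=$(cd "$dir" && awk '{ print FILENAME, NR, FNR, (FNR == NR) }' file1 file2)
printf '%s\n' "$trace"
# prints:
# file1 1 1 1
# file1 2 2 1
# file2 3 1 0
# file2 4 2 0
# file2 5 3 0
```

The last column is 1 only for the records of file1, which is why the first block of the one-liner never runs on file2.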
Note: For this example, join(1) is a working alternative, provided both files are sorted on the join field.
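As a rough sketch of the join(1) approach on the same sample data (the scratch directory and the variable names dir and joined are assumptions made for illustration; the sample files happen to be sorted already):

```shell
# Recreate the sample files in a scratch directory (path chosen by mktemp).
dir=$(mktemp -d)
printf 'John\nMary\n' > "$dir/file1"
printf 'John Smith\nMark Fo\nMary Bar\n' > "$dir/file2"

# By default join matches on the first field of each file.
# LC_ALL=C pins the collation order that join expects the inputs to follow.
joined=$(LC_ALL=C join "$dir/file1" "$dir/file2")
printf '%s\n' "$joined"
# prints:
# John Smith
# Mary Bar
```

Unlike the awk version, this depends on the order of the input lines, so unsorted files would need to go through sort(1) first.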