BeforeAfterMatch

Problem: print the Nth record before or after a record that matches a certain regular expression or, alternatively, print every record except that Nth one. In the same way, print the N records before or after a match, or all records except the N before or after the match.

In the AllAboutGetline article, Ed Morton suggests the following solutions for the "after match" part:

i) Print the Nth record after some pattern:

awk 'c&&!--c;/pattern/{c=N}' file

ii) Print every record except the Nth record after some pattern:

awk 'c&&!--c{next}/pattern/{c=N}' file

iii) Print the N records after some pattern:

awk 'c&&c--;/pattern/{c=N}' file

iv) Print every record except the N records after some pattern:

awk 'c&&c--{next}/pattern/{c=N}' file
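The one-liners above are terse, so here is a sketch of how the counter idiom works, with i) and iii) written out in long form. The sample input, the pattern /match/ and N=2 are made up for illustration only.

```shell
# Hypothetical sample input: one matching record followed by three others.
input=$(printf '%s\n' a match b c d)

# i) terse form: print the 2nd record after the match.
terse=$(printf '%s\n' "$input" | awk 'c&&!--c;/match/{c=2}')

# i) the same logic spelled out.
expanded=$(printf '%s\n' "$input" | awk '
  c > 0 {                  # a countdown is in progress
      c--                  # one more record has gone by
      if (c == 0) print    # this is the Nth record after the match
  }
  /match/ { c = 2 }        # start (or restart) the countdown on a match
')

# iii) spelled out: print each record while the countdown is running.
n_after=$(printf '%s\n' "$input" | awk '
  c > 0 { print; c-- }     # print, then count down
  /match/ { c = 2 }
')

printf 'terse:    %s\n' "$terse"
printf 'expanded: %s\n' "$expanded"
printf 'iii):\n%s\n' "$n_after"
```

In both forms the counter rule comes before the /match/ rule, so the matching record itself is never counted.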

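The problem with these approaches is that they use a single counter (c), which is reset by every new match. If two matches are separated by fewer than N lines, the countdown started by the earlier match is lost, and the result is not as expected (in i), for example, the Nth record after the earlier match is never printed).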
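A small run makes the failure concrete. This is only a sketch: the input, the pattern /^m/ and N=3 are invented for the demonstration.

```shell
# Hypothetical input: two matching records (m1, m2) only one line apart.
input=$(printf '%s\n' m1 m2 x3 x4 x5 x6)

# Counter version of i) with N=3. The match on m2 resets c to 3 before the
# countdown started by m1 can reach zero.
counter_out=$(printf '%s\n' "$input" | awk 'c&&!--c;/^m/{c=3}')

printf '%s\n' "$counter_out"
```

Only x5 (the 3rd record after m2) is printed; x4, the 3rd record after m1, is missed.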

An alternative approach could be to use a hash to store the numbers of the records to print:

i) Print the Nth record after some pattern:

awk '/pattern/{a[NR+N]}; NR in a' file

ii) Print every record except the Nth record after some pattern:

awk '/pattern/{a[NR+N]}; !(NR in a)' file

iii) Print the N records after some pattern, excluding the matching record itself:

awk '/pattern/{for(i=1;i<=N;i++)a[NR+i]}; NR in a' file

Use i=0 in the for definition if you want to include the matching record itself.

iv) Print every record except the N records after some pattern:

awk '/pattern/{for(i=1;i<=N;i++)a[NR+i]}; !(NR in a)' file
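As a quick check of the array-based versions, the sketch below runs i) and ii) on input where two matches are closer together than N. The input, the pattern /^m/ and N=3 are illustrative assumptions.

```shell
# Hypothetical input: two matching records (m1, m2) only one line apart.
input=$(printf '%s\n' m1 m2 x3 x4 x5 x6)

# i) Each match records its own target record number (NR+3) in the array,
#    so a later match cannot cancel an earlier one.
nth_out=$(printf '%s\n' "$input" | awk '/^m/{a[NR+3]}; NR in a')

# ii) The complement: every record except those recorded in the array.
except_out=$(printf '%s\n' "$input" | awk '/^m/{a[NR+3]}; !(NR in a)')

printf 'i):\n%s\n' "$nth_out"
printf 'ii):\n%s\n' "$except_out"
```

Here i) prints x4 and x5, the 3rd record after each of the two matches, and ii) prints m1, m2, x3 and x6.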

To make the solutions generic, you can pass N as a variable, e.g.

awk -v N=3 '/pattern/{a[NR+N]}; NR in a' file

and you can also pass the pattern in a variable and match it with $0 ~ pattern. However, a regex match gives unexpected results if the pattern is meant as a literal string but contains regular expression metacharacters, so in that case index() can be used instead (thanks prince_jammys from #awk):

awk -v N=3 -v pattern="f(oo)*bar" 'index($0,pattern){a[NR+N]}; NR in a' file
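The difference between the two kinds of match can be seen in a short sketch. The input lines are made up; the same string is passed as pattern in both runs.

```shell
# Hypothetical input: line 1 contains the literal text f(oo)*bar,
# line 5 contains a string that the regex f(oo)*bar matches.
input=$(printf '%s\n' 'f(oo)*bar' l2 l3 l4 foobar l6 l7 l8)

# Regex match: the pattern is interpreted as a regular expression, so it
# matches "foobar" on line 5 but not the literal text on line 1.
regex_out=$(printf '%s\n' "$input" |
  awk -v N=3 -v pattern="f(oo)*bar" '$0 ~ pattern{a[NR+N]}; NR in a')

# index(): the pattern is treated as a plain substring, so it matches
# line 1 only.
literal_out=$(printf '%s\n' "$input" |
  awk -v N=3 -v pattern="f(oo)*bar" 'index($0,pattern){a[NR+N]}; NR in a')

printf 'regex:   %s\n' "$regex_out"
printf 'literal: %s\n' "$literal_out"
```

The regex run prints l8 (3 records after foobar), while the index() run prints l4 (3 records after the literal line).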

The above solutions were inspired by a snippet of code posted by radulouv on #awk (thanks) as a solution for i). The other snippets are based on the same idea.