BeforeAfterMatch

Problem: print the Nth record before or after a certain regular expression matches or, alternatively, print every record except the Nth before or after a certain regular expression matches. In the same way, print the N records before or after a certain match, or all records except the N before or after the match.

In the AllAboutGetline article, Ed Morton suggests the following solutions for the "after match" part:

i) Print the Nth record after some pattern:

awk 'c&&!--c;/pattern/{c=N}' file

ii) Print every record except the Nth record after some pattern:

awk 'c&&!--c{next}/pattern/{c=N}' file

iii) Print the N records after some pattern:

awk 'c&&c--;/pattern/{c=N}' file

iv) Print every record except the N records after some pattern:

awk 'c&&c--{next}/pattern/{c=N}' file

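The problem with these approaches is that they use a single counter (c), so if two matches are separated by fewer than N lines, the later match resets the counter and the result is not as expected. In i), for example, the Nth record after the earlier match is never printed.

An alternative approach could be to use an array (hash) to store the numbers of the records to print or skip: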
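The following is a small sketch of that failure, not part of the original solutions. The sample input, the pattern foo and N=3 are made up for illustration, and N is passed with -v so that the one-liner runs as written:

```shell
# Hypothetical sample input: "foo" matches on records 1 and 3,
# i.e. the two matches are fewer than N=3 records apart.
input=$(printf '%s\n' foo a foo b c d e)

# Solution i) with the single counter c.
# The match on record 3 resets c to N before it has reached zero.
counter_out=$(printf '%s\n' "$input" | awk -v N=3 'c&&!--c;/foo/{c=N}')

# Prints only "d" (3rd record after the second match);
# "b" (3rd record after the first match) is lost.
printf '%s\n' "$counter_out"
```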

i) Print the Nth record after some pattern:

awk '/pattern/{a[NR+N]}; NR in a' file

ii) Print every record except the Nth record after some pattern:

awk '/pattern/{a[NR+N]}; !(NR in a)' file

iii) Print the N records after some pattern, excluding the matching record:

awk '/pattern/{for(i=1;i<=N;i++)a[NR+i]}; NR in a' file

Use i=0 in the for definition if you want to include the matching record itself.

iv) Print every record except the N records after some pattern:

awk '/pattern/{for(i=1;i<=N;i++)a[NR+i]}; !(NR in a)' file
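As a rough check of the array-based variants, here is a sketch of i) and ii) run on input with two matches fewer than N records apart. The sample input, the pattern foo and N=3 are assumptions chosen for the demonstration:

```shell
# Hypothetical sample input: "foo" matches on records 1 and 3.
input=$(printf '%s\n' foo a foo b c d e)

# i) Nth record after each match: referencing a[NR+N] creates
# the elements a[4] and a[6], so both target records are remembered.
nth_out=$(printf '%s\n' "$input" | awk -v N=3 '/foo/{a[NR+N]}; NR in a')

# ii) Every record except those stored in a.
except_out=$(printf '%s\n' "$input" | awk -v N=3 '/foo/{a[NR+N]}; !(NR in a)')

printf '%s\n' "$nth_out"      # b and d, one per line
printf '%s\n' "$except_out"   # foo, a, foo, c, e, one per line
```

Because each match adds its own entry to the array, a later match does not disturb the record number stored by an earlier one.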

To make the solutions generic, you can pass N as a variable, e.g.:

awk -v N=3 '/pattern/{a[NR+N]}; NR in a' file

and you can also pass the pattern. However, a regex match like the ones above can give unexpected results if the pattern contains regular expression metacharacters that are meant literally, so in that case index() can be used instead (thanks prince_jammys from #awk):

awk -v N=3 -v pattern="f(oo)*bar" 'index($0,pattern){a[NR+N]}; NR in a' file
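To illustrate the difference, the sketch below runs a dynamic regex match ($0 ~ pattern) and the index() version on the same input. The sample records and N=1 are assumptions made for this example:

```shell
# Hypothetical sample input: record 1 matches "f(oo)*bar" as a regex,
# record 3 contains it as literal text.
input=$(printf '%s\n' 'fbar' 'x1' 'f(oo)*bar' 'x2')

# Regex match: (oo)* means "zero or more oo", so "fbar" matches
# and the literal text on record 3 does not.
regex_out=$(printf '%s\n' "$input" |
  awk -v N=1 -v pattern='f(oo)*bar' '$0 ~ pattern {a[NR+N]}; NR in a')

# index() looks for the string exactly as given, so only record 3 matches.
index_out=$(printf '%s\n' "$input" |
  awk -v N=1 -v pattern='f(oo)*bar' 'index($0,pattern){a[NR+N]}; NR in a')

printf '%s\n' "$regex_out"   # x1
printf '%s\n' "$index_out"   # x2
```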

The above solutions were inspired by a snippet of code posted by radulouv on #awk (thanks) as a solution for i). The other solutions are based on the same idea.