Problem: print the Nth record before or after a certain regular expression matches or, alternatively, print every record except the Nth before or after a certain regular expression matches. In the same way, print the N records before or after a certain match, or all records except the N before or after the match.
In the AllAboutGetline article, Ed Morton suggests the following solutions for the "after match" part:
i) Print the Nth record after some pattern:
awk 'c&&!--c;/pattern/{c=N}' file
ii) Print every record except the Nth record after some pattern:
awk 'c&&!--c{next}/pattern/{c=N}1' file
iii) Print the N records after some pattern:
awk 'c&&c--;/pattern/{c=N}' file
iv) Print every record except the N records after some pattern:
awk 'c&&c--{next}/pattern/{c=N}1' file
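As a quick sanity check, the four counter-based idioms can be tried on a small made-up input. This is only a sketch: the sample records, N=2 and the literal pattern MATCH are assumptions chosen for illustration. The two "every record except" variants are written here with a trailing 1, so that the records which are not skipped actually get printed.

```shell
# Hypothetical sample input; the pattern matches on record 3 only.
sample=$(printf '%s\n' one two MATCH three four five six)

# i) the Nth (here 2nd) record after the match
nth=$(printf '%s\n' "$sample" | awk -v N=2 'c&&!--c;/MATCH/{c=N}')

# ii) every record except the Nth record after the match
all_but_nth=$(printf '%s\n' "$sample" | awk -v N=2 'c&&!--c{next}/MATCH/{c=N}1')

# iii) the N records after the match
n_after=$(printf '%s\n' "$sample" | awk -v N=2 'c&&c--;/MATCH/{c=N}')

# iv) every record except the N records after the match
all_but_n=$(printf '%s\n' "$sample" | awk -v N=2 'c&&c--{next}/MATCH/{c=N}1')

printf 'i)\n%s\n'   "$nth"
printf 'ii)\n%s\n'  "$all_but_nth"
printf 'iii)\n%s\n' "$n_after"
printf 'iv)\n%s\n'  "$all_but_n"
```

With this input, i) prints only "four" and iii) prints "three" and "four"; ii) and iv) print the complementary sets of records.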
The problem with these approaches is that they use a single counter (c): if two matches are separated by fewer than N lines, the second match resets the counter before it has run out, and the result is not as expected.
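The counter problem can be reproduced with a made-up input in which two matches sit on consecutive records. The input, N=2 and the pattern MATCH are assumptions for illustration only.

```shell
# Hypothetical input: matches on records 1 and 2, i.e. closer together than N=2.
close=$(printf '%s\n' MATCH-1 MATCH-2 c d e)

# "Print the Nth record after some pattern", single-counter version.
counter_out=$(printf '%s\n' "$close" | awk -v N=2 'c&&!--c;/MATCH/{c=N}')

# One would expect "c" (2nd record after MATCH-1) and "d" (2nd after MATCH-2),
# but the second match sets c back to N before it reaches zero,
# so the record belonging to the first match is lost.
printf '%s\n' "$counter_out"
```

Here only "d" is printed; the record that is two lines after the first match ("c") is missed.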
An alternative approach is to use an array (hash) to store the numbers of the records to print or to skip:
i) Print the Nth record after some pattern:
awk '/pattern/{a[NR+N]}; NR in a' file
ii) Print every record except the Nth record after some pattern:
awk '/pattern/{a[NR+N]}; !(NR in a)' file
iii) Print the N records after some pattern, excluding the matching record:
awk '/pattern/{for(i=1;i<=N;i++)a[NR+i]}; NR in a' file
Use i=0 as the starting value in the for loop if you want to include the matching record itself.
iv) Print every record except the N records after some pattern:
awk '/pattern/{for(i=1;i<=N;i++)a[NR+i]}; !(NR in a)' file
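The array-based versions can be checked against an input where matches are closer together than N. As before, this is a sketch under assumptions: the sample records, N=2 and the pattern MATCH are invented for the example.

```shell
# Hypothetical input: matches on records 1 and 2, closer together than N=2.
close=$(printf '%s\n' MATCH-1 MATCH-2 c d e)

# i) the Nth record after each match: records 3 and 4
nth=$(printf '%s\n' "$close" | awk -v N=2 '/MATCH/{a[NR+N]}; NR in a')

# ii) every record except those
all_but_nth=$(printf '%s\n' "$close" | awk -v N=2 '/MATCH/{a[NR+N]}; !(NR in a)')

# iii) the N records after each match; MATCH-2 is printed too,
#      because it is itself the first record after MATCH-1
n_after=$(printf '%s\n' "$close" |
  awk -v N=2 '/MATCH/{for(i=1;i<=N;i++)a[NR+i]}; NR in a')

# iv) every record except the N records after each match
all_but_n=$(printf '%s\n' "$close" |
  awk -v N=2 '/MATCH/{for(i=1;i<=N;i++)a[NR+i]}; !(NR in a)')

printf 'i)\n%s\n'   "$nth"
printf 'ii)\n%s\n'  "$all_but_nth"
printf 'iii)\n%s\n' "$n_after"
printf 'iv)\n%s\n'  "$all_but_n"
```

Because every match records its own target line numbers in the array, a later match does not cancel an earlier one: i) prints both "c" and "d".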
To make the solutions generic, you can pass N as a variable, e.g.:
awk -v N=3 '/pattern/{a[NR+N]}; NR in a' file
You can also pass the pattern as a variable. However, a regex match like the one above can give wrong results if the pattern contains regular expression metacharacters that are meant literally; in that case index() can be used instead to do a plain string match (thanks prince_jammys from #awk):
awk -v N=3 -v pattern="f(oo)*bar" 'index($0,pattern){a[NR+N]}; NR in a' file
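The difference between the two kinds of match can be seen on a small made-up input. The sample records and N=1 are assumptions for illustration, and the `$0 ~ pattern` form is just one way a variable might be used as a dynamic regex; it is shown only for comparison with index().

```shell
# Hypothetical input: record 1 matches the pattern as a regex,
# record 3 contains the pattern as a literal string.
lines=$(printf '%s\n' 'fbar' 'x' 'f(oo)*bar' 'y')

# Dynamic regex match: "f(oo)*bar" means "f", zero or more "oo", then "bar",
# so it matches "fbar" but not the literal text "f(oo)*bar".
regex_out=$(printf '%s\n' "$lines" |
  awk -v N=1 -v pattern="f(oo)*bar" '$0 ~ pattern{a[NR+N]}; NR in a')

# Plain string match with index(): only the literal text matches.
index_out=$(printf '%s\n' "$lines" |
  awk -v N=1 -v pattern="f(oo)*bar" 'index($0,pattern){a[NR+N]}; NR in a')

printf 'regex: %s\n' "$regex_out"
printf 'index: %s\n' "$index_out"
```

With the regex match the record after "fbar" is printed ("x"); with index() the record after the literal "f(oo)*bar" is printed ("y"). Note that values assigned with -v undergo escape-sequence processing, so a pattern containing backslashes needs extra care.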
The above solutions were inspired by a snippet of code posted by radulouv on #awk (thanks) as a solution for i). The other snippets are based on the same idea.