RS

Record Separator

The special variable RS holds the record separator, which determines how awk divides its input into records.

The default record separator is a newline character

The default record separator is a newline character, so by default each line of input is treated as a separate record. The following dataset, for example, contains 4 records:

Annie 3
Bobby 2
Charlie 4
Dave 3
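As a quick check of the default behaviour, the following sketch counts the records in the dataset above. The function names sample_data and count_records are made up for this illustration. NR holds the number of records read so far, so its value in the END block is the total record count.

```shell
#!/bin/sh
# Hypothetical helper: emits the four-line sample dataset shown above.
sample_data() {
  printf 'Annie 3\nBobby 2\nCharlie 4\nDave 3\n'
}

# RS is left at its default (newline), so every line is one record.
count_records() {
  sample_data | awk 'END { print NR }'
}

count_records   # prints 4
```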

The record separator can be changed by assignment or command line switch

The record separator can be changed by assignment like any other variable. This is often done in a BEGIN block, before any input is read:

BEGIN {
  # Change the record separator to an exclamation mark
  RS = "!"
}
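A minimal sketch of the assignment in use is shown here. The input string and the function name split_on_bang are assumptions chosen for the illustration. The input deliberately has no trailing newline, so the last record is exactly "three".

```shell
#!/bin/sh
# Hypothetical input: three records separated by exclamation marks.
split_on_bang() {
  printf 'one!two!three' | awk '
    BEGIN { RS = "!" }      # takes effect before the first record is read
    { print NR ": " $0 }    # print each record with its record number
  '
}

split_on_bang
# prints:
# 1: one
# 2: two
# 3: three
```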

The -v command line switch allows the record separator to be set at invocation time, overriding the default. Here we change the record separator to an empty string:

#!/bin/sh
awk -v RS='' '{ print $1,$3 }' /home/accounts/transdata.fil

Remember that the shell interprets quotes and other special characters before awk sees them, so values passed on the command line this way need to be quoted with care.
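The quoting concern can be sketched as follows, assuming the separator is held in a shell variable (the names sep and split_with_switch are invented for this example). The value is placed in double quotes so that the shell expands it as a single word, while the awk program stays in single quotes so that the shell leaves $0 alone.

```shell
#!/bin/sh
sep=';'   # hypothetical separator held in a shell variable

split_with_switch() {
  # "$sep" is expanded by the shell; the single-quoted program is passed to awk untouched.
  printf 'red;green;blue' | awk -v RS="$sep" '{ print NR ": " $0 }'
}

split_with_switch
# prints:
# 1: red
# 2: green
# 3: blue
```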

Changing the record separator in the middle of processing an input file

If the value of the record separator is changed in the middle of processing an input file, the new value will be used as the delimiter for subsequent records. Note that the record currently being processed and any previous records will not be affected by the change.
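The following sketch illustrates a change made part way through the input. The data and the function name change_midway are assumptions for the example. The first record is read with the default separator, then RS is switched to a semicolon, and the remaining input is split on the new value.

```shell
#!/bin/sh
# Hypothetical input: a newline-terminated header followed by semicolon-separated items.
change_midway() {
  printf 'header\nx;y;z' | awk '
    { print NR ": " $0 }      # the current record was already read using the old RS
    NR == 1 { RS = ";" }      # affects only the records read after this point
  '
}

change_midway
# prints:
# 1: header
# 2: x
# 3: y
# 4: z
```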

Multiline Records

The awk interpreter supports multiline records, which are enabled by setting the record separator to an empty string. When multiline records are being used, each line is treated as a field of data, and records are separated by one or more blank lines. An empty line is interpreted as the end of the record, and multiple consecutive blank lines are treated as a single record separator. Following a blank line, the next record does not begin until a nonempty line is read. The following example dataset contains two multiline records:

Annette Baxby
23 Luthton Road
London

Bobby Lewis
48 Dockside Row
Merseyside

Note that blank lines must be completely empty to be considered a record separator. Lines containing whitespace will be treated as part of a record. The end of the file will always be treated as the end of the final record. If the last record is not followed by a blank line, its final newline will be discarded.
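A minimal sketch of multiline records, using the address data above, is given here. The function names address_data and summarise are invented for the illustration, and FS is set to a newline as an extra assumption so that each whole line becomes one field. Two blank lines are placed between the records to show that a run of blank lines counts as a single separator.

```shell
#!/bin/sh
# Hypothetical helper: the two address records, separated by two blank lines.
address_data() {
  printf 'Annette Baxby\n23 Luthton Road\nLondon\n\n\nBobby Lewis\n48 Dockside Row\nMerseyside\n'
}

summarise() {
  address_data | awk '
    BEGIN { RS = ""; FS = "\n" }    # empty RS selects multiline records; one line per field
    { print NR ": " $1 ", " $3 " (" NF " fields)" }
  '
}

summarise
# prints:
# 1: Annette Baxby, London (3 fields)
# 2: Bobby Lewis, Merseyside (3 fields)
```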

The newline character will always act as a field separator when multiline records are being used. There is no way to prevent this behaviour, but it is possible to use the split function to extract fields as desired.
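The use of split can be sketched as below, leaving the field separator at its default so that both blanks and newlines divide fields. The function name second_lines and the array name lines are assumptions for this example. Each record therefore has six fields, while split recovers the three original lines.

```shell
#!/bin/sh
# Hypothetical example: recover whole lines from a multiline record with split.
second_lines() {
  printf 'Annette Baxby\n23 Luthton Road\nLondon\n\nBobby Lewis\n48 Dockside Row\nMerseyside\n' | awk '
    BEGIN { RS = "" }                 # multiline records, default field separator
    {
      n = split($0, lines, "\n")      # n is the number of lines in this record
      print NF " fields, " n " lines, line 2: " lines[2]
    }
  '
}

second_lines
# prints:
# 6 fields, 3 lines, line 2: 23 Luthton Road
# 6 fields, 3 lines, line 2: 48 Dockside Row
```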

Setting the record separator as a nul character

Note that in some implementations of awk, it may not be possible to set the record separator to a literal nul character, because nul is treated as a string terminator in the underlying C library. This causes the record separator to be interpreted as an empty string.
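One way to find out how a particular awk behaves is a small probe such as the following sketch. The function name count_nul_records is invented for this example, and the result depends on the implementation, so no fixed output is given. If the nul separator is honoured, the three items are counted as three records. A different count suggests that the separator was not taken as a literal nul.

```shell
#!/bin/sh
# Hypothetical probe: feed three nul-separated items to awk and count the records.
count_nul_records() {
  printf 'a\0b\0c' | awk 'BEGIN { RS = "\0" } END { print NR }'
}

count_nul_records   # 3 if nul is honoured as a separator; otherwise implementation dependent
```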