FindAllIndices

Sometimes it is useful to find the index of every occurrence of a given character in a string.

Let's say, for example, that you're trying to find doublequoted strings which may contain backslash-escaped quote characters. One easy way of doing this is:

  1. find the index of every doublequote character;
  2. perform a values-to-keys inversion on the preceding array so that you then have a list of positions that point to sequence numbers;
  3. find the index of every backslash character;
  4. for each backslash, if the position right after it (its index + 1) is among the doublequote positions from step 2, mark that doublequote for skipping;
  5. iterate through the list of all doublequotes, skipping the marked ones. Counting only the ones that remain, each odd-numbered doublequote (first, third, ...) marks the beginning of a doublequoted string and each even-numbered one marks its end.
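
# Store the position of every occurrence of chr in str as arr[1], arr[2], ...
# and return how many were found. i and j are local variables.
function findallindices(str, chr, arr,    i, j) {
    j = 0
    for (i = 1; i <= length(str); i++)
        if (substr(str, i, 1) == chr)
            arr[++j] = i
    return j
}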

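The function returns the number of instances found as a convenience (if the count is all that is needed, gsub() is a faster way of getting it, bearing in mind that gsub() treats its first argument as a regular expression). The *arr* argument is expected to be an array; the indices are stored in it as arr[1] through arr[n], where n is the return value. The array is deliberately not cleared first. Note, however, that each call numbers its results from 1, so a second call on the same array overwrites the earlier entries rather than appending to them, and any entries beyond the new count are left over from the previous call. To collect the indices of different characters, use a separate array for each, and always treat the return value, not the size of the array, as the upper bound.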
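
A minimal usage sketch may make the return value and the array handling concrete. The function is repeated here so that the snippet runs on its own; joinidx() and the sample string are hypothetical and exist only for illustration.

```awk
function findallindices(str, chr, arr,    i, j) {
    for (i = 1; i <= length(str); i++)
        if (substr(str, i, 1) == chr)
            arr[++j] = i
    return j
}

# Hypothetical helper: join arr[1..n] with commas for printing.
function joinidx(arr, n,    i, s) {
    for (i = 1; i <= n; i++)
        s = s (i > 1 ? "," : "") arr[i]
    return s
}

BEGIN {
    s = "banana"

    n = findallindices(s, "a", pos)
    print n, joinidx(pos, n)        # prints: 3 2,4,6

    # Counting only: gsub() returns the number of substitutions made.
    # It modifies its target, so work on a copy.
    tmp = s
    print gsub(/a/, "&", tmp)       # prints: 3

    # Reusing pos without clearing it: only pos[1] is rewritten, and
    # pos[2] and pos[3] still hold values from the first call.
    n = findallindices(s, "b", pos)
    print n, joinidx(pos, n)        # prints: 1 1
    print pos[2], pos[3]            # prints: 4 6
}
```

Looping up to the returned count, as joinidx() does, keeps stale entries out of the result. Where a clean array is wanted, split("", pos) is a portable way to empty it before the call.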

The escaped-doublequote example was inspired by a common APL tactic.
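
The five steps listed earlier could be put together roughly as follows. This is only a sketch: quotedstrings(), its parameter names and the sample input are assumptions made for illustration, and findallindices() is repeated so that the snippet runs on its own.

```awk
function findallindices(str, chr, arr,    i, j) {
    for (i = 1; i <= length(str); i++)
        if (substr(str, i, 1) == chr)
            arr[++j] = i
    return j
}

# Hypothetical driver: stores each doublequoted substring of str
# (quotes included) in out[1..n] and returns n.
function quotedstrings(str, out,    q, nq, b, nb, isq, skip, i, k, start, n) {
    split("", q); split("", b)            # make the locals arrays before passing them

    nq = findallindices(str, "\"", q)     # step 1: every doublequote
    for (i = 1; i <= nq; i++)             # step 2: invert, position -> sequence number
        isq[q[i]] = i
    nb = findallindices(str, "\\", b)     # step 3: every backslash
    for (i = 1; i <= nb; i++)             # step 4: mark quotes that follow a backslash
        if ((b[i] + 1) in isq)
            skip[b[i] + 1] = 1
    for (i = 1; i <= nq; i++) {           # step 5: pair up the unmarked quotes
        if (q[i] in skip)
            continue
        if (++k % 2)
            start = q[i]                  # odd-numbered: a string begins here
        else
            out[++n] = substr(str, start, q[i] - start + 1)
    }
    return n + 0
}

BEGIN {
    # The string is: say "hi \"there\"" and "bye"
    s = "say \"hi \\\"there\\\"\" and \"bye\""
    n = quotedstrings(s, found)
    for (i = 1; i <= n; i++)
        print found[i]
    # prints:
    # "hi \"there\""
    # "bye"
}
```

Like the steps it follows, the sketch treats any doublequote preceded by a backslash as escaped, so a string ending in an escaped backslash (\\") is paired incorrectly. An unmatched opening quote is silently ignored.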