The problem statement was difficult to hash out. Eventually the task was described as: "for each value in $3 of file.txt, print out the line from bit.txt where bit.txt:$1 == file.txt:$3".
Since $1 of bit.txt is always equal to NR - that is, the first field on each line of bit.txt is simply the line number - the problem becomes quite easy.
#!/usr/bin/awk -f
BEGIN {
    BIT = "bit.txt"
    FILE = "file.txt"
    ERRSTR = "Guarantee violated, no value for %s in %s (%s, line %s)\n"

    while ((getline $0 < BIT) > 0)
        BL[$1] = $0
    close(BIT)

    while ((getline $0 < FILE) > 0)
        print (($3 in BL) ? BL[$3] : sprintf(ERRSTR, $3, BIT, FILE, $1))
    close(FILE)

    # equivalently in terms of semantics:
    #
    #   sort -nk 3,3 file.txt | join -1 3 -2 1 -e "EMPTY" - bit.txt
    #
    # Note that the above command assumes that bit.txt is already sorted on the first
    # field. Since that first field is supposed to be the line number, this assumption
    # is more or less reasonable; however, in the interest of avoiding stupid errors,
    # this should be checked. And then double-checked.
    #
    # The output format will be different, of course, but this is a matter of messing
    # around with the '-o FORMAT' option of the GNU join command, which is documented
    # at http://www.gnu.org/software/coreutils/manual/coreutils.html#join-invocation.
}
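The solution relies on the guarantee that the first field of bit.txt is always the line number. A minimal sketch of how that guarantee might be checked before running the join is given here. The sample data, the temporary directory and the variable names `report` and `status` are assumptions made for illustration; they are not part of the original problem.

```shell
# Hypothetical sample data: the third line deliberately breaks the guarantee.
tmp=$(mktemp -d)
printf '%s\n' '1 alpha' '2 beta' '4 delta' > "$tmp/bit.txt"

status=0
report=$(awk '
    # Complain about every record whose first field is not its line number.
    $1 != FNR { printf "line %d: first field is %s, expected %d\n", FNR, $1, FNR; bad = 1 }
    # Exit non-zero if at least one record was bad.
    END { exit bad }
' "$tmp/bit.txt") || status=$?

printf '%s\n' "$report"
rm -rf "$tmp"
```

With real data, point the awk command at bit.txt itself; an exit status of 0 and no output would suggest that the guarantee holds.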
Note that the problem might also be solved using the classical awk idiom to process two files:
awk 'NR==FNR{BL[$1]=$0;next}$3 in BL{print BL[$3]}' bit.txt file.txt
For more information on awk idioms (including the previous one), see Typical awk idioms (http://awk.freeshell.org/AwkTips#toc1).
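To see how the idiom works, it may help to spread the one-liner over several lines and run it on a tiny data set. The following is only a sketch: the contents of the two files and the temporary directory are made up for illustration and are not the real test data.

```shell
# Hypothetical sample data, not the real files from the archive.
tmp=$(mktemp -d)
printf '%s\n' '1 alpha' '2 beta' '3 gamma' > "$tmp/bit.txt"
printf '%s\n' '1 x 3' '2 y 1' '3 z 3'      > "$tmp/file.txt"

out=$(awk '
    # While reading the first file, NR (total records) equals FNR (records in
    # the current file): store each line under its first field and skip on.
    NR == FNR { BL[$1] = $0; next }
    # Only records of the second file get here: look up $3 in the stored lines.
    $3 in BL  { print BL[$3] }
' "$tmp/bit.txt" "$tmp/file.txt")

printf '%s\n' "$out"
rm -rf "$tmp"
```

One known caveat of this idiom: if the first file is empty, NR == FNR is also true while the second file is read, so nothing is printed and the second file ends up stored in the array instead.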
The test data is Base64-encoded; simply run
wget -qO- http://awk.freeshell.org/FileAndBitJoinClone |
sed -ne '/^begin-base64/,/^====$/p' |
uudecode -o- |
tar -zxv
to create the two files 'file.txt' and 'bit.txt' in the current directory.
begin-base64 644 ajai-files.tar.gz H4sICIbrakgCA2FqYWktZmlsZXMudGFyAO2aa67cyLGE9Xe0it7AEJWZ9VyO DdiGAT+Ae8eAl+9kRUqoJMP2rzFsQA1IOmKcJqOCxeBX7P7tH3+5fvn7L19+ zVfxV6/1/lf9z/mvv8RaL19Eimnrpr19KSLD7MunfPkPvP72/7/85v8+ny9/ +Mtf//zXv/zz3/t3+v/oq/zkL/z1VY6f9fjZjp/r8XM7fu7Hz+P4eR4/r/NY 6cDnkeU8tJzHlvPgch5dzsPLeXw5DcjpQE8HmsZ+OtDTgZ4O9HSgpwM9Hejp QE8Hdjqw04Gl+E8Hdjqw04GdDux0YKcDOx3U00E9HdTTQU0zYDuYa/8jbX6t p4t6uqini3q6qKeLVh57bKeTdjpp9vzVNCHbUz3dtAHVvquno3Y66uXxq307 GnNhQ/vaT1fdnr9+uurbVYMXUfva+3PDmVM/XfWFPeu3PY/zjI3tCpn4bobm /55nbZyORuQk3/eaLt/tRjAdavs65uP/Z1Lz9DPP8zbPhCYSwqV5H3FiFsn3 gU14Ku37htPTxLnbb1L9Os+M5ulnlfSL63S0TkfrTGed6aynk3U6Wee5Wqne 1uN9UlLJle2lY7JIGb4lVV1JXVdS2WGn6b39tSX1XknFV7Y3HXFlqFduLuDU wLmCcwfnEs4tnGsYJ7tjY/c4chejjAVVZ+a9X54b8m0hmUItV5xoa74h+Urd LKmcBe18vjX5ShUtqaMllbRYvlOlmFJPSypqSU0tqaoldbWkspbU1pLqWlJf S823zeQl1bWkvpZU2JIaW1JlS0teUmELGrvD0lq+Idlp+Tae7KTClpbspLqW 1NfSk52e7PSnnZ7soKtxWZv4/zNY9IeaLKWuFpT1cSCU9bkhGRtPY6mxBRfr Kbfnhow94yknd6m2JfW2pOKW1Nwyk6mZTt5MWaGxFe0xu2/IJJbspNaWleyg tY8dxf0PLVX9NKT2lrjhYxTDy20lX+jvU06+UOKnnDGxZFnR4+eGxIupxjXV uJaEjCUxY6pvTfWtJTlCdbcVt1DfkOgVsZ2yPTckU+jwU06+Ek4rOvz85fXY kLkaHX7KT3f6dIcmPzc8DWbYRp+fcspOnwZTpas9DVqm/6c7e7pL3a7o9lNO 8aV611TvmupdU71rqneteU2Szmaqd61PStBU8YqKT3rylGpeU81rAnNNHa8t L5Se9KLt5St1vaau19T1mrpeU9drwnJNRa8JyrXn1VuaUankNZW89uQl4bim eteRvKRu14TjOpKXkZeSyUuqdE2VrqnSNVW6pkrXVOmaKl1TpWuCcJ15XZu8 pD7X1OeaKFwThmtqck0grqnGNWG4pg5XdPjB3rrySjsttUtaa6f6tlTflurb ynNlYKjwtCWtvMt86ckXqjxI4JZR5eeG5C6xuKHHh31fVVnicUORJz25S0Bu 6emIpRo3zU8n9Lnf9JTEEotbYnFL3W3pSYml4rb8rAStfR4zPzGxl6f83MRe WeWnJ/bKKj9DQXMnPfmrL3+pwe1ocKj5qU5KLHW41Zez1OGWMN3Q3+dvpw63 1OHWXqmlHjf0eICXr6p8S3ttSWc09bilHrfU44Ye7ygVX+r4luQtuP3Ukzf0 edLba0vyhl5PevKXut1St1vqdhsvb6nfbby8pY630V96yg09n/T13JL63tD3 SU/Ngc5Peppz85XdfHmc+aliyi71v63XuU33AFuv/NJ9wNYrv/Xyl+4Hlu4H tl75rfzc83m1VtwT0hZ9bbHXlvro9/q6R9SE+f9tz/9//8c//e7X/gDoX3/+ 4+yl+v3zH7u3iy+N2o/Pf/4Tr8/9kp9+nl2v6vH3Ln7jLP3jN6yvW9Sffl7D xTrGFKm+CJgfKQui/fSznzi5Zl+z9eor9/XpAq2GtqrD7yxjrfppA1ojWi/Q 
+tbatfwmIw5EfiP+tAZtbK1fXj7LFwJuyJ22CnFSMd65brGuq7TSHGdmmeUz oDmEudbK5WNu0xl7jP7pGKHf527Nx998AVGkFasfx02ISkSnRIhGxQ6xUjF2 24joVQOxUzGOOYi4QptMi1HudJpdbfml6nf0IeXjeLlFLVTEQFSYqJg7qkw0 nGc1JrZ4ZyWitzbERkVMO+1UjN0OKhrEHVCXq4zpiyLZM72Hdgfk8+ca9xXg ryrjU5GPFaI1WDVhGvZpdzri89Uvmul8utYYnxnaHY40uZpfka26k+bzwzAK u8OR0a46mq+ftAy/6nwHEBsTK86z3eH43Pfx+y3S12y1ysdhBuJgYkx1u8NR HVedqwwnVF/GfTpmli2mYX7UOxxfwF+zqvNXXVqHX1SwU4WJiujqHY8v06++ VnO8KT53PwP9UY1pcFrvdHwpfpXp0DV9LFU/mBu1MSkO16kW4qAicquTijiR dVER56oVKmKMTaiIQTZlIkbZjGoYSaPpfHsnzUcwzEYDEgyz0YCiQBoNSDB5 Gg0o2qXTgAQZdBoQzHaaj2KYnQYUndVpQIqRdBqQYiSdBqQxEhqQxkhoQIqL pNOAEPug+UTBDpqPIYNBA4qCGTQgi2PSgKJ9Bg3IENCgARkCGjQgw7UwaECG 9AYPCOlNmhDMThpQRXqTBlSR3qQBBTFNGlBFepMGFHeSSQOqSG/SgCrSmzSg ivQmDagivcUDQnqLJgQ/iwYU971FA2pIb9GA4q64aECBeIsGFIi3aEANAS0a UODoogG1HdD9IT4TF0QaUIdGAwLl3h/2M1Eg0oCAIvdXAZjYINKA+oBIA+ox TBpQvJHmM2BWaD4DZoXmMxCQ0IBwGxehAQ3ELjSgeCPNZyJ2ofnMGAnNZypE ms+MYdKA5p60ojSgiROmNKCJ3JUGNDFOpQFNBKQ0oNBoQAsBKQ1oYZhKA1ox EhrQipHQgLD6EGMBIXVj8QAQxZRq8UajIsZhlYo4XdaoiEFapyLmsw0qIgGb VMS5tEVFxFNpPJiyleYjIdKAcPOTSgOK6qIMrdEGlKJBT0IpWuOEUYpWjUPS fABIQilacTIpRCsYSChEKzBHKEQrYEUoRSt4RChFK3hEKEUreEQoRYMuhUK0 1hgJzQc8IhSiFTwiFKIVPCIUohU8IpSiFTwilKIVyCGUohVUIZSiNfZK8wFV CIVoBVUIhWhtkQENCMghlKIVyCGUohXgIJSiNcCBUrQGOFCKjkuBQrQGOFCI 1gAHCtHakQGFaA2qoBCtHRlQiNZADkrRiuW/UIrW4BFK0Ro8MnlASI9StMZe aUAD6VGI1iAZCtGKpyNCIVoDcyhE60B6FKIVzx6FUrQGIFGKVjwDEUrRGvS0 eEBIj1J0NC2FaA16ohCtQU8UohX0pBSiFfSkFKIV9KSUohX0pJSidXaINCCg lVKKVqCVFh7QgkgTCo0GBLRSStG6dnpKKVoX0qMUrYAypRStC+lRilYQm1KK 1oX0KEYrcE4pRitwTilGh1eWD27VShna8ABOKUMbHvkqZWgDCCplaAMIKmVo AwgqZWgDzimFaAOxKYVoC7M0HQmzNB8JszQgrMaVMrSB2JRStGG9pBSjDWsF pRgNfFJK0QbWU0rRBmpVStEGnlNK0QZEUkrRppEBDQg3VaUUbbhnKMVoQyUq xWiLK55iNJ6UKaVoi3lAKdq+vZMGBDJVitFmMRIakIVZGhCSpRht4EulGG1A SKUYbTV2S/MBQirFaANCKsVow1MrpRgd+VCMNiCkUoy2FsOk+YASlWK04fGS Uow2UKJSjDZQolKMNlCiUoy2OCTNByCoFKOtx0hoQD12SwPqYZYGBChTitEx LSlFG6BMKUUboEwpRRu4SylFG9BKKUbbCEM0IIySUrTNMEvzCcyhFG1BMpSi 
LXiEUrQFj1CKtuARStFxwVOItkAOCtEWVEEp2oINKEVbsAGlaAs2oBSNeChD Y/GrFKErHrsoRegoLkrQWBcrBejoNMrP0VoUn2t4YclEoVF6jj6j8IzlslF2 bpBYLigzo+CM5bBRbsaC1yg2o+aMUjOKzCg0txgCiyV2yVLpYZOlgjWpUWDG wtIoL6PCjOJyvI2lgo4yyspoIaOoPMImSwXLO6OgjDWaUVKOEbBU0E5GORkL NKOYjOIySslYRxmFZDSTUUZGMRlF5JBYKitsslRW2GSpoLGM4jGWQUbpGGVm FI7RZUbZGN+IsEDjfq3Sy5JRXf7MON6OZcm1Ru9tlFlX961xwJ3Lsmv5bXW0 1mQOv4niBAKNn+IKcUez2uXnsve+erX7mzghTioi1I3GXgluVqTN3uoaH9xh bZPxS0PgG4z9sr98HbVKqXp/NQyLb9tc7Neov6+2Vme3+ws8celuLn6LiHWD 8UuMc7zB+C0i2A3Gfg3v78yI84jfOO+hQxxMxC3ENhi/RZySDcZ+jV9VfIxO /EWqowKOucnYr5Bryv21snX/9cFt2zYYP7U4W5uLfTpfpfnm1h137g9ocMTN xS8xLoHNxbc41/01nOZJ+PTBqt02F/skuqSUuaT6/Fr3Z1cQ+zfRm8cXIK32 +ws+ccxBRYS3wfgWe68+2dfq0u/nuBAXFbHbDcY+cS+34pOiD9/v/WAL4g6o 2uX3ldVsueNbxG43GD9FxareNhh7TtfwK6/7gtv8hMk3sVIRCW0yfovhdifk l3RrWkafdY7pIaCXNhn7eb968ZH6Vm+OezUGcVIR8W0yfon4ZNo2Gb9F7Haj sfPRVd3KKrN52vdnExCViXGNbTSupVy+0dd2zde4t4ihbDR+iXE322j8FmO3 nYqYYJuNa/fLqLXlc7GNe/Zh6W6bjd9ivHNt0Yfi58ZHMqXJ/dxri5uN3yIM bTZ+izjmZuOXiEWtbTauc16zVpXqXb9bKMSdkFetLyOcWmWsbvfBIN4JtWJ+ lbX7W3TSmt1P5iH2LS6/kLw0yhpN7094cD43HDcvmvv7ex6Bt/G4n/dDnFv0 Gd+keqZl9n5PWIiLiXhIahuP3yLeufn4FoffkX161W7zfs4LUamIebsB+Skq nrrZJuRW5fJiUr/h3b9wPxmE2KgYu+1UxNTckPwS8WzINiW/RcyEjclvcSdU Nye/xQpRqNggKhUHRKPigkgTwiOeWmhCFm6RUL+K3+DEK8qb6J5ZX7/8eP14 /Xj9eP14/Xj96q9/AGe3mC8AUAAA ====