The 11G Text-Edit Challenge

Ok, here's a challenge. Say you have a massive, 11-gigabyte text file. The first two lines are header lines; unfortunately, the header on line 2 is slightly wrong: instead of 'Done' it should say 'Status:Done'. (Hint: 'Done' is the first occurrence of that string in that line.)

Any Ideas?

  • split and then cat? Could not figure out how to split it into uneven files, i.e. the first 2 lines and the rest...
  • vi? Seems to open the file, but balks when saving it.
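
The uneven split from the first bullet can be sketched with head and tail instead of split. This is only a sketch on a tiny stand-in file (the file names and sample rows are made up for illustration). The tail step still has to copy the rest of the file, so it is not necessarily faster than a single sed pass.

```shell
# Tiny stand-in for the 11 GB file (hypothetical names and contents).
printf 'ID\tName\nState\tDone\tDone\nrow 1\tDone\nrow 2\tDone\n' > input.txt

{
  # Line 1: copied unchanged.
  head -n 1 input.txt
  # Line 2: fix the first 'Done', print the line, then stop reading.
  sed -n '2{s/Done/Status:Done/;p;q;}' input.txt
  # Line 3 onward: copied unchanged.
  tail -n +3 input.txt
} > output.txt
```

The three commands write into one output file in order, so no intermediate pieces need to be stored and glued back together with cat.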

[Update]

sed seems to be the tool for this job:

sed '2 s/Done/Status:Done/' input.txt > output.txt

It took about 7 minutes on my machine... Better ideas are still very much appreciated.
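
To see what the address and the substitution do, here is the same command on a tiny sample (the sample rows are invented for illustration). Note that 'Status:Done' is 7 bytes longer than 'Done', so everything after line 2 has to shift, which is why any approach ends up rewriting roughly the whole file.

```shell
# Tiny stand-in for the big file (made-up contents).
printf 'header one\nTask\tDone\tDone\nrow 1\tDone\n' > input.txt

# '2' restricts the substitution to line 2; without a 'g' flag,
# only the first 'Done' on that line is replaced.
sed '2 s/Done/Status:Done/' input.txt > output.txt

# Check the result without opening the file in an editor.
head -n 2 output.txt
```

On the real file, head -n 2 output.txt is a cheap way to confirm the header before deleting the original.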

[Update 2: How to insert a tab with sed]

Inserting a tab with sed turned out to be trickier than expected. Neither an escaped tab (\t) nor a double-escaped tab (\\t) seemed to do the trick. In bash, what worked was to drop out of the single-quoted sed script and print the tab (octal \011) directly with printf. The 27 in the following statement is, of course, the line number.

sed '27 s/Done/Status:Done'"$(printf '\011')"'After Tab/' in.txt > out.txt
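
A slightly more readable variant of the same trick is to store the tab in a shell variable first. This is a sketch on a generated 30-line sample (the file contents and the variable name tab are my own choices, not part of the original problem).

```shell
# Generate a 30-line sample file (made-up contents).
awk 'BEGIN { for (i = 1; i <= 30; i++) print "row " i " Done" }' > in.txt

# Put a literal tab character into a variable.
tab=$(printf '\t')

# Double quotes let the shell expand ${tab} before sed sees the script.
sed "27 s/Done/Status:Done${tab}After Tab/" in.txt > out.txt

# Show the edited line.
sed -n '27p' out.txt
```

In bash, the ANSI-C quoting form $'\t' should work as well. Some sed implementations (GNU sed, for one) do interpret \t in the replacement themselves, so whether the plain escape works may depend on which sed is installed.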