How can I split a file into line ranges, e.g. lines 1-10, 11-20, 21-30?
Most Unix systems provide the split utility for this purpose (the long options shown here are specific to GNU split):
split --lines 10 --numeric-suffixes input.txt output-
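On a split that lacks those long options, the POSIX flags -l and -a give much the same result. The following is a minimal sketch, assuming a generated 25-line sample; input.txt and output- are placeholder names, and the suffixes are alphabetic (aa, ab, ...) rather than numeric.

```shell
# Work in a scratch directory so the demo does not touch existing files.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Build a 25-line sample input (placeholder data for illustration).
awk 'BEGIN { for (i = 1; i <= 25; i++) print "line " i }' > input.txt

# -l 10: ten lines per piece; -a 2: two-letter suffixes (aa, ab, ...).
split -l 10 -a 2 input.txt output-

# Show how many lines each piece received.
wc -l output-*
```

The last piece simply holds whatever lines remain, so no special handling is needed for inputs whose length is not a multiple of ten.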
For more flexibility you can use sed. For example, this command prints lines 1-10 of its input:
sed -n -e '1,10p' -e '10q'
The -n option stops sed from printing every line automatically. The first expression prints ("p") only the lines in the range 1-10 ("1,10"). The second makes sed quit after reading line 10 ("10q"), so the rest of the input is never read.
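The same pattern works for any range, not just one starting at line 1. As a small sketch, the function below wraps it; print_range is a hypothetical helper name chosen for illustration, and the numbered input is generated only for the demo.

```shell
# print_range FIRST LAST [FILE...]: print lines FIRST..LAST, then stop reading.
# (print_range is a hypothetical helper name, not part of sed.)
print_range() {
    first=$1 last=$2
    shift 2
    sed -n -e "${first},${last}p" -e "${last}q" "$@"
}

# Generate 30 numbered lines and extract lines 11 through 20.
awk 'BEGIN { for (i = 1; i <= 30; i++) print i }' | print_range 11 20
```

With no file argument the function reads standard input, just as sed does.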
We can now use this in a loop to write each successive range of a file to its own chunk file:
# POSIX shell
file=/etc/passwd
range=10
cur=1
last=$(wc -l < "$file") # count number of lines
chunk=1
while [ $cur -le $last ] # -le, so a final one-line chunk is not skipped
do
    endofchunk=$(($cur + $range - 1))
    sed -n -e "$cur,${endofchunk}p" -e "${endofchunk}q" "$file" > chunk.$(printf %04d $chunk)
    chunk=$(($chunk + 1))
    cur=$(($cur + $range))
done
The previous example uses POSIX arithmetic, which older Bourne shells do not have. In that case the following example should be used instead:
# legacy Bourne shell; assume no printf either
file=/etc/passwd
range=10
cur=1
last=`wc -l < "$file"` # count number of lines
chunk=1
while test $cur -le $last # -le, so a final one-line chunk is not skipped
do
    endofchunk=`expr $cur + $range - 1`
    sed -n -e "$cur,${endofchunk}p" -e "${endofchunk}q" "$file" > chunk.$chunk
    chunk=`expr $chunk + 1`
    cur=`expr $cur + $range`
done
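Whichever loop is used, a quick sanity check is to confirm that the chunks, concatenated in order, reproduce the input. The sketch below assumes the zero-padded chunk.NNNN names of the POSIX version, since unpadded names stop sorting correctly once there are ten or more chunks. sample.txt is a placeholder file with 21 generated lines, so the final chunk holds exactly one line.

```shell
# Work in a scratch directory with generated sample data.
dir=$(mktemp -d) && cd "$dir" || exit 1
awk 'BEGIN { for (i = 1; i <= 21; i++) print "line " i }' > sample.txt

file=sample.txt range=10 cur=1 chunk=1
last=$(( $(wc -l < "$file") ))   # arithmetic expansion drops any padding from wc

while [ "$cur" -le "$last" ]
do
    endofchunk=$((cur + range - 1))
    sed -n -e "$cur,${endofchunk}p" -e "${endofchunk}q" "$file" \
        > "chunk.$(printf %04d "$chunk")"
    chunk=$((chunk + 1))
    cur=$((cur + range))
done

# The glob expands in sorted order, which matches chunk order for padded names.
if cat chunk.* | cmp -s - "$file"
then echo "chunks match the input"
else echo "chunks differ from the input"
fi
```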
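Awk can also be used to produce a more or less equivalent result. The output file name is parenthesized because an unparenthesized concatenation after > is not handled the same way by every awk implementation:
awk -v range=10 '{print > (FILENAME "." (int((NR - 1) / range) + 1))}' file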
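With a large input, each output file stays open until awk exits, and some awk implementations may run out of file descriptors. A possible variation, sketched here under that assumption, closes each chunk as soon as the next one begins. The variable names out and prev are illustrative, and the 25-line input named file is generated only for the demo.

```shell
# Work in a scratch directory with generated sample data.
dir=$(mktemp -d) && cd "$dir" || exit 1
awk 'BEGIN { for (i = 1; i <= 25; i++) print "line " i }' > file

awk -v range=10 '
{
    # Chunk name for the current line: file.1, file.2, ...
    out = FILENAME "." (int((NR - 1) / range) + 1)
    if (out != prev) {                 # a new chunk starts on this line
        if (prev != "") close(prev)    # release the finished chunk
        prev = out
    }
    print > out
}' file

# Show how many lines each chunk received.
wc -l file.*
```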