Differences between revisions 6 and 26 (spanning 20 versions)
Revision 6 as of 2008-05-15 18:04:54
Size: 2985
Editor: GreyCat
Comment:
Revision 26 as of 2019-08-21 16:24:29
Size: 7164
Editor: GreyCat
Comment: Add subheaders, move things around a bit.
How can I use numbers with leading zeros in a loop, e.g. 01, 02?

As always, there are many different ways to solve the problem, each with its own advantages and disadvantages. The most important considerations are which shell you're using, whether the start/end numbers are constants, and how many times the loop is going to iterate.

Brace expansion

If you're in bash/zsh/ksh, and if the start and end numbers are constants, and if there aren't too many of them, you can use BraceExpansion. Bash version 4 allows zero-padding and ranges in its brace expansion:

    # Bash 4 / zsh
    for i in {01..10}; do
        ...

In Bash 3, you can use ranges inside brace expansion (but not zero-padding). To get the same result there, put a literal 0 in front of the single-digit range and list 10 separately:

    # Bash 3
    for i in 0{1..9} 10
    do
        ...

Another bash 3 example, for output of 0000 to 0034:

    # Bash 3
    for i in {000{0..9},00{10..34}}
    do
        echo "$i"
    done

    # Wrapping both ranges in one outer brace, instead of writing them side by
    # side, lets the whole expansion sit inside a single word, for instance:
    wget 'http://foo.com/adir/thepages'{000{0..9},00{10..34}}'.html'

In ksh and in older bash versions, where the leading zeroes are not supported directly by brace expansion, you might still be able to approximate it:

    # Bash / ksh / zsh
    for i in 0{1,2,3,4,5,6,7,8,9} 10
    do
        ...

Formatting with printf

The most important drawback with BraceExpansion is that the whole list of numbers is generated and held in memory all at once. If there are only a few thousand numbers, that may not be so bad, but if you're looping millions of times, you would need a lot of memory to hold the fully expanded list of numbers.

The printf command (which is a Bash builtin, and is also specified by POSIX) can be used to format a number, including zero-padding. Since version 3.1, the bash builtin can also assign the formatted result to a shell variable with printf -v, without forking a SubShell.

If all you want to do is print the sequence of numbers, and you're in bash/ksh/zsh, and the sequence is fairly small, you can use the implicit looping feature of printf together with a brace expansion:

    # Bash 3
    printf '%03d\n' {1..300}

If you're in bash 3.1 or higher, you can use a C-style for loop together with printf -v to format the numbers into a variable:

    # Bash 3.1 / ksh93 / zsh
    for ((i = 1; i <= 10; i++)); do
        printf -v ii %02d "$i"
        echo "$ii"
    done
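
As a small illustration of why formatting into a variable is handy, the sketch below collects padded file names in an array before doing anything with them. It assumes bash 3.1 or later; the img_ prefix, the .jpg suffix and the variable names are made up for the example.

```bash
# Assumes bash 3.1+ (printf -v and the += array append).
# "img_" and ".jpg" are hypothetical; substitute your own naming scheme.
names=()
for ((i = 1; i <= 3; i++)); do
    # Format the whole name in one step; no subshell is forked.
    printf -v name 'img_%03d.jpg' "$i"
    names+=("$name")
done

# One name per line: img_001.jpg, img_002.jpg, img_003.jpg
printf '%s\n' "${names[@]}"
```

Because the names live in an array, they can be passed to another command later as "${names[@]}" without being re-generated.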

Brace expansion requires constant starting and ending values. If you don't know in advance what the start and end values are, you can cheat:

    # Bash 3
    # start and end are variables containing integers
    eval "printf '%03d\n' {$start..$end}"

The eval is required in Bash because brace expansions occur before parameter expansions.
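
The ordering problem is easy to see directly, and since eval runs whatever text it is handed, it seems prudent to check the values first. The following is only a sketch, assuming bash; is_uint and safe_range are hypothetical helper names, not part of any standard tool.

```bash
# Assumes bash. Without eval, the braces are scanned before $start and $end
# are substituted, so no range is recognised:
start=1 end=3
echo {$start..$end}        # bash prints the literal text {1..3}

# is_uint (hypothetical): succeeds only for a non-empty string of digits.
is_uint() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
}

# safe_range (hypothetical): refuse anything but plain integers, then eval.
safe_range() {
    if ! is_uint "$1" || ! is_uint "$2"; then
        echo 'safe_range: start and end must be non-negative integers' >&2
        return 1
    fi
    eval "printf '%03d\n' {$1..$2}"
}

safe_range "$start" "$end"
```

With start=1 and end=3 the last line prints 001, 002 and 003; a value such as '1;id' is rejected before it ever reaches eval.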

The traditional Csh implementation, which all other applicable shells follow, inserts the brace expansion pass somewhere between the processing of other expansions and pathname expansion, so parameter expansion has already been performed by the time words are scanned for brace expansion. Bash's implementation has various pros and cons, and this is probably its most frequently cited drawback. Given how messy that eval solution is, please give serious thought to using a for or while loop with shell arithmetic instead.
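
A loop with shell arithmetic, as recommended above, might look like the sketch below. It assumes bash; zpad_seq is a hypothetical function name, and the * in the format takes the field width from the argument list, which the bash printf builtin supports.

```bash
# Assumes bash. zpad_seq START END WIDTH is a hypothetical helper that
# prints START..END, one per line, zero-padded to WIDTH digits.
zpad_seq() {
    local start=$1 end=$2 width=$3 i
    for ((i = start; i <= end; i++)); do
        # '*' consumes the next argument ("$width") as the field width.
        printf '%0*d\n' "$width" "$i"
    done
}

start=8 end=11
zpad_seq "$start" "$end" 3
```

Nothing is expanded up front, so memory use stays constant however large the range is, and no eval is involved.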

Ksh formatted brace expansion

The ksh93 method for specifying field width for sequence expansion is to add a (limited) printf format string to the syntax, which is used to format each expanded word. This is somewhat more powerful, but unfortunately incompatible with bash, and ksh does not understand Bash's field padding scheme:

    # ksh93
    echo {0..10..2%02d}

ksh93 also has a variable attribute that specifies a field width to pad with leading zeros whenever the variable is referenced. The concept is similar to other attributes supported by Bash, such as case modification. Note that ksh never interprets octal literals.

    # ksh93 / mksh / zsh
    $ typeset -Z3 i=4
    $ echo $i
    004
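
Bash, by contrast, treats a number with a leading zero as octal inside arithmetic expansion, so a padded value such as 008 is rejected if it is fed back into $(( )). A sketch of the usual workaround, assuming bash: the 10# prefix forces base ten. strip_pad is a hypothetical name, and the sketch only handles non-negative values.

```bash
# Assumes bash. strip_pad (hypothetical) turns a zero-padded,
# non-negative number back into a plain decimal integer.
strip_pad() {
    printf '%d\n' "$((10#$1))"
}

n=008
# $((n + 1)) would fail here because 008 is not valid octal;
# 10#$n tells bash to read the digits as base ten.
next=$((10#$n + 1))
printf '%03d\n' "$next"    # prints 009
```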

External programs

If the command seq(1) is available (it's part of GNU sh-utils/coreutils), you can use it as follows:

    seq -w 1 10

or, to pad to an arbitrary width (here: 3 digits):

    seq -f "%03g" 1 10

Combining printf with seq(1), you can do things like this:

    # POSIX shell, GNU utilities
    printf '%03d\n' $(seq 300)

(That may be helpful if you are not using Bash, but you have seq(1), and your version of seq(1) lacks printf-style format specifiers. That's a pretty odd set of restrictions, but I suppose it's theoretically possible. Since seq is a nonstandard external tool, it's good to keep your options open.)

Be warned, however, that using seq might be considered bad style; it's even mentioned in Don't Ever Do These.

Some BSD-derived systems have jot(1) instead of seq(1). In accordance with the glorious tradition of Unix, it has a completely incompatible syntax:

    # POSIX shell, OpenBSD et al.
    printf "%02d\n" $(jot 10 1)

    # Bourne shell, OpenBSD (at least)
    jot -w %02d 10 1

Finally, the following example works with any BourneShell derived shell (which also has expr and sed) to zero-pad each line to three bytes:

    # Bourne
    i=0
    while test $i -le 10
    do
        echo "00$i"
        i=`expr $i + 1`
    done |
        sed 's/.*\(...\)$/\1/g'

In this example, the number of '.' inside the parentheses in the sed command determines how many total bytes from the echo command (at the end of each line) will be kept and printed.
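
In a shell that has POSIX arithmetic expansion and a printf command (an assumption the original Bourne shell does not satisfy), the same loop needs neither expr nor sed. A minimal sketch; count_padded is a hypothetical name.

```bash
# POSIX sh syntax only; also runs in bash, ksh and zsh.
# count_padded LAST (hypothetical) prints 000 through LAST, padded to 3 digits.
count_padded() {
    i=0
    while [ "$i" -le "$1" ]; do
        printf '%03d\n' "$i"
        i=$((i + 1))
    done
}

count_padded 10
```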

But if you're going to rely on an external Unix command, you might as well just do the whole thing in awk in the first place:

    # Bourne
    # count variable contains an integer
    awk -v count="$count" 'BEGIN {for (i=1;i<=count;i++) {printf("%03d\n",i)} }'

    # Bourne, with Solaris's decrepit and useless awk:
    awk "BEGIN {for (i=1;i<=$count;i++) {printf(\"%03d\\n\",i)} }"
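
If the start, end and width are all variables, the awk approach extends naturally. The sketch below assumes an awk that supports -v (so not the old Solaris one mentioned above); awk_pad_seq is a hypothetical wrapper name. The format string is assembled inside awk rather than relying on a * width specifier, which not every awk is guaranteed to accept.

```bash
# Assumes an awk with -v support. awk_pad_seq START END WIDTH (hypothetical)
# prints START..END, one per line, zero-padded to WIDTH digits.
awk_pad_seq() {
    awk -v start="$1" -v end="$2" -v width="$3" 'BEGIN {
        fmt = "%0" width "d\n"            # e.g. "%04d\n" when width is 4
        for (i = start + 0; i <= end + 0; i++)
            printf fmt, i
    }'
}

awk_pad_seq 9 11 4
```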


Now, since the number one reason this question is asked is for downloading images in bulk, you can use the examples above with xargs(1) and wget(1) to fetch files:

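    # Bash 3; any example above that prints one number per line will do
    printf '%03d\n' {1..300} | xargs -I% wget "$LOCATION/%"

The xargs -I% reads one line of input at a time and replaces the % in the command with that line. (GNU xargs also accepts the older spelling -i%, but documents it as deprecated in favor of -I.)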

Or, a simpler example using a for loop:

    # Bash 3
    for i in {1..100}; do
       wget "$prefix$(printf %03d $i).jpg"
       sleep 5
    done

Or, avoiding the subshells (requires bash 3.1):

    # Bash 3.1
    for i in {1..100}; do
       printf -v n %03d $i
       wget "$prefix$n.jpg"
       sleep 5
    done
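
When the first and last image numbers are only known at run time, one option is to generate the URL list separately and hand it to wget afterwards. This is a sketch under assumptions: make_urls is a hypothetical helper, the example.com address and the .jpg suffix are placeholders, and the download step is left as a comment so the block runs without network access.

```bash
# Assumes bash. make_urls BASE FIRST LAST (hypothetical) prints one URL per
# line, with the image number zero-padded to three digits.
make_urls() {
    local base=$1 first=$2 last=$3 i
    for ((i = first; i <= last; i++)); do
        printf '%s/%03d.jpg\n' "$base" "$i"
    done
}

make_urls 'http://example.com/images' 1 3

# To actually download, the list can be piped to wget, which reads URLs
# from standard input when given "-i -":
#   make_urls "$prefix" "$first" "$last" | wget -i -
```

Printing the list first also makes it easy to inspect the URLs before fetching anything.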


CategoryShell

BashFAQ/018 (last edited 2019-08-21 16:24:29 by GreyCat)