Differences between revisions 7 and 8
Revision 7 as of 2008-11-22 21:33:10
Size: 5012
Editor: GreyCat
Comment: first-line
Revision 8 as of 2009-03-24 08:42:31
Size: 4064
Editor: pgas
Comment: major changes, delegates the bourne shell workarounds, add named pipes, coprocess and here documents
Line 3: Line 3:

=== The problem ===
Line 4: Line 7:

The following command always prints "total number of lines: 0", although the variable {{{linecnt}}} has a larger value in the {{{while}}} loop:
Line 10: Line 11:
    cat /etc/passwd | while read line     printf "%s\n" foo bar | while read -r line
Line 12: Line 13:
        linecnt=`expr $linecnt + 1`         linecnt=$((linecnt+1))
Line 14: Line 15:
    echo "total number of lines: $linecnt"     echo "total number of lines: $linecnt" # prints 0

    # the problem also occurs without a loop
    var=0
    echo 2 | read -r var
    echo $var # also prints 0
Line 20: Line 26:
 * BourneShell creates a subshell when the input or output of a loop is redirected, either by using a pipeline or by a redirection operator ('<', '>').  * BourneShell creates a subshell when the input or output of anything (loops, case etc..) but a simple command is redirected, either by using a pipeline or by a redirection operator ('<', '>').
Line 25: Line 31:
To solve this, either use a method that works without a subshell, or make sure you do all processing inside that subshell (a bit of a kludge, but often easier to work with): === Workarounds ===
  
Several possibilities to avoid the subshell exists:
Line 27: Line 35:
{{{ * If the input is a file, remove the '''useless use of cat'''
  
 {{{
  #POSIX
  while read -r line;do linecnt=$((linecnt+1));done < file
  echo $linecnt
 }}}

 Unfortunately this does't work with a Bourne shell see [[http://heirloom.sourceforge.net/sh/sh.1.html#20|sh(1) from the Heirloom Bourne Shell]] for a workaround.

* '''Group the commands''' and do it all in the subshell

 {{{
Line 31: Line 51:
    (
        while read line ; do
                linecnt=$(($linecnt+1))
    {
        while read -r line ; do
           linecnt=$((linecnt+1))
Line 36: Line 56:
    )
}}}
    }
 }}}
Line 39: Line 59:
To avoid the subshell completely (not easily possible if the other part of the pipe is a command!), use redirection, which does not have this problem (at least for [[BASH]] and KornShell): * Use '''process substitution''' (BASH only)
Line 41: Line 61:
{{{
    # POSIX
    linecnt=0
    while read line ; do
        linecnt=$(($linecnt+1))
   done < /etc/passwd
 {{{
    # Bash
    while read -r line; do
       linecnt=$((linecnt+1))
    done < <(grep PATH /etc/profile)
    echo "total number of lines: $linecnt"
 }}}

 See also [[BashFAQ/001|FAQ #1]]

* Use a '''named pipe''' (POSIX)

  {{{
   # POSIX
   mkfifo mypipe
   grep PATH /etc/profile > mypipe &
   while read -r line;do
       linecnt=$((linecnt+1))
   done < mypipe
Line 48: Line 81:
}}}  }}}
Line 50: Line 83:
For [[BASH]], when the input of the pipe is a command rather than a file, you can use ProcessSubstitution:  For more information see NamedPipes
Line 52: Line 85:
{{{ * Use a '''coprocess''' (ksh, even pdksh, oksh, mksh..)
 {{{
  #ksh
  grep PATH /etc/profile |&
  while read -r -p line; do
    linecnt=$((linecnt+1))
  done
  echo "total number of lines: $linecnt"
 }}}

* Another useful trick (using Bash/ksh93 syntax) is breaking a variable into words using {{{read}}}:

 {{{
Line 54: Line 99:
    while read LINE; do
        echo "-> $LINE"
    done < <(grep PATH /etc/profile)
}}}
    echo "$foo" | read -r a b c # this doesn't work
    read -r a b c <<< "$foo" # but this does
 }}}
Line 59: Line 103:
If you're reading from a plain file, a portable and common work-around is to redirect the standard input of the script using {{{exec}}}:  Again, the pipeline causes the {{{read}}} command in the first example to run in a subshell, so its effect is never witnessed in the parent process. The second example does not create any subshells, so it works as we expect. The {{{<<<}}} operator is specific to bash (2.05b and later), and the input which follows it is usually called a "here string".
Line 61: Line 105:
{{{
    # Bourne
    linecnt=0
    exec < /etc/passwd # redirect standard input from the file /etc/passwd
    while read line # "read" gets its input from the file /etc/passwd
    do
        linecnt=`expr $linecnt + 1`
    done
    echo "total number of lines: $linecnt"
}}}
 For more examples of how to break input into words, see [[BashFAQ/001|FAQ #1]].
Line 72: Line 107:
This works as expected, and prints a line count for the file {{{/etc/passwd}}}. But the input is redirected from that file permanently. What if we need to read the original standard input sometime later again? In that case we have to save a copy of the original standard input file descriptor, which we later can restore:

{{{
    # Bourne
    exec 3<&0 # save original stdin file descriptor 0 as FD 3
    exec 0</etc/passwd # redirect stdin from the file /etc/passwd

    linecnt=0
    while read line # "read" gets its input from the file /etc/passwd
    do
        linecnt=`expr $linecnt + 1`
    done

    exec 0<&3 # restore saved stdin (FD 0) from FD 3
    exec 3<&- # close the no-longer-needed FD 3

    echo "total number of lines: $linecnt"
}}}

Subsequent {{{exec}}} commands can be combined into one line, which is interpreted left-to-right:

{{{
    exec 3<&0
    exec 0</etc/passwd
    _...read redirected standard input..._
    exec 0<&3
    exec 3<&-
}}}

is equivalent to

{{{
    exec 3<&0 0</etc/passwd
    _...read redirected standard input..._
    exec 0<&3 3<&-
}}}

Another useful trick (using Bash syntax) is breaking a variable into words using {{{read}}}:

{{{
    # Bash
    echo "$foo" | read a b c # this doesn't work
    read a b c <<< "$foo" # but this does
}}}

Again, the pipeline causes the {{{read}}} command in the first example to run in a subshell, so its effect is never witnessed in the parent process. The second example does not create any subshells, so it works as we expect. The {{{<<<}}} operator is specific to bash (2.05b and later), and the input which follows it is usually called a "here string".

For more examples of how to break input into words, see [[BashFAQ/001|FAQ #1]].
 With a Posix shell you can use a here document instead:
 {{{
   #Posix
   read -r a b c << EOF
   $foo
   EOF
 }}}

I set variables in a loop. Why do they suddenly disappear after the loop terminates? Or, why can't I pipe data to read?

The problem

In most shells, each command of a pipeline is executed in a separate SubShell.

    # Non-working example (except in ksh88/ksh93)
    linecnt=0
    printf "%s\n" foo bar  | while read -r line
    do
        linecnt=$((linecnt+1))
    done
    echo "total number of lines: $linecnt" # prints 0

    # the problem also occurs without a loop
    var=0
    echo 2 | read -r var
    echo $var # also prints 0

The reason for this surprising behaviour is that a while/for/until loop runs in a SubShell when it's part of a pipeline. For the while loop above, a new subshell is created with its own copy of the variable linecnt (initial value "0", inherited from the parent shell). This copy is then used for counting. When the while loop finishes, the subshell's copy is discarded, and the echo command uses the parent's original variable linecnt, whose value has not changed.
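
The copy semantics described above can be seen without any pipeline by using an explicit subshell. A minimal sketch (the variable name count is only for illustration):

```shell
# An explicit subshell ( ... ) gets its own copy of the parent's variables.
count=0
(
    count=5                   # changes only the subshell's copy
    echo "inside:  $count"    # the subshell sees its own value, 5
)
echo "outside: $count"        # the parent's value is still 0
```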

Different shells behave differently when using redirection or pipes with a loop:

  • BourneShell creates a subshell when the input or output of anything other than a simple command (loops, case, etc.) is redirected, either by using a pipeline or by a redirection operator ('<', '>').

  • BASH creates a new process only if the loop is part of a pipeline.

  • KornShell creates it only if the loop is part of a pipeline, but not if the loop is the last part of it. (The example above actually works in ksh88 and ksh93!)

  • POSIX specifies the bash behaviour, but as an extension allows any or all of the parts of the pipeline to run without a subshell (thus permitting the KornShell behaviour, as well).
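
To find out which behaviour the shell at hand implements, a small probe along the following lines may help. This is only a sketch; the variable name probe is made up for the example, and the result depends on the shell that runs it:

```shell
# Probe: does the last element of a pipeline run in the current shell?
probe=parent
echo child | read -r probe    # only visible afterwards if read ran in this shell

if [ "$probe" = child ]; then
    echo "last pipeline element ran in the current shell (KornShell behaviour)"
else
    echo "last pipeline element ran in a subshell (bash behaviour)"
fi
```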

Workarounds

There are several ways to avoid the subshell:

* If the input is a file, remove the useless use of cat and redirect the loop's input from the file instead

  •   # POSIX
      linecnt=0
      while read -r line; do linecnt=$((linecnt+1)); done < file
      echo "$linecnt"

    Unfortunately, this doesn't work with a Bourne shell; see sh(1) from the Heirloom Bourne Shell for a workaround.
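
    The difference between the pipeline and the redirection can be checked side by side. The following is a sketch under assumptions: the scratch file and the names datafile, piped and redirected are invented for the example, and mktemp, although widely available, is not specified by POSIX.

```shell
# Hypothetical scratch file, so the example does not depend on any system file.
datafile=$(mktemp) || exit 1
printf '%s\n' foo bar baz > "$datafile"

# Loop fed by a pipeline: in bash the loop runs in a subshell.
piped=0
cat "$datafile" | while read -r line; do piped=$((piped+1)); done

# Loop fed by a redirection: runs in the current shell.
redirected=0
while read -r line; do redirected=$((redirected+1)); done < "$datafile"

rm -f "$datafile"
echo "piped:      $piped"        # 0 in bash; 3 in ksh88/ksh93
echo "redirected: $redirected"   # 3
```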

* Group the commands and do it all in the subshell

  •     # POSIX
        linecnt=0
        cat /etc/passwd |
        {
            while read -r line ; do
               linecnt=$((linecnt+1))
            done
            echo "total number of lines: $linecnt"
        }
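
    If the count is needed after the pipeline has finished, one possible variation (not part of the example above) is to let the group print its result and capture that with a command substitution. The inner name n is only illustrative:

```shell
# The group still runs in a subshell, but its output is captured by the parent.
linecnt=$(
    printf '%s\n' foo bar baz |
    {
        n=0
        while read -r line; do
            n=$((n+1))
        done
        echo "$n"             # the only output of the group: the final count
    }
)
echo "total number of lines: $linecnt"
```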

* Use process substitution (BASH only)

  •     # Bash
        while read -r line; do
           linecnt=$((linecnt+1))
        done < <(grep PATH /etc/profile)
        echo "total number of lines: $linecnt"

    See also FAQ #1

* Use a named pipe (POSIX)

  •    # POSIX
       linecnt=0
       mkfifo mypipe
       grep PATH /etc/profile > mypipe &
       while read -r line; do
           linecnt=$((linecnt+1))
       done < mypipe
       rm -f mypipe    # remove the named pipe once it is no longer needed
       echo "total number of lines: $linecnt"
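
    A somewhat more defensive sketch of the same idea creates the pipe in a private directory, waits for the writer and cleans up afterwards. It assumes mktemp -d is available (it is common, but not required by POSIX); the names fifodir, fifo and writer_pid are hypothetical:

```shell
# Create the named pipe inside a private scratch directory.
fifodir=$(mktemp -d) || exit 1
fifo=$fifodir/lines.fifo
mkfifo "$fifo" || { rm -rf "$fifodir"; exit 1; }

# The writer runs in the background; it blocks until the loop opens the pipe.
printf '%s\n' alpha beta gamma > "$fifo" &
writer_pid=$!

# The loop runs in the current shell, so linecnt keeps its value.
linecnt=0
while read -r line; do
    linecnt=$((linecnt+1))
done < "$fifo"

wait "$writer_pid"            # reap the background writer
rm -rf "$fifodir"             # remove the pipe and its directory
echo "total number of lines: $linecnt"
```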

* Use a coprocess (ksh, including pdksh, oksh, mksh, etc.)

  •   #ksh
      grep PATH /etc/profile |&
      while read -r -p line; do
        linecnt=$((linecnt+1))
      done
      echo "total number of lines: $linecnt"

* Another useful trick (using Bash/ksh93 syntax) is breaking a variable into words using read:

  •     # Bash
        echo "$foo" | read -r a b c      # this doesn't work
        read -r a b c <<< "$foo"         # but this does

    Again, the pipeline causes the read command in the first example to run in a subshell, so its effect is never seen in the parent process. The second example does not create any subshells, so it works as we expect. The <<< operator is not part of POSIX; it is available in bash (2.05b and later), ksh93 and zsh, and the input which follows it is usually called a "here string".

    For more examples of how to break input into words, see FAQ #1. With a POSIX shell you can use a here document instead:

       # POSIX
       read -r a b c << EOF
       $foo
       EOF
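
    The same here-document technique also works with a custom field separator. The following sketch splits a made-up passwd-style record; the record and all variable names are assumptions for illustration only:

```shell
# A made-up passwd-style record; IFS=: applies only to this one read command.
record='alice:x:1000:1000:Alice Example:/home/alice:/bin/sh'

IFS=: read -r user pass uid gid gecos homedir loginshell << EOF
$record
EOF

# read ran in the current shell, so the variables are still set here.
echo "user=$user uid=$uid shell=$loginshell"
```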

BashFAQ/024 (last edited 2023-12-12 13:15:33 by 195)