Diff for "BashFAQ/024"

Differences between revisions 1 and 7 (spanning 6 versions)

I set variables in a loop. Why do they suddenly disappear after the loop terminates? Or, why can't I pipe data to read?

In most shells, each command of a pipeline is executed in a separate SubShell.

The following command prints "total number of lines: 0" in most shells, even though the variable linecnt is incremented inside the while loop:

    # Non-working example (except in ksh88/ksh93)
    linecnt=0
    cat /etc/passwd | while read line
    do
        linecnt=`expr $linecnt + 1`
    done
    echo "total number of lines: $linecnt"

The reason for this surprising behaviour is that a while/for/until loop runs in a SubShell when it's part of a pipeline. For the while loop above, a new subshell with its own copy of the variable linecnt is created (initial value, taken from the parent shell: "0"). This copy then is used for counting. When the while loop is finished, the subshell copy is discarded, and the original variable linecnt of the parent (whose value has not changed) is used in the echo command.

Different shells behave differently when using redirection or pipes with a loop:

BourneShell creates a subshell when the input or output of a loop is redirected, either by using a pipeline or by a redirection operator ('<', '>').
BASH creates a new process only if the loop is part of a pipeline.
KornShell creates it only if the loop is part of a pipeline, but not if the loop is the last part of it. (The example above actually works in ksh88 and ksh93!)
POSIX specifies the bash behaviour, but as an extension allows any or all of the parts of the pipeline to run without a subshell (thus permitting the KornShell behaviour, as well).
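
Which of these behaviours the current shell has can be checked directly. The following is a minimal sketch, not part of the original FAQ; the variable name probe is only illustrative. It pipes one word into read and then looks at whether the assignment survived in the parent shell:

```shell
# Probe: does this shell run the last command of a pipeline in a subshell?
probe=parent
echo child | read -r probe

if [ "$probe" = child ]; then
    # The assignment made by read is visible here (KornShell-like behaviour).
    echo "last pipeline element ran in the current shell"
else
    # The assignment was lost with the subshell (default behaviour in bash).
    echo "last pipeline element ran in a subshell"
fi
```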

To solve this, either use a method that works without a subshell, or make sure you do all processing inside that subshell (a bit of a kludge, but often easier to work with):

    # POSIX
    linecnt=0
    cat /etc/passwd |
    (
        while read line ; do
            linecnt=$(($linecnt+1))
        done
        echo "total number of lines: $linecnt"
    )

To avoid the subshell completely (not easily possible if the other part of the pipe is a command!), use redirection, which does not have this problem (at least for BASH and KornShell):

    # POSIX
    linecnt=0
    while read line ; do
        linecnt=$(($linecnt+1))
    done < /etc/passwd
    echo "total number of lines: $linecnt"

For BASH, when the input of the pipe is a command rather than a file, you can use ProcessSubstitution:

    # Bash
    while read LINE; do
        echo "-> $LINE"
    done < <(grep PATH /etc/profile)
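
In shells without process substitution, a similar effect can probably be had with a here-document whose body is a command substitution. This is a sketch under that assumption, not something the original text prescribes; the printf command stands in for any command that produces multiline output:

```shell
# Feed the output of a command to a loop without putting the loop in a pipeline.
count=0
while IFS= read -r line; do
    count=$((count + 1))
done <<EOF
$(printf '%s\n' alpha beta gamma)
EOF
echo "lines read: $count"
```

One caveat: command substitution strips trailing newlines and the here-document adds one back, so a command that prints nothing still yields a single empty line. The whole output is also collected before the loop starts, unlike a pipe.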

If you're reading from a plain file, a portable and common work-around is to redirect the standard input of the script using exec:

    # Bourne
    linecnt=0
    exec < /etc/passwd    # redirect standard input from the file /etc/passwd
    while read line       # "read" gets its input from the file /etc/passwd
    do
        linecnt=`expr $linecnt + 1`
    done
    echo "total number of lines: $linecnt"

This works as expected and prints a line count for the file /etc/passwd. However, standard input stays redirected from that file for the rest of the script. If we need to read the original standard input again later, we have to save a copy of the original standard input file descriptor first, and restore it afterwards:

    # Bourne
    exec 3<&0             # save original stdin file descriptor 0 as FD 3
    exec 0</etc/passwd    # redirect stdin from the file /etc/passwd

    linecnt=0
    while read line       # "read" gets its input from the file /etc/passwd
    do
        linecnt=`expr $linecnt + 1`
    done

    exec 0<&3             # restore saved stdin (FD 0) from FD 3
    exec 3<&-             # close the no-longer-needed FD 3

    echo "total number of lines: $linecnt"

Consecutive exec commands can be combined into a single exec, whose redirections are processed left to right:

    exec 3<&0
    exec 0</etc/passwd
    # ...read redirected standard input...
    exec 0<&3
    exec 3<&-

is equivalent to

    exec 3<&0 0</etc/passwd
    # ...read redirected standard input...
    exec 0<&3 3<&-
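
If the only goal is to leave standard input untouched while the loop reads its data, one alternative is to open the input on a separate descriptor for the loop alone, so nothing has to be saved or restored. This is a sketch of that variation and not the method described above. The here-document on FD 3 stands in for a real file; with a file, the last line would be a redirection such as 3< /etc/passwd.

```shell
# Read loop input from FD 3; FD 0 (standard input) is never redirected.
count=0
while IFS= read -r line <&3; do
    count=$((count + 1))
done 3<<EOF
first
second
EOF
echo "lines read: $count"
```

Commands inside the loop body can still read from the original standard input, because only read is pointed at FD 3.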

Another useful trick (using Bash syntax) is breaking a variable into words using read:

    # Bash
    echo "$foo" | read a b c      # this doesn't work
    read a b c <<< "$foo"         # but this does

Again, the pipeline causes the read command in the first example to run in a subshell, so its effect is never seen in the parent shell. The second example does not create a subshell, so it works as expected. The <<< operator is not part of POSIX sh; bash has supported it since 2.05b (ksh93 and zsh provide it as well), and the input which follows it is usually called a "here string".
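
For a shell that lacks <<<, a here-document can serve the same purpose. The sketch below assumes a variable foo holding four words (the value is made up for illustration). It also shows that the last variable named in read receives the rest of the line:

```shell
# Split a variable into words with read, using a here-document instead of <<<.
foo="alpha beta gamma delta"
read -r a b c <<EOF
$foo
EOF
echo "a=$a b=$b c=$c"    # c holds the remainder: "gamma delta"
```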

For more examples of how to break input into words, see FAQ #1.

- Revision 1 as of 2007-05-02 23:11:07
  Size: 3986
  Editor: redondos
  Comment:
+ Revision 7 as of 2008-11-22 21:33:10
  Size: 5012
  Editor: GreyCat
  Comment: first-line
-Deletions are marked like this.
+Additions are marked like this.
 Line 1:
-[[Anchor(faq24)]]
== I set variables in a loop. Why do they suddenly disappear after the loop terminates? ==
+<<Anchor(faq24)>>
== I set variables in a loop. Why do they suddenly disappear after the loop terminates? Or, why can't I pipe data to read? ==
In most shells, each command of a pipeline is executed in a separate SubShell.
-Line 7:
+Line 8:
+    # Non-working example (except in ksh88/ksh93)
-Line 15:
+Line 17:
-The reason for this surprising behaviour is that a {{{while/for/until}}} loop runs in a subshell when its input or output is redirected from a pipeline. For the {{{while}}} loop above, a new subshell with its own copy of the variable {{{linecnt}}} is created (initial value, taken from the parent shell: "0"). This copy then is used for counting. When the {{{while}}} loop is finished, the subshell copy is discarded, and the original variable {{{linecnt}}} of the parent (whose value has not changed) is used in the {{{echo}}} command.
+The reason for this surprising behaviour is that a {{{while/for/until}}} loop runs in a SubShell when it's part of a pipeline. For the {{{while}}} loop above, a new subshell with its own copy of the variable {{{linecnt}}} is created (initial value, taken from the parent shell: "0"). This copy then is used for counting. When the {{{while}}} loop is finished, the subshell copy is discarded, and the original variable {{{linecnt}}} of the parent (whose value has not changed) is used in the {{{echo}}} command.
-Line 17:
+Line 19:
-It's hard to tell when shell would create a new process for a loop:
 * BourneShell creates it when the input or output is redirected, either by using a pipeline or by a redirection operator ('<', '>').
 * ["BASH"] creates a new process only if the loop is part of a pipeline
 * KornShell creates it only if the loop is part of a pipeline, but ''not'' if the loop is the last part of it.
+Different shells behave differently when using redirection or pipes with a loop:
 * BourneShell creates a subshell when the input or output of a loop is redirected, either by using a pipeline or by a redirection operator ('<', '>').
 * [[BASH]] creates a new process only if the loop is part of a pipeline
 * KornShell creates it only if the loop is part of a pipeline, but ''not'' if the loop is the last part of it.  (The example above actually ''works'' in ksh88 and ksh93!)
 * POSIX specifies the bash behaviour, but as an extension allows any or all of the parts of the pipeline to run without a subshell (thus permitting the KornShell behaviour, as well).
-Line 22:
+Line 25:
-To solve this, either use a method that works without a subshell (shown below), or make sure you do all processing inside that subshell (a bit of a kludge, but easier to work with):
+To solve this, either use a method that works without a subshell, or make sure you do all processing inside that subshell (a bit of a kludge, but often easier to work with):
-Line 25:
+Line 28:
+    # POSIX
-Line 29:
+Line 33:
-                linecnt="$((linecnt+1))"
+                linecnt=$(($linecnt+1))
-Line 35:
+Line 39:
-To avoid the subshell completely (not easily possible if the other part of the pipe is a command!), use redirection, which does not have this problem at least for ["BASH"] and KornShell (but still for BourneShell):
+To avoid the subshell completely (not easily possible if the other part of the pipe is a command!), use redirection, which does not have this problem (at least for [[BASH]] and KornShell):
-Line 38:
+Line 42:
+    # POSIX
-Line 40:
+Line 45:
-        linecnt="$((linecnt+1))"
+        linecnt=$(($linecnt+1))
-Line 45:
+Line 50:
-For ["BASH"], when the first part of the pipe is a command, you can use "process substitution". The command used here is a simple "echo -e $'a\nb\nc'" as a substitute for a command with a multiline output:
+For [[BASH]], when the input of the pipe is a command rather than a file, you can use ProcessSubstitution:
-Line 48:
+Line 53:
+    # Bash
-Line 50:
+Line 56:
-    done < <(echo -e $'a\nb\nc')
+    done < <(grep PATH /etc/profile)
-Line 53:
+Line 59:
-A portable and common work-around is to redirect the input of the {{{read}}} command using {{{exec}}}:
+If you're reading from a plain file, a portable and common work-around is to redirect the standard input of the script using {{{exec}}}:
-Line 56:
+Line 62:
+    # Bourne
-Line 65:
+Line 72:
-This works as expected, and prints a line count for the file /etc/passwd. But the input is redirected from that file permanently. What if we need to read the original standard input sometime later again? In that case we have to save a copy of the original standard input file descriptor, which we later can restore:
+This works as expected, and prints a line count for the file {{{/etc/passwd}}}. But the input is redirected from that file permanently. What if we need to read the original standard input sometime later again? In that case we have to save a copy of the original standard input file descriptor, which we later can restore:
-Line 68:
+Line 75:
-    exec 3<&0    # save original standard input file descriptor "0" as FD "3"
    exec 0</etc/passwd    # redirect standard input from the file /etc/passwd
+    # Bourne
    exec 3<&0             # save original stdin file descriptor 0 as FD 3
    exec 0</etc/passwd    # redirect stdin from the file /etc/passwd
-Line 77:
+Line 85:
-    exec 0<&3   # restore saved standard input (fd 0) from file descriptor "3"
    exec 3<&-   # close the no longer needed file descriptor "3"
+    exec 0<&3             # restore saved stdin (FD 0) from FD 3
    exec 3<&-             # close the no-longer-needed FD 3
-Line 100:
+Line 108:
+Another useful trick (using Bash syntax) is breaking a variable into words using {{{read}}}:

{{{
    # Bash
    echo "$foo" | read a b c      # this doesn't work
    read a b c <<< "$foo"         # but this does
}}}

Again, the pipeline causes the {{{read}}} command in the first example to run in a subshell, so its effect is never witnessed in the parent process.  The second example does not create any subshells, so it works as we expect.  The {{{<<<}}} operator is specific to bash (2.05b and later), and the input which follows it is usually called a "here string".

For more examples of how to break input into words, see [[BashFAQ/001|FAQ #1]].