Differences between revisions 8 and 37 (spanning 29 versions)
Revision 8 as of 2009-03-24 08:42:31
Size: 4064
Editor: pgas
Comment: major changes, delegates the bourne shell workarounds, add named pipes, coprocess and here documents
Revision 37 as of 2021-06-19 13:48:59
Size: 6954
Editor: 178235191244
Comment: Forgot to remote -p from bash coproc version vs ksh....

I set variables in a loop that's in a pipeline. Why do they disappear after the loop terminates? Or, why can't I pipe data to read?

In most shells, each command of a pipeline is executed in a separate SubShell. Non-working example:

    # Works only in ksh88/ksh93, zsh, or bash 4.2 and later with lastpipe enabled
    # In other shells, this will print 0
    linecount=0

    printf '%s\n' foo bar |
    while IFS= read -r line
    do
        linecount=$((linecount + 1))
    done

    echo "total number of lines: $linecount"

The reason for this potentially surprising behaviour, as described above, is that each SubShell introduces a new variable context and environment. The while loop above is executed in a new subshell with its own copy of the variable linecount created with the initial value of '0' taken from the parent shell. This copy then is used for counting. When the while loop is finished, the subshell copy is discarded, and the original variable linecount of the parent (whose value hasn't changed) is used in the echo command.
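
The copy-then-discard behaviour can be reproduced without any pipeline by using an explicit ( ) subshell. This is only an illustrative sketch; the variable name count is made up for the example.

```shell
# The parentheses force a subshell, which is what a pipeline does implicitly.
count=0

(
    count=5                        # changes only the subshell's copy
    echo "inside subshell: $count"
)

echo "in parent shell: $count"     # the parent's copy is untouched
```

The first echo reports 5 and the second reports 0, which is the same effect the pipeline example has on linecount.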

Different shells exhibit different behaviors in this situation:

  • BourneShell creates a subshell when the input or output of anything other than a simple command (loops, case, etc.) is redirected, either by using a pipeline or by a redirection operator ('<', '>').

  • BASH, Yash and PDKsh-derived shells create a new process only if the loop is part of a pipeline.

  • KornShell and Zsh create one only if the loop is part of a pipeline, and not if the loop is the last part of it. The read example above actually works in ksh88, ksh93 and zsh (but not in MKsh or other PDKsh-derived shells).

  • POSIX specifies the bash behaviour, but as an extension allows any or all of the parts of the pipeline to run without a subshell (thus permitting the KornShell behaviour, as well).
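
If you are unsure which of these behaviours your shell has, a small probe along the following lines may help. It is a sketch, not an exhaustive test; the variable name probe and the messages are invented for the example.

```shell
# Hypothetical probe: does the last element of a pipeline run in the current shell?
probe=parent
echo child | read -r probe

if [ "$probe" = child ]; then
    echo "last pipeline element ran in the current shell"
else
    echo "last pipeline element ran in a subshell"
fi
```

Shells with the KornShell behaviour should take the first branch, and shells with the bash behaviour (without lastpipe) should take the second.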

More broken stuff:

    # Bash 4
    # The problem also occurs without a loop
    printf '%s\n' foo bar | mapfile -t line
    printf 'total number of lines: %s\n' "${#line[@]}" # prints 0

    f() {
        if [[ -t 0 ]]; then
            echo "$1"
        else
            read -r var
        fi
    };

    f 'hello' | f
    echo "$var" # prints nothing

Again, in both cases the pipeline causes read or some containing command to run in a subshell, so its effect is never witnessed in the parent process.

  • It should be stressed that this issue isn't specific to loops. It's a general property of all pipelines, though the "while/read" loop is the canonical example: it crops up over and over when people read the help or manpage description of the read builtin and notice that it accepts data on stdin. They might recall that data redirected into a compound command is available throughout that command, but not understand why the process substitutions and redirects they run across in places like FAQ #1 are necessary. Naturally, they put their commands directly into a pipeline, and confusion ensues.
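
To illustrate that neither a loop nor read is required, here is a minimal sketch in which a plain assignment sits in the first stage of a pipeline. The variable name stage is hypothetical.

```shell
stage=unchanged

# The assignment happens in the first stage of a pipeline; there is no loop and no read.
{ stage=changed; echo data; } | cat >/dev/null

echo "stage is: $stage"
```

In the shells discussed on this page the parent should still see the old value, since even ksh and zsh keep only the last stage of a pipeline in the current shell.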

Workarounds

  • If the input is a file, a simple redirect will suffice:
        # POSIX
        linecount=0
        while IFS= read -r line; do linecount=$((linecount + 1)); done < file
        echo "$linecount"

    Unfortunately, this doesn't work with a Bourne shell; see sh(1) from the Heirloom Bourne Shell for a workaround.

  • Use command grouping and do everything in the subshell:

        # POSIX
        linecount=0
    
        cat /etc/passwd |
        {
            while IFS= read -r line
            do
                linecount=$((linecount + 1))
            done
    
            echo "total number of lines: $linecount"
        }
    This doesn't really change the subshell situation, but if nothing from the subshell is needed in the rest of your code then destroying the local environment after you're through with it could be just what you want anyway.
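
If a single value from the grouped commands is needed afterwards, one option is to print it and capture it with command substitution. The following is a sketch under that assumption; the inner variable n and the sample input are made up for the example.

```shell
# Everything inside $( ... ) still runs in a subshell; only its stdout survives.
linecount=$(
    printf '%s\n' foo bar baz |
    {
        n=0
        while IFS= read -r line
        do
            n=$((n + 1))
        done
        echo "$n"
    }
)

echo "total number of lines: $linecount"
```

This passes exactly one string back to the parent, so it suits a counter or a short result better than a set of several variables.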
  • Use ProcessSubstitution (Bash/Zsh/Ksh93 only):

        # Bash/Ksh93/Zsh
        while IFS= read -r line
        do
            ((linecount++))
        done < <(grep PATH /etc/profile)
    
        echo "total number of lines: $linecount"
    This is essentially identical to the first workaround above. We still redirect a file, only this time the file happens to be a named pipe (or a /dev/fd entry, on systems that provide one) temporarily created by our process substitution to transport the output of grep.
  • Use a named pipe:

        # POSIX
        mkfifo mypipe
        grep PATH /etc/profile > mypipe &
    
        while IFS= read -r line
        do
            linecount=$((linecount + 1))
        done < mypipe
    
        echo "total number of lines: $linecount"
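
The example above leaves mypipe behind in the current directory. A variant that creates the FIFO in a private temporary directory and removes it afterwards might look like the sketch below. It assumes mktemp -d is available (common, but not required by POSIX); the names tmpdir and fifo and the sample input are invented for the example.

```shell
# Assumes mktemp -d exists on this system.
tmpdir=$(mktemp -d) || exit 1
fifo=$tmpdir/mypipe

mkfifo "$fifo" || exit 1
printf '%s\n' foo bar baz > "$fifo" &    # writer blocks until a reader opens the FIFO

linecount=0
while IFS= read -r line
do
    linecount=$((linecount + 1))
done < "$fifo"

wait                 # reap the background writer
rm -rf "$tmpdir"     # remove the FIFO and its directory

echo "total number of lines: $linecount"
```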
  • Use a coprocess (ksh, even pdksh, oksh, mksh..):

        # ksh
        grep PATH /etc/profile |&
    
        while IFS= read -r -p line
        do
            linecount=$((linecount + 1))
        done
    
        echo "total number of lines: $linecount"
    
        # Bash 4.0 and later
        coproc grep PATH /etc/profile
    
        while IFS= read -r line
        do
            linecount=$((linecount + 1))
        done <&"${COPROC[0]}"
    
        echo "total number of lines: $linecount"
  • Use a HereString (Bash/Zsh/Ksh93 only). The example uses the Bash-specific read -a; Ksh93 and Zsh use read -A instead:

         # Options:
         # -r Backslash does not act as an escape character for the word separators or line delimiter.
         # -a The words are assigned to sequential indices of the array "words"
    
         read -ra words <<< 'hi ho hum'
         printf 'total number of words: %d\n' "${#words[@]}"

    The <<< operator is available in Bash (2.05b and later), Zsh (where it was first introduced, inspired by a similar operator in the Unix port of the rc shell), Ksh93 and Yash.

  • With a POSIX shell, or for longer multi-line data, you can use a here document instead:
        # POSIX
        linecount=0
        while IFS= read -r line; do
            linecount=$((linecount+1))
        done <<EOF
        hi
        ho
        hum
        EOF
    
        printf 'total number of lines: %d\n' "$linecount"
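
A here document can also stand in for a here string when the goal is to split the contents of a variable into words. A minimal sketch, with foo, a, b and c as example names:

```shell
foo='hi ho hum'

# The here document is expanded first, so read receives the value of $foo.
read -r a b c <<EOF
$foo
EOF

echo "a=$a b=$b c=$c"
```

Because read runs in the current shell here, a, b and c remain set afterwards (hi, ho and hum respectively).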
  • Use lastpipe (Bash 4.2 and later):
         # Bash 4.2 and later
         # +m: Disable monitor mode (job control). Background processes display their
         #     exit status upon completion when in monitor mode (we don't want that).
         set +m
         shopt -s lastpipe
    
         x=0
         printf '%s\n' hi{,,,,,} | while IFS= read -r "lines[x++]"; do :; done
         printf 'total number of lines: %d\n' "${#lines[@]}"
    Bash 4.2 introduces the aforementioned ksh-like behavior to Bash. The one caveat is that job control must not be enabled, thereby limiting its usefulness in an interactive shell.
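
In a script that may be run by different shells or by an older Bash, it can be useful to check whether lastpipe could actually be enabled before relying on it. The following is one possible sketch; the variable name total and the messages are made up for the example.

```shell
total=0

# shopt -s lastpipe fails on Bash older than 4.2; other shells lack BASH_VERSION.
if [ -n "${BASH_VERSION:-}" ] && shopt -s lastpipe 2>/dev/null; then
    set +m    # lastpipe only takes effect while job control is off
    printf '%s\n' a b c | while IFS= read -r line; do total=$((total + 1)); done
    echo "lastpipe active, total: $total"
else
    echo "lastpipe not available in this shell"
fi
```

When the check fails, the script can fall back to one of the portable workarounds above instead of silently printing 0.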

For more related examples of how to read input and break it into words, see FAQ #1.


CategoryShell

BashFAQ/024 (last edited 2023-12-12 13:15:33 by 195)