Differences between revisions 16 and 43 (spanning 27 versions)
Revision 16 as of 2011-09-10 11:00:11
Size: 6351
Editor: ormaaj
Comment: spelling
Revision 43 as of 2023-12-12 13:15:33
Size: 6582
Editor: 195
Comment: clarify when/why +m is needed
I set variables in a loop that's in a pipeline. Why do they disappear after the loop terminates? Or, why can't I pipe data to read?

In most shells, each command of a pipeline is executed in a separate SubShell. Non-working example:

# Works only in ksh88/ksh93 or zsh, or in bash 4.2 and later with lastpipe enabled
# In other shells, this will print 0
linecount=0

printf '%s\n' foo bar |
while IFS= read -r line
do
    linecount=$((linecount + 1))
done

echo "total number of lines: $linecount"

The reason for this potentially surprising behaviour, as noted above, is that each SubShell introduces a new variable context and environment. The while loop above is executed in a new subshell with its own copy of the variable linecount, created with the initial value of '0' taken from the parent shell. This copy is then used for counting. When the while loop is finished, the subshell copy is discarded, and the original variable linecount of the parent (whose value hasn't changed) is used in the echo command.
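
The same effect can be reproduced without any pipeline. A minimal sketch using an explicit subshell (the variable name count is only illustrative, it does not come from the example above):

```shell
# POSIX
count=0
(
    # The subshell starts with a copy of count (0) and changes only that copy.
    count=$((count + 5))
    echo "inside subshell: $count"
)
# The parent's variable was never touched.
echo "after subshell: $count"
```

The first echo reports 5 and the second reports 0, because the assignment happened in the discarded copy.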

Different shells exhibit different behaviors in this situation:

  • BourneShell creates a subshell when the input or output of anything (loops, case, etc.) but a simple command is redirected, either by using a pipeline or by a redirection operator ('<', '>').

  • BASH, Yash and PDKsh-derived shells create a new process only if the loop is part of a pipeline.

  • KornShell and Zsh create it only if the loop is part of a pipeline, but not if the loop is the last part of it. The read example above actually works in ksh88, ksh93 and zsh! (but not in MKsh or other PDKsh-derived shells)

  • POSIX specifies the bash behaviour, but as an extension allows any or all of the parts of the pipeline to run without a subshell (thus permitting the KornShell behaviour, as well).
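
To find out which of these behaviours the shell you are running has, a small probe along the following lines may help. This is only a sketch, and the variable name probe is made up for illustration:

```shell
# POSIX
probe=parent

# read is the last element of the pipeline. Whether its assignment
# survives depends on the shell (and, in Bash, on lastpipe).
echo child | read -r probe

case $probe in
    child) echo "last pipeline element ran in the current shell" ;;
    *)     echo "last pipeline element ran in a subshell" ;;
esac
```

The probe only tests the last element of a pipeline; earlier elements run in subshells in all of the shells listed above.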

More broken stuff:

# Bash 4
# The problem also occurs without a loop
printf '%s\n' foo bar | mapfile -t line
printf 'total number of lines: %s\n' "${#line[@]}" # prints 0

f() {
    if [[ -t 0 ]]; then
        echo "$1"
    else
        read -r var
    fi
}

f 'hello' | f
echo "$var" # prints nothing

Again, in both cases the pipeline causes read or some containing command to run in a subshell, so its effect is never witnessed in the parent process.

It should be stressed that this issue isn't specific to loops. It's a general property of all pipes, though the while/read loop might be considered the canonical example that crops up over and over when people read the help or manpage description of the read builtin and notice that it accepts data on stdin. They might recall that data redirected into a compound command is available throughout that command, but not understand why all the fancy process substitutions and redirects they run across in places like FAQ #1 are necessary. Naturally they proceed to put their funstuff directly into a pipeline, and confusion ensues.
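
The contrast can be seen with a single read and no loop at all. In this sketch (the variable names piped and redirected are illustrative), the piped read loses its result in most shells, while the redirected read keeps it everywhere:

```shell
# POSIX
piped=unset
redirected=unset

# read is part of a pipeline: in most shells it runs in a subshell.
echo hello | read -r piped

# read takes its input from a redirection: it runs in the current shell.
read -r redirected <<EOF
hello
EOF

echo "piped: $piped"            # shell dependent
echo "redirected: $redirected"
```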

Workarounds

  • If the input is a file, a simple redirect will suffice:
    # POSIX
    linecount=0
    while IFS= read -r line; do linecount=$((linecount + 1)); done < file
    echo "$linecount"

    Unfortunately, this doesn't work with a Bourne shell; see sh(1) from the Heirloom Bourne Shell for a workaround.

  • Use command grouping and do everything in the subshell:

    # POSIX
    linecount=0
    
    cat /etc/passwd |
    {
        while IFS= read -r line
        do
            linecount=$((linecount + 1))
        done
    
        echo "total number of lines: $linecount"
    }

    This doesn't really change the subshell situation, but if nothing from the subshell is needed in the rest of your code then destroying the local environment after you're through with it could be just what you want anyway.

  • Use ProcessSubstitution (Bash/Zsh/Ksh93 only):

    # Bash/Ksh93/Zsh
    while IFS= read -r line
    do
        ((linecount++))
    done < <(grep PATH /etc/profile)
    
    echo "total number of lines: $linecount"

    This is essentially identical to the first workaround above. We still redirect a file, only this time the file happens to be a named pipe (or, where the system provides them, a /dev/fd entry) temporarily created by our process substitution to transport the output of grep.

  • Use a named pipe:

    # POSIX
    mkfifo mypipe
    grep PATH /etc/profile > mypipe &
    
    linecount=0
    while IFS= read -r line
    do
        linecount=$((linecount + 1))
    done < mypipe
    
    rm mypipe    # remove the named pipe once it is no longer needed
    echo "total number of lines: $linecount"
  • Use a coprocess (ksh, including pdksh, oksh and mksh; Bash 4 and later has its own coproc keyword):

    # ksh
    grep PATH /etc/profile |&
    
    while IFS= read -r -p line
    do
        linecount=$((linecount + 1))
    done
    
    echo "total number of lines: $linecount"

    # Bash 4.0 and later
    coproc grep PATH /etc/profile
    
    while IFS= read -r line
    do
        linecount=$((linecount + 1))
    done <&"${COPROC[0]}"
    
    echo "total number of lines: $linecount"
  • Use a HereString (Bash, Zsh, Ksh93 and Yash). The example uses the Bash-specific read -a; Ksh93 and Zsh use read -A instead:

    # Options:
    # -r Backslash does not act as an escape character for the word separators or line delimiter.
    # -a The words are assigned to sequential indices of the array "words"
    
    read -ra words <<< 'hi ho hum'
    printf 'total number of words: %d\n' "${#words[@]}"

    The <<< operator is available in Bash (2.05b and later), Zsh (where it was first introduced, inspired by a similar operator in the Unix port of the rc shell), Ksh93 and Yash.

  • With a POSIX shell, or for longer multi-line data, you can use a here document instead:
    # POSIX
    linecount=0
    while IFS= read -r line; do
        linecount=$((linecount+1))
    done <<EOF
    hi
    ho
    hum
    EOF
    
    printf 'total number of lines: %d\n' "$linecount"
  • Use lastpipe (Bash 4.2 and later):
    # Bash 4.2
    # +m: Disable monitor mode (job control) in an interactive shell since it is
    # on by default there and it needs to be disabled for lastpipe to work.
    set +m
    shopt -s lastpipe
    
    x=0
    printf '%s\n' hi{,,,,,} | while IFS= read -r 'lines[x++]'; do :; done
    printf 'total number of lines: %d\n' "${#lines[@]}"

    Bash 4.2 introduces the aforementioned ksh-like behavior to Bash. The one caveat is that job control must not be enabled, thereby limiting its usefulness in an interactive shell.
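
In a script, job control is normally off, so lastpipe works there without set +m. The following sketch assumes Bash 4.2 or later run non-interactively; it also checks $- for the m flag before relying on lastpipe (the warning text is illustrative):

```shell
# Bash 4.2 or later, run as a script
shopt -s lastpipe

# $- lists the active option letters; "m" means monitor mode (job control) is on.
case $- in
    *m*) echo "job control is on: lastpipe will have no effect" >&2 ;;
esac

linecount=0
printf '%s\n' foo bar baz |
while IFS= read -r line
do
    linecount=$((linecount + 1))
done

echo "total number of lines: $linecount"
```

With job control off, the while loop runs in the current shell and the final count is 3.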

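If a value computed inside a pipeline's subshell turns out to be needed afterwards, one common option is to print it at the end of the command group and capture it with a command substitution. This is not one of the workarounds listed above, so treat it as a sketch; the inner variable name count is illustrative:

```shell
# POSIX
linecount=$(
    printf '%s\n' foo bar baz |
    {
        count=0
        while IFS= read -r line
        do
            count=$((count + 1))
        done
        # Only this output leaves the subshell.
        echo "$count"
    }
)

echo "total number of lines: $linecount"
```

This passes a single string back to the parent shell, so it suits a counter or a short result better than several variables.
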
For more related examples of how to read input and break it into words, see FAQ #1.



BashFAQ/024 (last edited 2023-12-12 13:15:33 by 195)