Differences between revisions 1 and 21 (spanning 20 versions)
Revision 1 as of 2010-08-18 16:20:29
Size: 1433
Editor: GreyCat
Comment: Why doesn't set -e (set -o errexit) do what I expected?
Revision 21 as of 2016-12-09 23:53:20
Size: 5239
Editor: AladW
Comment: put output as comments
== Why doesn't set -e (or set -o errexit, or trap ERR) do what I expected? ==
`set -e` was an attempt to add "automatic error detection" to the shell. Its goal was to cause the shell to abort any time an error occurred, so you don't have to put `|| exit 1` after each important command.

That goal is non-trivial, because many commands intentionally return non-zero. For example,

  if [ -d /foo ]; then ...; else ...; fi

Clearly we don't want to abort when the conditional, `[ -d /foo ]`, returns non-zero (because the directory does not exist) -- our script wants to handle that in the `else` part. So the implementors decided to make a bunch of special rules, like "commands that are part of an `if` test are immune", or "commands in a pipeline, other than the last one, are immune".

These rules are extremely convoluted, and they still fail to catch even some remarkably simple cases. Even worse, the rules ''change'' from one Bash version to another, as Bash attempts to track the extremely slippery POSIX definition of this "feature". When a SubShell is involved, it gets worse still -- the behavior changes depending on whether Bash is invoked in POSIX mode. [[http://fvue.nl/wiki/Bash:_Error_handling|Another wiki]] has a page that covers this in more detail. Be sure to check the caveats.

''Exercise for the reader: why doesn't this example print anything?''

{{{#!highlight bash
#!/usr/bin/env bash
set -e
i=0
let i++
echo "i is $i"
}}}
''Exercise 2: why does '''this''' one sometimes appear to work? In which versions of bash does it work, and in which versions does it fail?''

{{{#!highlight bash
#!/usr/bin/env bash
set -e
i=0
((i++))
echo "i is $i"
}}}
''Exercise 3: why aren't these two scripts identical?''

{{{#!highlight bash
#!/usr/bin/env bash
set -e
test -d nosuchdir && echo no dir
echo survived
}}}
{{{#!highlight bash
#!/usr/bin/env bash
set -e
f() { test -d nosuchdir && echo no dir; }
f
echo survived
}}}
''Exercise 4: why aren't '''these''' two scripts identical?''

{{{#!highlight bash
set -e
f() { test -d nosuchdir && echo no dir; }
f
echo survived
}}}
{{{#!highlight bash
set -e
f() { if test -d nosuchdir; then echo no dir; fi; }
f
echo survived
}}}
''Exercise 5: under what conditions will this fail?''

{{{#!highlight bash
set -e
read -r foo < configfile
}}}
([[BashFAQ/105/Answers|Answers]])

Even if you use `expr(1)` (which we ''do not'' recommend -- use [[ArithmeticExpression|arithmetic expressions]] instead), you still run into the same problem:

{{{#!highlight bash
set -e
foo=$(expr 1 - 1)
# The following command will not be executed:
echo survived
}}}
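
The reason is the exit status of `expr`, which is 1 whenever the result is null or zero, even though nothing went wrong. The following is a minimal sketch (the variable names are ours, not part of the FAQ) that prints the status of both forms without `set -e` in effect:

```bash
#!/usr/bin/env bash
# No set -e here, so both statuses can be captured and printed.

result=$(expr 1 - 1); expr_status=$?
echo "expr:       result=$result status=$expr_status"     # result=0 status=1

result=$((1 - 1)); arith_status=$?
echo "arithmetic: result=$result status=$arith_status"    # result=0 status=0
```

The arithmetic expansion yields the same value but leaves a zero status, so the assignment does not trip `set -e`.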
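Command substitutions run in subshells, however, and outside POSIX mode Bash clears `set -e` inside them (unless the `inherit_errexit` shell option, added in Bash 4.4, is enabled). A failure that is not the last command in the substitution therefore goes unnoticed: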

{{{#!highlight bash
set -e
foo=$(expr 1 - 1; true)
# Will run:
echo survived
}}}
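
To see the effect of `inherit_errexit`, here is a rough sketch that runs the same substitution twice in a child `bash`, once with default settings and once with the option enabled. It assumes a `bash` on `PATH` that is not in POSIX mode; the variable names and the `unsupported` marker are made up for this illustration.

```bash
#!/usr/bin/env bash
# Each case runs in a child bash so that an abort cannot end this script.

# Default mode: errexit is cleared inside $(...), so "echo reached" still runs.
default_out=$(bash -c 'set -e; out=$(false; echo reached); echo "$out"')

# With inherit_errexit the substitution stops at "false"; the assignment
# then fails, and the outer set -e ends the child shell before its echo.
inherit_out=$(bash -c '
  shopt -s inherit_errexit 2>/dev/null || { echo unsupported; exit 0; }
  set -e
  out=$(false; echo reached)
  echo "$out"
') || true

echo "default:         ${default_out:-<nothing>}"
echo "inherit_errexit: ${inherit_out:-<nothing>}"
```

On a Bash older than 4.4 the second case reports `unsupported` instead of printing nothing.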
Note that `set -e` is '''not''' unset for commands that are run asynchronously, for example with process substitution:

{{{#!highlight bash
set -e
mapfile foo < <(true; echo foo)
echo ${foo[-1]} # foo
mapfile foo < <(false; echo foo)
echo ${foo[-1]} # bash: foo: bad array subscript
}}}
Another pitfall associated with `set -e` occurs when you use commands that ''look like'' assignments but aren't, such as `export`, `declare`, `typeset` or `local`.

{{{#!highlight bash
set -e
f() { local var=$(somecommand that fails); }
f # will not exit

g() { local var; var=$(somecommand that fails); }
g # will exit
}}}
In function `f`, the exit status of `somecommand` is discarded. It won't trigger the `set -e` because the exit status of `local` masks it (the assignment to the variable succeeds, so `local` returns status 0). In function `g`, the `set -e` is triggered because it uses a ''real'' assignment which returns the exit status of `somecommand`.
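
The masking can be observed directly by printing `$?` after each form. This is a small sketch with `false` standing in for the failing command and without `set -e`, so that both functions run to completion:

```bash
#!/usr/bin/env bash
# No set -e here: the point is only to look at the status each form leaves.

f() {
  local var=$(false)    # $? is the status of "local", not of the substitution
  echo "combined declaration: $?"
}

g() {
  local var
  var=$(false)          # plain assignment: $? is the status of the substitution
  echo "separate assignment: $?"
}

f   # combined declaration: 0
g   # separate assignment: 1
```

Declaring first and assigning on a separate line is what makes the failure visible, to `set -e` or to an explicit check.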

Using [[ProcessSubstitution|Process substitution]], the exit code is also discarded as it is not visible from the main script:

{{{#!highlight bash
set -e
cat <(somecommand that fails)
echo survived
}}}
Using a pipe makes no difference, as only the ''rightmost'' process is considered:

{{{#!highlight bash
set -e
somecommand that fails | cat -
echo survived
}}}
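
The individual statuses are not lost entirely: Bash records them in the `PIPESTATUS` array. A brief sketch, with `false` standing in for the failing command (`PIPESTATUS` is Bash-specific, so the pipeline is run in a child `bash`):

```bash
#!/usr/bin/env bash
# $? and PIPESTATUS are both expanded before echo runs, so both
# describe the preceding pipeline "false | true".
statuses=$(bash -c 'false | true; echo "pipeline=$? each=${PIPESTATUS[*]}"')
echo "$statuses"   # pipeline=0 each=1 0
```

The pipeline as a whole reports 0, which is all that `set -e` looks at.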
`set -o pipefail` works around this by making the pipeline return the exit code of the ''rightmost'' process that failed (or zero if none failed):

{{{#!highlight bash
set -e -o pipefail
failcmd1 | failcmd2 | cat -
# The following command will not be executed:
echo survived
}}}
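
A quick way to check which status a pipeline reports is to run it in a child shell and look at `$?`. In this sketch, `false` and `(exit N)` stand in for failing commands, and the variable names are only for illustration:

```bash
#!/usr/bin/env bash
# Each pipeline runs in a child bash; its exit status is the pipeline's status.

bash -c 'false | true'
plain=$?
bash -c 'set -o pipefail; false | true'
one_failure=$?
bash -c 'set -o pipefail; (exit 2) | (exit 3) | true'
two_failures=$?

echo "without pipefail:       $plain"          # 0: only the last command counts
echo "pipefail, one failure:  $one_failure"    # 1
echo "pipefail, two failures: $two_failures"   # 3: the rightmost failure
```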
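With `pipefail` in effect, however, code like this will sometimes cause an error, depending on whether the output of `somecmd` exceeds the size of the pipe buffer. If `somecmd` is still writing when `head` exits, it is killed by SIGPIPE and the pipeline reports a failure: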

{{{#!highlight bash
set -e -o pipefail
somecmd | head -n1
# The following command will sometimes be executed, depending on how much output somecmd writes:
echo survived
}}}
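
Both outcomes can be reproduced with producers whose output size is known. The sketch below uses `yes` as an endless producer and a single `printf` as a short one; these stand-ins are our choice, not part of the original example.

```bash
#!/usr/bin/env bash
# "yes" writes forever, so it is certain to be writing when head exits.
bash -c 'set -o pipefail; yes | head -n1 >/dev/null'
endless=$?

# One short line fits in the pipe buffer, so printf finishes normally.
bash -c 'set -o pipefail; printf "one line\n" | head -n1 >/dev/null'
short=$?

echo "endless producer: $endless"   # non-zero, typically 141 (128 + SIGPIPE)
echo "short producer:   $short"     # 0
```

If SIGPIPE is ignored in the calling environment, `yes` fails with a write error instead, so the exact non-zero value may differ.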
GreyCat's personal recommendation is simple: don't use `set -e`. Add your own error checking instead.

rking's personal recommendation is to go ahead and use `set -e`, but beware of possible gotchas. It has useful semantics, so to exclude it from the toolbox is to give in to FUD.

geirha's personal recommendation is to handle errors properly and not rely on the unreliable `set -e`.
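
For readers wondering what explicit error checking might look like, here is one possible sketch. The `die` helper and the file name `greeting.txt` are hypothetical, chosen only for this illustration; the point is that every command whose failure matters gets its own `||` branch.

```bash
#!/usr/bin/env bash
# die: print a message to stderr and exit with a non-zero status.
die() {
  printf '%s\n' "$*" >&2
  exit 1
}

workdir=$(mktemp -d) || die "could not create a temporary directory"
cd "$workdir" || die "could not cd to $workdir"
printf 'hello\n' > greeting.txt || die "could not write greeting.txt"
read -r greeting < greeting.txt || die "greeting.txt is empty or unreadable"
echo "read: $greeting"

# Clean up; workdir is known to be non-empty at this point.
cd / && rm -rf "$workdir"
```

Unlike `set -e`, this makes it explicit which failures are fatal and what message the user sees.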

BashFAQ/105 (last edited 2021-03-11 06:07:25 by dsl-66-36-156-249)