The correct answer to every exercise is actually "because set -e is crap".

However, some people want more detailed explanations. So, here you go:


Exercise 1: why doesn't this example print anything?

   1 #!/usr/bin/env bash
   2 set -e
   3 i=0
   4 let i++
   5 echo "i is $i"

According to the manual, set -e exits "if a simple command (see SHELL GRAMMAR above) exits with a non-zero status. The shell does not exit if the command that fails is part of the command list immediately following a while or until keyword, part of the test in a if statement, part of an && or || list, or if the command's return value is being inverted via !".

The let command is a simple command, and it doesn't qualify for any of the exceptions in the above list. Moreover, help let tells us "If the last ARG evaluates to 0, let returns 1; 0 is returned otherwise." i++ evaluates to 0, so let i++ returns 1 and trips the set -e. The script aborts. Because we added 1 to a variable.


Exercise 2: why does this one sometimes appear to work? In which versions of bash does it work, and in which versions does it fail?

   1 #!/usr/bin/env bash
   2 set -e
   3 i=0
   4 ((i++))
   5 echo "i is $i"

((...)) does not qualify as a simple command according to the shell grammar. So it is not eligible to trigger a set -e abort, even though it still returns 1 in this particular instance (because i++ evaluates to 0 while setting i to 1, and because 0 is considered false in a math context).

However, this behavior changed in bash 4.1. Exercise 2 works only in bash 4.0 and earlier! In bash 4.1, ((...)) qualifies for set -e abortion, and this exercise will print nothing, the same as Exercise 1.

This reinforces my point about how unreliable set -e is. You can't even count on it to behave consistently across point-releases of a shell.


Exercise 3: why aren't these two scripts identical?

   1 #!/usr/bin/env bash
   2 set -e
   3 test -d nosuchdir && echo no dir
   4 echo survived

   1 #!/usr/bin/env bash
   2 set -e
   3 fn() { test -d nosuchdir && echo no dir; }
   4 fn
   5 echo survived

In the first script, the test command is "part of any command executed in a && or || list except the command following the final && or ||" (Bash 4.2 man page), so it does not cause the shell to exit.

In the second script, that is also true, so the shell does not exit immediately after the test ...&& command. However, the function fn returns 1 (failure) because that was the exit status of the last command executed in the function. The simple command fn in the main body of the script therefore returns 1 (failure), which causes the shell to exit.


Exercise 4: why aren't these two scripts identical?

   1 set -e
   2 fn() { test -d nosuchdir && echo no dir; }
   3 fn
   4 echo survived

   1 set -e
   2 fn() { if test -d nosuchdir; then echo no dir; fi; }
   3 fn
   4 echo survived

The first script above is the same as the second script from exercise 3. See previous answer for an explanation of that one.

In the second script, we observe one of the ways in which if and && are not the same. In the manual, under Compound Commands, we find this sentence in the definition of if:

Since the terms "command" and "condition" aren't particularly differentiated in the man page, following the first part of this explanation from man bash, it might seem as though the exit status of the if structure should be the exit status of the test command, however, in this case the test command is a condition of the if logical structure, so the second part of the above explanation applies. If test, or any list of commands in the conditional position of the logical structure, returns an exit code unequal to 0, and there are no elif or else branches, then the exit code of the entire if structure is still 0.

Worded a little more clearly, since the test is not true, and no subsequent commands are executed, if must return 0. This means fn returns 0, and the shell does not exit.


Exercise 5: under what conditions will this fail?

   1 set -e
   2 read -r foo < configfile

Obviously, this will abort if configfile is missing or unreadable. It will also abort (probably unexpectedly) if the file is missing a terminating newline. This happens because read returns a failure code when it reaches end of file before reading the expected newline. However, the file's contents are still read, and the variable is still populated.

Without the set -e, the script would populate the variable correctly and move on, and the fact that the file is "incomplete" wouldn't be an issue. Note that using a while or if list with read will populate the variable either way, as both are exceptions by set -e rules.

BashFAQ/105/Answers (last edited 2022-08-26 20:03:23 by 138)