Sending and Trapping Signals

Signals are a basic tool for asynchronous interprocess communication. What that means is one process (A) can tell another process (B) to do something, at a time chosen by process A rather than process B. (Compare to process B looking for a file every few seconds; this is called polling, and the timing is controlled by process B, rather than process A.)

The operating system provides a finite number of signals which can be "sent" to tell a process to do something. The signals do not carry any additional information; the only information the process gets is which signal was received. The process does not even know who sent the signal.

Unless a process takes special action in advance, most signals are fatal; that is, the default action a process will perform upon receiving a signal is an immediate exit. (Exceptions: SIGCHLD is ignored by default, SIGSTOP pauses the process, and SIGCONT resumes the process.) Some signals (such as SIGQUIT) also cause a process to leave a core file, in addition to exiting.

1. Traps, or Signal Handlers

A process may choose to perform a different action, rather than exiting, upon receiving a signal. This is done by setting up a signal handler (or trap). The trap must be set before the signal is received. A process that receives a signal for which it has set a trap is said to have caught the signal.

The simplest signal handling a process can choose to perform is to ignore a signal. This is generally a bad idea, unless it is done for a very specific purpose. Ignoring signals often leads to runaway processes which consume all available CPU.

More commonly, traps can be set up to intercept a fatal signal, perform cleanup, and then exit gracefully. For example, a program that creates temporary files might wish to remove them before exiting. If the program is forced to exit by a signal, it won't be able to remove the files unless it catches the signal.

In a shell script, the command to set up a signal handler is trap. The trap command has 5 different ways to be used:

Signals may be specified using a number, or using a symbolic name. The symbolic name is greatly preferred for POSIX or Bash scripts, because the mapping from signal numbers to actual signals can vary slightly across operating systems. For Bourne shells, the numbers may be required.

There is a core set of signals common to all Unix-like operating systems whose numbers realistically never change; the most common of these are:

Name

Number

Meaning

HUP

1

Hang Up. The controlling terminal has gone away.

INT

2

Interrupt. The user has pressed the interrupt key (usually Ctrl-C or DEL).

QUIT

3

Quit. The user has pressed the quit key (usually Ctrl-\). Exit and dump core.

KILL

9

Kill. This signal cannot be caught or ignored. Unconditionally fatal. No cleanup possible.

TERM

15

Terminate. This is the default signal sent by the kill command.

EXIT

0

Not really a signal. In a shell script, an EXIT trap is run on any exit, signalled or not.

Bash accepts signal names with or without a leading SIG for both the trap and kill builtin commands unless it is in POSIX mode when kill does not accept a leading SIG but trap does.

Thus the leading SIG is never required so SIGHUP can always be trapped by using trap ... HUP and sent by using kill -HUP process_ID.

The special name EXIT is defined by POSIX and is preferred for any signal handler that simply wants to clean up upon exiting, rather than doing anything complex. Using 0 instead of EXIT is also allowed in a trap command (but 0 is not a valid signal number, and kill -0 has a completely different meaning). So to clean up, just trap EXIT and call a cleanup function from there. Don't trap a bunch of signals. Sadly, this only seems to work in Bash. Other shells, such as zsh or dash, trigger the EXIT trap only if no other signal caused the exit. One further caveat: This only works in non-interactive shells (scripts), EXIT is not called if interactive shells get signalled (possibly a bug).

If you are asking a program to terminate, you should always use SIGTERM (simply kill process_ID). This will give the program a chance to catch the signal and clean up. If you use SIGKILL, the program cannot clean up, and may leave files in a corrupted state.

Please see ProcessManagement for a more thorough explanation of how processes work and interact.

2. Examples

This is the basic method for setting up a trap to clean up temporary files:

#!/bin/sh
tempfile=$(mktemp)
trap 'rm -f "$tempfile"' EXIT
...

This example defines a signal handler that will re-read a configuration file on SIGHUP. This is a common technique used by long-running daemon processes, so that they do not need to be restarted from scratch when a configuration variable is changed.

#!/bin/sh
config=/etc/myscript/config
read_config() { test -r "$config" && . "$config"; }
read_config
trap 'read_config' HUP
while true; do
  ...

Setting a trap overwrites a previous trap on a given signal. There's no direct way to access the command associated with a trap as a string in order to save and restore the state of a trap. However, trap creates properly escaped output that's safe for reuse. The POSIX-recommended method is:

traps="$(trap)"
...
eval "$traps"

3. When is the signal handled?

When bash is executing an external command in the foreground, it does not handle any signals received until the foreground process terminates. This is important when you have a script like this:

trap 'echo "doing some cleaning"' EXIT
echo waiting a bit
sleep 10000

If you kill the script (using kill pid from another terminal, not with Ctrl-C -- more on that later), bash will wait for sleep to exit before calling the trap. That's probably not what you expect. A workaround is to use a builtin that will be interrupted, such as wait:

trap 'echo "doing some cleaning"' EXIT
echo waiting a bit
sleep 10000 & wait $!

Note that sleep 10000 will not be killed and will continue to run. If you want the background job to be killed when the script it killed, add that to the trap. This kind of cleanup is precisely what traps are for!

pid=
trap '[[ $pid ]] && kill $pid' EXIT
sleep 10000 & pid=$!
wait
pid=

Any bash internal command will be interrupted by a (non-ignored) incoming signal. If you don't like wait, there is also read. You can create an "infinite sleep" by reading from a pipe that will never give any data; the read will block forever, without needing an external sleep to keep track of:

trap 'echo "we get signal"; rm -f ~/fifo' EXIT
mkfifo ~/fifo || exit
chmod 400 ~/fifo
echo "sleeping"
read < ~/fifo

4. What is Ctrl-C doing?

You might think the first example in the previous section is not correct because when you try the script in your terminal and press Ctrl-C the message is clearly printed immediately. The difference between sending INT using kill -INT pid and Ctrl-C is that Ctrl-C sends INT to the process group, which means that sleep will also receive the signal, and not just the script.

To send INT to a process group you need to use the process group ID preceded by a dash:

 kill -INT -123 # will kill the process group with the ID 123

To find the process group ID of a process you can use: ps -p pid -o pgid (assuming your ps has this syntax). On Linux the process group is the PID of the process leader (typically the script), but this is not always the case (e.g. OpenBSD).

5. Special Note On SIGINT

If you choose to set up a handler for SIGINT (rather than using the EXIT trap), you should be aware that a process that exits in response to SIGINT should kill itself with SIGINT rather than simply exiting, to avoid causing problems for its caller. Thus:

trap 'rm -f "$tempfile"; trap - INT; kill -INT $$' INT

We can see the difference between a properly behaving process and a misbehaving process. On most operating systems, ping is an example of a misbehaving process. It traps SIGINT in order to display a summary at the end, before exiting. But it fails to kill itself with SIGINT, and so the calling shell does not know that it should abort as well. For example,

# Bash.  Linux ping syntax.
for i in {1..254}; do
  ping -c 2 192.168.1.$i
done

Here, if the user presses Ctrl-C during the loop, it will terminate the current ping command, but it will not terminate the loop. This is because Linux's ping command does not kill itself with SIGINT in order to communicate to the caller that the SIGINT was fatal. (Linux is not unique in this respect; I do not know of any operating system whose ping command exhibits the correct behavior.)

A properly behaving process such as sleep does not have this problem:

i=1
while [ $i -le 100 ]; do
  printf "%d " $i
  i=$((i+1))
  sleep 1
done
echo

If we press Ctrl-C during this loop, sleep will receive the SIGINT and die from it (sleep does not catch SIGINT). The shell sees that the sleep died from SIGINT. In the case of an interactive shell, this terminates the loop. In the case of a script, the whole script will exit, unless the script itself traps SIGINT.


CategoryShell

SignalTrap (last edited 2014-08-15 10:57:55 by 2-230-78-223)