File Descriptors

A File Descriptor (FD) is a small non-negative integer that refers to an open file (or to another I/O resource such as a pipe, socket, or terminal). Each process has its own private set of FDs, but child processes inherit copies of their parent's open FDs.

Every process should inherit three open FDs from its parent: 0 ("standard input"), open for reading; and 1 ("standard output") and 2 ("standard error"), open for writing. A process that is started without one or more of these may behave unpredictably: for example, the next file it opens may be assigned the missing FD number, so that output meant for the terminal ends up in that file. (So never close stderr. Always redirect it to /dev/null instead.)
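As a rough illustration of the advice above, the following sketch runs the same child command twice: once with stderr redirected to /dev/null, and once with stderr closed. The child command and the variable names are made up for this example.

```shell
# Hypothetical child command that reports a problem on stderr.
child='echo "warning: something happened" >&2'

# Silence stderr by pointing FD 2 at /dev/null: the child's write succeeds.
devnull_status=0
sh -c "$child" 2>/dev/null || devnull_status=$?

# Close FD 2 instead: the child starts without a stderr, and its write fails.
closed_status=0
sh -c "$child" 2>&- || closed_status=$?

echo "exit status with /dev/null: $devnull_status"
echo "exit status with closed stderr: $closed_status"
```

With /dev/null the child exits successfully; with FD 2 closed, the redirection inside the child fails and it returns a non-zero status.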

Processes may open additional FDs as needed (up to whatever limit the operating system imposes). In most languages, when you open a new file, you are given back the FD number that the operating system selects (or a library manages the FD number for you and hides the details). In shell scripts, however, the paradigm is different: you select the FD number first, and then open the file using that FD. This means you, as the script writer, must keep track of which FDs you are using for each task.

Shells use redirection to work with FDs. For already-existing FDs, output can be sent to them, or input can be read from them, by using file descriptor duplication syntax:

echo "unexpected error: $foo" 1>&2

while read -r line 0<&3; do ...

The Bash Guide and Bash FAQ 55 give an introductory explanation, so we'll remain concise here. In the echo example, we know that echo normally writes to stdout (FD 1), so we override FD 1 to point to where FD 2 is pointing. Thus, echo will be tricked into writing to our stderr (FD 2). In the read example, we know that read normally reads from stdin (FD 0), but we override FD 0 to point to wherever our FD 3 is pointing; and so read will pull its input from there, instead of from our FD 0. These redirections are transient, only applying to the single commands where they appear.
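To see that such a redirection applies only to the one command it is attached to, here is a small sketch. The function name emit and the message texts are placeholders invented for this example.

```shell
# Hypothetical function: one message for stderr, one for stdout.
emit() {
    echo "unexpected error: disk full" 1>&2   # FD 1 points at FD 2 for this command only
    echo "normal output"                      # FD 1 is untouched here
}

# Capture only what arrives on stdout (stderr is discarded).
captured_stdout=$(emit 2>/dev/null)

# Capture only what arrives on stderr (stdout is discarded).
captured_stderr=$(emit 2>&1 1>/dev/null)

echo "stdout received: $captured_stdout"
echo "stderr received: $captured_stderr"
```

The first echo lands on stderr and the second still lands on stdout, which shows that the 1>&2 did not outlive the command it was written on.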

In order to create new FDs, we must open files for them to point to. Typically we want the FD to be available within our shell, so that it can be reused or passed to children as needed. To open a file in a shell script, we use the exec command:

exec 3> myfifo

As we said earlier, you must know at the time you're writing the script which FD number you want to use for each task. In the example above, we open FD 3 for output to a file (or, as the name suggests, a FIFO) named myfifo. Presumably we will use this open FD later, to write information to it.

Input redirection works the same way:

exec 4< /etc/passwd
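A minimal sketch of using an FD opened this way, assuming a scratch file created with mktemp rather than /etc/passwd so the result is predictable. Because the FD stays open between commands, each read continues where the previous one stopped.

```shell
# Scratch input file (an assumption for this example).
tmpfile=$(mktemp)
printf 'alpha\nbeta\n' > "$tmpfile"

# Open FD 4 for reading; it stays open for the rest of the script.
exec 4< "$tmpfile"

# Two separate commands share the same open FD and file position.
read -r first <&4
read -r second <&4

# Close FD 4 and clean up.
exec 4<&-
rm -f "$tmpfile"

echo "first line: $first"
echo "second line: $second"
```

Had we written `read -r first < "$tmpfile"` twice instead, each read would have reopened the file and returned the first line both times.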

An FD can also be opened for both reading and writing:

# Bash
exec 3<> /dev/tcp/www.google.com/80

This is necessary for most socket I/O applications (send a message to a service, and receive a response from it, over a single socket).

Once an FD has been opened, it can be used for reading and writing using the redirection techniques described earlier on this page. Here is a complete HTTP request in bash:

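#!/usr/bin/env bash
# Open FD 3 for reading and writing on a TCP connection (Bash-only /dev/tcp)
exec 3<> /dev/tcp/www.google.com/80 || exit 1
# HTTP header lines must end with CRLF, and a blank line ends the request
printf 'HEAD / HTTP/1.1\r\nHost: www.google.com\r\nConnection: close\r\n\r\n' >&3
# Read the response until the server closes the connection
cat <&3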

Here, we open FD 3 to point to a TCP socket; then we write an HTTP request to the socket; then we read the response from the socket.

Working with named pipes (FIFOs) generally involves similar techniques -- creating the FIFO first, then setting up a reader and a writer. If the writer is a script that needs to write to the FIFO more than once without triggering an EOF condition for the reader, then the script should open an FD once and keep it open across all of the writes:

exec 3> myfifo
echo "something" >&3
...
echo "something else" >&3
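For a self-contained version of this pattern, the sketch below creates the FIFO in a temporary directory and uses a background cat as the reader. The directory layout and variable names are assumptions for the example.

```shell
# Scratch directory holding the FIFO and the reader's output.
workdir=$(mktemp -d)
fifo="$workdir/myfifo"
mkfifo "$fifo"

# Reader: copies everything from the FIFO into a file until it sees EOF.
cat "$fifo" > "$workdir/received" &
reader_pid=$!

# Writer: open FD 3 once (this blocks until the reader has opened the FIFO).
exec 3> "$fifo"
echo "something" >&3
echo "something else" >&3

# Closing FD 3 is what finally delivers EOF to the reader.
exec 3>&-
wait "$reader_pid"

received=$(cat "$workdir/received")
rm -rf "$workdir"
printf '%s\n' "$received"
```

If each echo had instead used `> "$fifo"` directly, the FIFO would have been opened and closed for every write, and the reader could have seen EOF after the first message.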

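When we are finished with an FD, we can close it. We need to know the number. By convention we write >&- for an FD that was opened for writing and <&- for one that was opened for reading, although in practice either form simply closes the FD.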

exec 3>&-   # Close FD 3 which was open for writing
exec 4<&-   # Close FD 4 which was open for reading

All our FDs will be closed when we exit, but it is a good practice to close them ourselves anyway. (If we wish to close the FD to free the resources before exiting, then we must also do it explicitly.)
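The effect of closing an FD can be checked with a short sketch like the following, which assumes a scratch file from mktemp. The second write is wrapped in a subshell so that the failed redirection cannot affect the main script.

```shell
# Scratch output file (an assumption for this example).
tmpfile=$(mktemp)

# Open FD 3 for writing, use it, then close it.
exec 3> "$tmpfile"
echo "kept" >&3
exec 3>&-

# FD 3 no longer exists, so this redirection fails.
if ( echo "lost" >&3 ) 2>/dev/null; then
    write_after_close=succeeded
else
    write_after_close=failed
fi

contents=$(cat "$tmpfile")
rm -f "$tmpfile"

echo "write after close: $write_after_close"
echo "file contents: $contents"
```

Only the write made while FD 3 was open reaches the file; the later one fails with a "Bad file descriptor" error, which is discarded here.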

