Diff for "BashGuide/CompoundCommands"

Differences between revisions 20 and 21

Compound Commands

BASH offers numerous ways to combine simple commands to achieve our goals. We've already seen some of them in practice, but now let's look at a few more.

BASH has constructs called compound commands, a catch-all term covering several different concepts. We've already seen some of the compound commands BASH has to offer -- if statements, for loops, while loops, the [[ keyword, case and select. We won't repeat that information here. Instead, we'll explore the compound commands we haven't seen yet: subshells, command grouping, and arithmetic evaluation.

In addition, we'll look at functions and aliases, which aren't compound commands, but which work in a similar way.

In the manual: Compound Commands

Subshells

A SubShell is similar to a child process, except that more information is inherited. Subshells are created implicitly for each command in a pipeline. They are also created explicitly by using parentheses around a command:

    $ (cd /tmp || exit 1; date > timestamp)
    $ pwd
    /home/lhunath

When the subshell terminates, the cd command's effect is gone -- we're back where we started. Likewise, any variables that are set during the subshell are not remembered. You can think of subshells as temporary shells. See SubShell for more details.

Note that if the cd failed in that example, the exit 1 would have terminated the subshell, but not our interactive shell. As you can guess, this is quite useful in real scripts.
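
As a minimal sketch of both points (the variable name color is made up for illustration), the following script shows that neither an assignment nor an exit inside parentheses reaches the parent shell:

```shell
#!/usr/bin/env bash
# "color" is a hypothetical variable used only for this illustration.
color=red
status=0

# The parentheses start a subshell; the assignment and the exit
# happen in that temporary shell only.
(color=blue; exit 3) || status=$?

echo "color is still: $color"          # prints: color is still: red
echo "subshell exit status: $status"   # prints: subshell exit status: 3
```

The parent shell keeps running after the subshell exits, and its copy of color is untouched.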

In the manual: Command Grouping

Command grouping

We've already touched on this subject in Grouping Statements, though it pays to repeat it in the context of this chapter.

Commands may be grouped together using curly braces. Command groups allow a collection of commands to be considered as a whole with regard to redirection and control flow. All compound commands, such as if statements and while loops, do this as well, but command groups do only this. In that sense, command groups can be thought of as "null compound commands": they have no effect other than to group commands. They look a bit like subshells, except that command groups are executed in the same shell as everything else rather than in a new one. This is faster, and it allows things like variable assignments to remain visible outside the command group.

All commands within a command group are within the scope of any redirections applied to a command group (or any compound command):

    $ { echo "Starting at $(date)"; rsync -av . /backup; echo "Finishing at $(date)"; } >backup.log 2>&1

The above example truncates and opens the file backup.log on stdout, then points stderr at where stdout is currently pointing (backup.log), then runs each command with those redirections applied. The file descriptors remain open until all commands within the command group complete before they are automatically closed. This means backup.log is only opened a single time, not opened and closed for each command. The next example demonstrates this better:

    $ echo "cat
    > mouse
    > dog" > inputfile
    $ for var in {a..c}; do read -r "$var"; done < inputfile
    $ echo "$b"
    mouse

Notice how we didn't actually use a command group here. As previously explained, for is a compound command, so a redirection applied to it behaves just as it would on a command group. Reading the second line of the file would be extremely difficult if multiple read commands could not share a single open FD, because each read would start again from the beginning of the file. Contrast with this:

    $ read -r a < inputfile
    $ read -r b < inputfile
    $ read -r c < inputfile
    $ echo "$b"
    cat

That's not what we wanted at all!
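
The same sharing of one open file descriptor works with an explicit command group. A small sketch, assuming a throwaway file created with mktemp (the variable names are illustrative):

```shell
#!/usr/bin/env bash
# Create a scratch file holding three lines.
inputfile=$(mktemp)
printf '%s\n' cat mouse dog > "$inputfile"

# The file is opened once for the whole group, so each read
# continues where the previous one stopped.
{ read -r first; read -r second; read -r third; } < "$inputfile"

echo "$second"   # prints: mouse
rm -f "$inputfile"
```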

Command groups are also useful to shorten certain common tasks:

    $ [[ -f $CONFIGFILE ]] || { echo "Config file $CONFIGFILE not found" >&2; exit 1; }

The logical "or" now executes the command group if $CONFIGFILE doesn't exist rather than just the first simple command. A subshell would not have worked here, because the exit 1 in a command group terminates the entire shell -- which is what we want here.
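
To see the difference without ending your own session, one option is to run each variant in a child bash process. This is only a sketch; the variable names and the use of false as a stand-in for a failing test are assumptions made for the example:

```shell
#!/usr/bin/env bash
# Each bash -c call runs a tiny script in its own process, so an
# exit inside it cannot terminate the shell running this example.

# Command group: exit 1 ends the whole child script.
group_output=$(bash -c 'false || { exit 1; }; echo "still running"') || true

# Subshell: exit 1 ends only the subshell; the child script continues.
subshell_output=$(bash -c 'false || ( exit 1 ); echo "still running"') || true

echo "group:    '$group_output'"      # prints: group:    ''
echo "subshell: '$subshell_output'"   # prints: subshell: 'still running'
```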

Compare that with a differently formatted version:

    $ if [[ ! -f $CONFIGFILE ]]; then
    > echo "Config file $CONFIGFILE not found" >&2
    > exit 1
    > fi

If the command group is on a single line, as we've shown here, then there must be a semicolon before the closing }, as in { ...; last command; }. Otherwise, BASH would think } is an argument to the final command in the group. If the command group is spread across multiple lines, then the semicolon may be replaced by a newline:

    $ {
    >  echo "Starting at $(date)"
    >  rsync -av . /backup
    >  echo "Finishing at $(date)"
    > } > backup.log 2>&1
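
One way to observe that a group's output file is opened only once is to compare it with per-command redirections, each of which truncates the file again. The scratch file below is an assumption for the demo:

```shell
#!/usr/bin/env bash
log=$(mktemp)

# Separate redirections: every > reopens and truncates the file,
# so only the last line survives.
echo "first"  > "$log"
echo "second" > "$log"
separate=$(cat "$log")

# One redirection on the group: the file is truncated once and
# stays open until the group finishes, so both lines are kept.
{
    echo "first"
    echo "second"
} > "$log"
grouped=$(cat "$log")

rm -f "$log"
echo "separate: $separate"   # prints: separate: second
echo "$grouped"              # prints two lines: first, then second
```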

If redirections are used on a simple command, they apply only to the command itself, not to parameter or other expansions. A redirection on a command group applies to everything inside the group, expansions included, even when the group contains just a single simple command:

    $ { echo "$(cat)"; } <<<'hi'
    hi
    $ { "$(</dev/stdin)" <<<"$_"; } <<<'cat'
    hi

The second command (which you don't need to fully understand) would require killing the shell if the command group weren't present, since the shell would be reading from the tty to determine the command to execute. It also illustrates that the command still gets the redirect applied to it, while the expansion gets that of the command group.

In the manual: Command Grouping

Arithmetic Evaluation

BASH has several different ways to say we want to do arithmetic instead of string operations. Let's look at them one by one.

The first way is the let command:

    $ unset a; a=4+5
    $ echo $a
    4+5
    $ let a=4+5
    $ echo $a
    9

You may use spaces, parentheses and so forth, if you quote the expression:

   $ let a='(5+2)*3'

For a full list of the operators available, see help let or the manual.

Next, the actual arithmetic evaluation compound command syntax:

    $ ((a=(5+2)*3))

This is equivalent to let, but we can also use it as a command, for example in an if statement:

    $ if (($a == 21)); then echo 'Blackjack!'; fi

Operators such as ==, <, > and so on cause a comparison to be performed, inside an arithmetic evaluation. If the comparison is "true" (for example, 10 > 2 is true in arithmetic -- but not in strings!) then the compound command exits with status 0. If the comparison is false, it exits with status 1. This makes it suitable for testing things in a script.
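
The difference between arithmetic and string comparison can be checked directly. In this sketch, the result variables are illustrative names:

```shell
#!/usr/bin/env bash
# Arithmetic context: 10 and 2 are compared as numbers.
if ((10 > 2)); then arithmetic=true; else arithmetic=false; fi

# String context: [[ ... > ... ]] compares text, and "10" sorts
# before "2" because the character 1 sorts before 2.
if [[ 10 > 2 ]]; then string=true; else string=false; fi

echo "arithmetic: $arithmetic"   # prints: arithmetic: true
echo "string:     $string"       # prints: string:     false
```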

Although not a compound command, an arithmetic expansion (also called arithmetic substitution) syntax is also available:

   $ echo "There are $(($rows * $columns)) cells"

Inside $((...)) is an arithmetic context, just like with ((...)), meaning we do arithmetic (multiplying things) instead of string manipulations (concatenating $rows, space, asterisk, space, $columns). $((...)) is also portable to the POSIX shell, while ((...)) is not.
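
Because $((...)) is replaced by its result, it can be used anywhere a word is allowed, such as in an assignment. A brief sketch with made-up values:

```shell
#!/usr/bin/env bash
rows=3
columns=4

# Arithmetic context: the expression is replaced by its value.
cells=$(($rows * $columns))
echo "There are $cells cells"   # prints: There are 12 cells

# Without the arithmetic context, the same characters are just a string.
text="$rows * $columns"
echo "$text"                    # prints: 3 * 4
```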

Readers who are familiar with the C programming language might wish to know that ((...)) has many C-like features. Among them are the ternary operator:

   $ ((abs = (a >= 0) ? a : -a))

and the use of an integer value as a truth value:

   $ if ((flag)); then echo "uh oh, our flag is up"; fi

Note that we used variables inside ((...)) without prefixing them with $-signs. This is a special syntactic shortcut that BASH allows inside arithmetic evaluations and arithmetic expansions.

There is one final thing we must mention about ((flag)). Because the inside of ((...)) is C-like, a variable (or expression) that evaluates to zero will be considered false for the purposes of the arithmetic evaluation. Then, because the evaluation is false, it will exit with a status of 1. Likewise, if the expression inside ((...)) is non-zero, it will be considered true; and since the evaluation is true, it will exit with status 0. This is potentially very confusing, even to experts, so you should take some time to think about this. Nevertheless, when things are used the way they're intended, it makes sense in the end:

    $ flag=0      # no error
    $ while read -r line; do
    >   if [[ $line = *err* ]]; then flag=1; fi
    > done < inputfile
    $ if ((flag)); then echo "oh no"; fi
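
The inversion between C-style truth and exit status can be observed by testing a few values. In this sketch, the function name describe is hypothetical:

```shell
#!/usr/bin/env bash
# describe VALUE: report how ((...)) treats VALUE.
describe() {
    if (($1)); then
        echo "$1 is true (exit status 0)"
    else
        echo "$1 is false (exit status 1)"
    fi
}

describe 0    # prints: 0 is false (exit status 1)
describe 1    # prints: 1 is true (exit status 0)
describe -5   # prints: -5 is true (exit status 0)
```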

In the manual: Arithmetic Expansion, Shell Arithmetic

Functions

Functions are very nifty inside Bash scripts. They are blocks of commands, much like normal scripts you might write, except they don't reside in separate files, and they don't cause a separate process to be executed. However, they take arguments just like scripts -- and unlike scripts, they can affect variables inside your script, if you want them to. Take this for example:

    $ sum() {
    >   echo "$1 + $2 = $(($1 + $2))"
    > }

This will do absolutely nothing when run. This is because it has only been stored in memory, much like a variable, but it has not yet been called. To run the function, you would do this:

    $ sum 1 4
    1 + 4 = 5

Amazing! We now have a basic calculator, and potentially a more economical replacement for a five-year-old.

A note on scope: if you choose to embed functions within script files, as many will find more convenient, then you need to understand that the parameters you pass to the script are not necessarily the parameters that are passed to the function. To wrap this function inside a script, we would write a file containing this:

    #!/bin/bash
    sum() {
        echo "$1 + $2 = $(($1 + $2))"
    }
    sum "$1" "$2"

As you can see, we passed the script's two parameters to the function within, but we could have passed anything we wanted (though, doing so in this situation would only confuse users trying to use the script).

Functions serve a few purposes in a script. The first is to isolate a block of code that performs a specific task, so that it doesn't clutter up other code. This helps you make things more readable, when done in moderation. (Having to jump all over a script to track down 7 functions to figure out what a single command does has the opposite effect, so make sure you do things that make sense.) The second is to allow a block of code to be reused with slightly different arguments.

Here's a slightly less silly example:

    #!/bin/bash
    open() {
        case "$1" in
            *.mp3|*.ogg|*.wav|*.flac|*.wma) xmms "$1";;
            *.jpg|*.gif|*.png|*.bmp)        display "$1";;
            *.avi|*.mpg|*.mp4|*.wmv)        mplayer "$1";;
        esac
    }
    for file; do
        open "$file"
    done

Here, we define a function named open. This function is a block of code that takes a single argument, and based on the pattern of that argument, it will either run xmms, display or mplayer with that argument. Then, a for loop iterates over all of the script's positional parameters. (Remember, for file is equivalent to for file in "$@" and both of them iterate over the full set of positional parameters.) The for loop calls the open function for each parameter.

As you may have observed, the function's parameters are different from the script's parameters.
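
A short sketch makes the separation visible. Here set -- stands in for arguments given on the script's command line, and the function name is illustrative:

```shell
#!/usr/bin/env bash
# Simulate a script invoked with two arguments.
set -- "script-arg-1" "script-arg-2"

show_first() {
    # Inside the function, $1 is the function's own first argument.
    echo "function sees: $1"
}

inside=$(show_first "function-arg")
echo "$inside"           # prints: function sees: function-arg
echo "script sees: $1"   # prints: script sees: script-arg-1
```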

Functions may also have local variables, declared with the local or declare builtins. This lets you do work without potentially overwriting important variables from the caller's namespace. For example:

    count() {
        local i
        for ((i=1; i<=$1; i++)); do echo $i; done
        echo 'Ah, ah, ah!'
    }
    for ((i=1; i<=3; i++)); do count $i; done

The local variable i inside the function is stored differently from the variable i in the outer script. This allows the two loops to operate without interfering with each other's counters.
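
To see what local protects against, compare a function that declares its variable local with one that does not. This is a sketch; all names in it are made up for the example:

```shell
#!/usr/bin/env bash
counter=10

with_local() {
    local counter=99      # shadows the caller's variable
}

without_local() {
    counter=99            # overwrites the caller's variable
}

with_local
after_local=$counter
without_local
after_global=$counter

echo "after with_local:    $after_local"    # prints: after with_local:    10
echo "after without_local: $after_global"   # prints: after without_local: 99
```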

Functions may also call themselves recursively, but we won't show that today. Maybe later!

In the manual: Shell Functions

Aliases

Aliases are superficially similar to functions at first glance, but upon closer examination, they have entirely different behavior.

Aliases are not expanded in scripts by default. They are expanded only in interactive shells, unless the expand_aliases shell option is set with shopt.
Aliases cannot take arguments.
Aliases will not invoke themselves recursively.
Aliases cannot have local variables.

Aliases are essentially keyboard shortcuts intended to be used in .bashrc files to make your life easier. They usually look like this:

    $ alias ls='ls --color=auto'

BASH checks the first word of every simple command to see whether it's an alias, and if so, it does a simple text replacement. Thus, if you type

    $ ls /tmp

BASH acts as though you had typed

    $ ls --color=auto /tmp

If you wanted to duplicate this functionality with a function, it would look like this:

    $ unalias ls
    $ ls() { command ls --color=auto "$@"; }

As with a command group, we need a ; before the closing } of a function if we write it all in one line. The special built-in command command tells our function not to call itself recursively; instead, we want it to call the ls command that it would have called if there hadn't been a function by that name.
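
The role of command is easier to see with a wrapper that changes visible behavior. This sketch assumes a scratch directory from mktemp -d and uses the standard -A option of ls, which also lists dotfiles:

```shell
#!/usr/bin/env bash
# Wrapper function: "command" bypasses the function itself and runs
# the real ls. Without it, ls would call itself forever.
ls() { command ls -A "$@"; }

scratch=$(mktemp -d)
touch "$scratch/.hidden" "$scratch/visible"

listing=$(ls "$scratch")   # goes through the wrapper
kind=$(type -t ls)

rm -rf "$scratch"
echo "$kind"      # prints: function
echo "$listing"   # lists both .hidden and visible
```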

Aliases are useful as long as you don't try to make them work like functions. If you need complex behavior, use a function instead.

Destroying Constructs

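To remove a function or variable from your current shell environment, use the unset command. If a variable and a function share the same name, a plain unset removes the variable; the function is removed only when no variable by that name exists.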

    $ unset myfunction

To remove an alias, use the unalias command.

    $ unalias rm
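
When a variable and a function share a name, the -v and -f options of unset select which one to remove. A sketch with a hypothetical name greet:

```shell
#!/usr/bin/env bash
greet() { echo "hello"; }   # a function named greet
greet="a variable"          # a variable with the same name

unset -v greet              # remove only the variable
after_v=$(type -t greet) || true
echo "after unset -v: ${after_v:-nothing}"   # prints: after unset -v: function

unset -f greet              # remove the function
after_f=$(type -t greet) || true
echo "after unset -f: ${after_f:-nothing}"   # prints: after unset -f: nothing
```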

In the manual: Bourne Shell Builtins

In the FAQ: ...


- Revision 20 as of 2012-03-04 23:03:16
  Size: 14999
  Editor: ormaaj
  Comment: Improved the command groups section.
+ Revision 21 as of 2012-07-14 13:59:40
  Size: 15031
  Editor: bzq-79-177-223-84
  Comment:
 Line 82:
-If the command group is on a single line, as we've shown here, then there ''must'' be a semicolon before the closing `}`; otherwise, [[BASH]] would think `}` is an argument to the final command in the group.  If the command group is spread across multiple lines, then the semicolon may be replaced by a newline:
+If the command group is on a single line, as we've shown here, then there ''must'' be a semicolon before the closing `}` ( { ...; last command''';''' } ) otherwise, [[BASH]] would think `}` is an argument to the final command in the group.  If the command group is spread across multiple lines, then the semicolon may be replaced by a newline: