Diff for "BashGuide/CompoundCommands"

Differences between revisions 2 and 3

Structural Constructs

BASH offers numerous ways to combine simple commands to achieve our goals. We've already seen some of them in practice, but now let's look at a few more.

BASH has constructs called compound commands, which is a catch-all phrase covering several different concepts. We've already seen some of the compound commands BASH has to offer -- if statements, for loops, while loops, the [[ keyword, case and select. We won't repeat that information again here. Instead, we'll explore the other compound commands we haven't seen yet: subshells, command grouping, and arithmetic evaluation.

In addition, we'll look at functions and aliases, which aren't compound commands, but which work in a similar way.

Subshells

A SubShell is similar to a child process, except that more information is inherited. Subshells are created implicitly for each command in a pipeline. They are also created explicitly by using parentheses around a command:

    $ (cd /tmp || exit 1; date > timestamp)
    $ pwd
    /home/lhunath

When the subshell terminates, the cd command's effect is gone -- we're back where we started. Likewise, any variables that are set during the subshell are not remembered. You can think of subshells as temporary shells. See SubShell for more details.

Note that if the cd failed in that example, the exit 1 would have terminated the subshell, but not our interactive shell. As you can guess, this is quite useful in real scripts.
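As a minimal sketch of that isolation (the variable names color and start_dir are made up for illustration), the following script changes a variable and the working directory inside a subshell and then checks that neither change survived:

```shell
#!/usr/bin/env bash
color=red
start_dir=$PWD

# Everything between the parentheses runs in a subshell.
(
    color=blue
    cd / || exit 1
    echo "inside:  color=$color dir=$PWD"
)

# Back in the original shell: the assignment and the cd did not leak out.
echo "outside: color=$color dir=$PWD"
```

The second echo should report color=red and the directory the script started in.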

Command grouping

Commands may be grouped together using curly braces. This looks somewhat like a subshell, but it isn't. Command groups are executed in the same shell as everything else, rather than a new one.

Command groups can be used to run multiple commands and have a single redirection affect all of them:

    $ { echo "Starting at $(date)"; rsync -av . /backup; echo "Finishing at $(date)"; } > backup.log 2>&1

A subshell would have been overkill in this case, because we did not require a temporary environment. However, a subshell would also have worked.

Command groups are also useful to shorten certain common tasks:

    $ [[ -f $CONFIGFILE ]] || { echo "Config file $CONFIGFILE not found" >&2; exit 1; }

Compare that with the more formal version:

    $ if [[ ! -f $CONFIGFILE ]]; then
    > echo "Config file $CONFIGFILE not found" >&2
    > exit 1
    > fi

A subshell would not have worked here, because the exit 1 in a command group terminates the entire shell -- which is what we want here.
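The difference can be observed with a small experiment. This is only a sketch: it assumes bash is on the PATH and runs each variant in a throwaway bash -c shell, so that the exit cannot end the session you are typing in. The variable names are illustrative.

```shell
#!/usr/bin/env bash
# Each variant runs in its own throwaway bash, so "exit" cannot end this script.

# Subshell: exit 1 ends only the subshell, and the following echo still runs.
sub_out=$(bash -c 'false || ( echo "giving up" >&2; exit 1 ); echo "still running"' 2>/dev/null)

# Command group: exit 1 ends the shell itself, so the following echo never runs.
grp_status=0
grp_out=$(bash -c 'false || { echo "giving up" >&2; exit 1; }; echo "still running"' 2>/dev/null) || grp_status=$?

echo "subshell version printed: '$sub_out'"
echo "group version printed:    '$grp_out' (exit status $grp_status)"
```

Only the subshell version gets as far as printing "still running"; the command group version stops with exit status 1.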

Command groups can also be used for setting variables in unusual cases:

    $ echo "cat
    > mouse
    > dog" > inputfile
    $ { read a; read b; read c; } < inputfile
    $ echo "$b"
    mouse

It would have been extremely difficult to read the second line from a file without a command group, which allowed multiple read commands to read from a single open FD without rewinding to the start each time. Contrast with this:

    $ read a < inputfile
    $ read b < inputfile
    $ read c < inputfile
    $ echo "$b"
    cat

That's not what we wanted at all!

If the command group is on a single line, as we've shown here, then there must be a semicolon before the closing }; otherwise, BASH would think } is an argument to the final command in the group. If the command group is spread across multiple lines, then the semicolon may be replaced by a newline:

    $ {
    >  echo "Starting at $(date)"
    >  rsync -av . /backup
    >  echo "Finishing at $(date)"
    > } > backup.log 2>&1

Arithmetic Evaluation

BASH has several different ways to say we want to do arithmetic instead of string operations. Let's look at them one by one.

The first way is the let command:

    $ unset a; a=4+5
    $ echo $a
    4+5
    $ let a=4+5
    $ echo $a
    9

You may use spaces, parentheses and so forth, if you quote the expression:

   $ let a='(5+2)*3'

For a full list of available operators, see help let or the manual.

Next, the actual arithmetic evaluation compound command syntax:

    $ ((a=(5+2)*3))

This is equivalent to let, but we can also use it as a command, for example in an if statement:

    $ if (($a == 21)); then echo 'Blackjack!'; fi

Operators such as ==, <, > and so on cause a comparison to be performed, inside an arithmetic evaluation. If the comparison is "true" (for example, 10 > 2 is true in arithmetic -- but not in strings!) then the compound command exits with status 0. If the comparison is false, it exits with status 1. This makes it suitable for testing things in a script.
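To make the "not in strings" remark concrete, here is a small sketch comparing the same two values in an arithmetic context and in a string context. The variable names arith and string are only for illustration.

```shell
#!/usr/bin/env bash
# Arithmetic context: 10 is numerically greater than 2.
if (( 10 > 2 )); then arith=true; else arith=false; fi

# String context: inside [[ ]], > compares strings character by character,
# and "10" sorts before "2" because "1" sorts before "2".
if [[ 10 > 2 ]]; then string=true; else string=false; fi

echo "arithmetic comparison: $arith"
echo "string comparison:     $string"
```

The arithmetic test succeeds while the string test fails, even though the operands look identical.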

Although not a compound command, an arithmetic substitution (or arithmetic expression) syntax is also available:

   $ echo "There are $(($rows * $columns)) cells"

Inside $((...)) is an arithmetic context, just like with ((...)), meaning we do arithmetic (multiplying things) instead of string manipulations (concatenating $rows, space, asterisk, space, $columns). $((...)) is also portable to the POSIX shell, while ((...)) is not.

Readers who are familiar with the C programming language might wish to know that ((...)) has many C-like features. Among them are the ternary operator:

   $ ((abs = (a >= 0) ? a : -a))

and the use of an integer value as a truth value:

   $ if ((flag)); then echo "uh oh, our flag is up"; fi

Note that we used variables inside ((...)) without prefixing them with $-signs. This is a syntactic shortcut that BASH allows inside arithmetic evaluations and arithmetic expressions. POSIX specifies the same shortcut for $((...)), although some older or minimal shells do not honor it, so $(($a + 1)) remains the most portable spelling.
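A short sketch of both arithmetic forms using bare variable names; the values chosen here are arbitrary:

```shell
#!/usr/bin/env bash
rows=3 columns=4 a=-7

# No $ is needed on variable names inside either arithmetic form.
cells=$(( rows * columns ))
(( abs = (a >= 0) ? a : -a ))

echo "There are $cells cells"
echo "|$a| = $abs"
```

With these values, cells ends up as 12 and abs as 7.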

There is one final thing we must mention about ((flag)). Because the inside of ((...)) is C-like, a variable (or expression) that evaluates to zero will be considered false for the purposes of the arithmetic evaluation. Then, because the evaluation is false, it will exit with a status of 1. Likewise, if the expression inside ((...)) is non-zero, it will be considered true; and since the evaluation is true, it will exit with status 0. This is potentially very confusing, even to experts, so you should take some time to think about this. Nevertheless, when things are used the way they're intended, it makes sense in the end:

    $ flag=0      # no error
    $ while read line; do
    >   if [[ $line = *err* ]]; then flag=1; fi
    > done < inputfile
    $ if ((flag)); then echo "oh no"; fi
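The inversion between the arithmetic value and the exit status can be checked directly. In this sketch the *_status variable names are made up for illustration, and the && ... || ... chains exist only to record the exit status of each ((...)) command:

```shell
#!/usr/bin/env bash
(( 0 ))  && zero_status=0    || zero_status=$?
(( 42 )) && nonzero_status=0 || nonzero_status=$?

echo "(( 0 ))  exits with status $zero_status"
echo "(( 42 )) exits with status $nonzero_status"

# A practical consequence: post-increment yields the OLD value, so with i=0
# the expression evaluates to 0 and the command reports "false".
i=0
(( i++ )) && incr_status=0 || incr_status=$?
echo "(( i++ )) set i to $i but exited with status $incr_status"
```

The last case is worth remembering in scripts that treat any non-zero exit status as a failure: the increment still happens, but the command itself reports status 1.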

Functions

Functions are very nifty inside bash scripts. They are blocks of commands, much like normal scripts you might write, except they don't reside in separate files. However, they take arguments just like scripts -- and unlike scripts, they can affect variables inside your script, if you want them to. Take this for example:

    $ sum() {
    >   echo "$1 + $2 = $(($1 + $2))"
    > }

This will do absolutely nothing when run. This is because it has only been stored in memory, much like a variable, but it has not yet been called. To run the function, you would do this:

    $ sum 1 4
    1 + 4 = 5

Amazing! We now have a basic calculator, and potentially a more economical replacement for a five-year-old.

A note on scope: if you choose to embed functions within script files, as many will find more convenient, then you need to understand that the parameters you pass to the script are not necessarily the parameters that are passed to the function. To wrap this function inside a script, we would write a file containing this:

    sum() {
        echo "$1 + $2 = $(($1 + $2))"
    }
    # Quote the script's parameters when handing them to the function.
    sum "$1" "$2"

As you can see, we passed the script's two parameters to the function within, but we could have passed anything we wanted (though, doing so in this situation would only confuse users trying to use the script).
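The separation between the two sets of parameters can be seen in a small sketch. Here set -- stands in for running a script with two arguments, and the function name show is made up for illustration:

```shell
#!/usr/bin/env bash
# Simulate a script that was invoked with two arguments.
set -- apple banana

show() {
    # Inside the function, $1 and $# describe the function's own arguments.
    echo "function sees $# argument(s), first is '$1'"
}

show cherry
echo "script still sees $# argument(s), first is '$1'"
```

The function reports one argument (cherry), while the surrounding script still has its original two.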

Functions serve a few purposes in a script. The first is to isolate a block of code that performs a specific task, so that it doesn't clutter up other code. This helps you make things more readable, when done in moderation. (Having to jump all over a script to track down 7 functions to figure out what a single command does has the opposite effect, so make sure you do things that make sense.) The second is to allow a block of code to be reused with slightly different arguments.

Here's a slightly less silly example:

    #!/bin/bash
    # Pick a viewer or player based on the file name's extension.
    open() {
        case "$1" in
            *.mp3|*.ogg|*.wav|*.flac|*.wma) xmms "$1";;
            *.jpg|*.gif|*.png)              display "$1";;
            *.avi|*.mpg|*.mp4|*.wmv)        mplayer "$1";;
        esac
    }
    # Call open once for each of the script's positional parameters.
    for file; do
        open "$file"
    done

Here, we define a function named open. This function is a block of code that takes a single argument, and based on the pattern of that argument, it will either run xmms, display or mplayer with that argument. Then, a for loop iterates over all of the script's positional parameters. (Remember, for file is equivalent to for file in "$@" and both of them iterate over the full set of positional parameters.) The for loop calls the open function for each parameter.

As you may have observed, the function's parameters are different from the script's parameters.

Functions may also have local variables, declared with the local or declare builtins. This lets you do work without potentially overwriting important variables from the caller's namespace. For example,

    count() {
        local i
        for ((i=1; i<=$1; i++)); do echo $i; done
        echo 'Ah, ah, ah!'
    }
    for ((i=1; i<=3; i++)); do count $i; done

The local variable i is stored separately from the variable i in the outer script. This allows the two loops to operate without interfering with each other's counters.
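To see what local protects against, compare two hypothetical functions that both assign to i, one without local and one with it:

```shell
#!/usr/bin/env bash
clobber() { i=99; }            # assigns to the caller's i
careful() { local i; i=99; }   # assigns to its own private i

i=1; clobber; after_clobber=$i
i=1; careful; after_careful=$i

echo "after clobber: i=$after_clobber"
echo "after careful: i=$after_careful"
```

Only the function without local changes the caller's variable.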

Functions may also call themselves recursively, but we won't show that today. Maybe later!

Aliases

Aliases are superficially similar to functions, but on closer examination they behave entirely differently.

Aliases are not expanded in scripts by default. BASH expands them only in interactive shells, unless the expand_aliases shell option is set with shopt.
Aliases cannot take arguments.
Aliases will not invoke themselves recursively.
Aliases cannot have local variables.

Aliases are essentially keyboard shortcuts intended to be used in .bashrc files to make your life easier. They usually look like this:

    $ alias ls='ls --color=auto'

BASH checks the first word of every simple command to see whether it's an alias, and if so, it does a simple text replacement. Thus, if you type

    $ ls /tmp

BASH acts as though you had typed

   $ ls --color=auto /tmp

If you wanted to duplicate this functionality with a function, it would look like this:

   $ unalias ls
   $ ls() { command ls --color=auto "$@"; }

As with a command group, we need a ; before the closing } of a function if we write it all in one line. The built-in command named command makes the shell skip function lookup, so our function does not call itself recursively; instead, it calls the ls command that would have run if there hadn't been a function by that name.
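The same technique works for any wrapper that shares its name with the command it wraps. As a sketch, here is a hypothetical function named pwd that decorates the output of the real pwd; the variable wrapped is only there to hold the result:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: a function named pwd that decorates the builtin's output.
# Without "command", the inner pwd would call this function again, forever.
pwd() { echo "You are in: $(command pwd)"; }

wrapped=$(pwd)
echo "$wrapped"

# Remove the function so that plain pwd behaves normally again.
unset -f pwd
```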

Aliases are useful as long as you don't try to make them work like functions. If you need complex behavior, use a function instead.

In the manual: ...

In the FAQ: ...

-  ⇤ ← Revision 2 as of 2008-11-22 14:09:20 → 
  Size: 1708
  Editor: localhost
  Comment: converted to 1.6 markup
+   ← Revision 3 as of 2009-02-16 17:00:21 → ⇥
  Size: 11884
  Editor: GreyCat
  Comment: fill in some stubs
-Deletions are marked like this.
+Additions are marked like this.
 Line 1:
-Line 3:
+Line 2:
-== Structural Constructs (stub) ==

''Feel free to complete this section.''



<<Anchor(Compound_Commands)>>
=== Compound Commands (stub) ===

''Feel free to complete this section.''
+== Structural Constructs ==

[[BASH]] offers numerous ways to combine simple commands to achieve our goals.  We've already seen some of them in practice, but now let's look at a few more.

[[BASH]] has constructs called ''compound commands'', which is a catch-all phrase covering several different concepts.  We've already seen some of the compound commands [[BASH]] has to offer -- `if` statements, `for` loops, `while` loops, the `[[` keyword, `case` and `select`.  We won't repeat that information again here.  Instead, we'll explore the other compound commands we haven't seen yet: subshells, command grouping, and arithmetic evaluation.

In addition, we'll look at ''functions'' and ''aliases'', which aren't compound commands, but which work in a similar way.
-Line 17:
+Line 11:
-=== Subshells (stub) ===

''Feel free to complete this section.''
+=== Subshells ===

A SubShell is similar to a child process, except that more information is inherited.  Subshells are created implicitly for each command in a pipeline.  They are also created explicitly by using parentheses around a command:
{{{
    $ (cd /tmp || exit 1; date > timestamp)
    $ pwd
    /home/lhunath
}}}

When the subshell terminates, the `cd` command's effect is gone -- we're back where we started.  Likewise, any variables that are set during the subshell are not remembered.  You can think of subshells as temporary shells.  See SubShell for more details.

Note that if the `cd` failed in that example, the `exit 1` would have terminated the subshell, but ''not'' our interactive shell.  As you can guess, this is quite useful in real scripts.


<<Anchor(Command_grouping)>>
=== Command grouping ===

Commands may be grouped together using curly braces.  This looks somewhat like a subshell, but it isn't.  Command groups are executed in the same shell as everything else, rather than a new one.

Command groups can be used to run multiple commands and have a single redirection affect all of them:
{{{
    $ { echo "Starting at $(date)"; rsync -av . /backup; echo "Finishing at $(date)"; } > backup.log 2>&1
}}}

A subshell would have been overkill in this case, because we did not require a temporary environment.  However, a subshell would also have worked.

Command groups are also useful to shorten certain common tasks:
{{{
    $ [[ -f $CONFIGFILE ]] || { echo "Config file $CONFIGFILE not found" >&2; exit 1; }
}}}

Compare that with the more formal version:
{{{
    $ if [[ ! -f $CONFIGFILE ]]; then
    > echo "Config file $CONFIGFILE not found" >&2
    > exit 1
    > fi
}}}

A subshell would not have worked here, because the `exit 1` in a command group terminates the entire shell -- which is what we want here.

Command groups can also be used for setting variables in unusual cases:
{{{
    $ echo "cat
    > mouse
    > dog" > inputfile
    $ { read a; read b; read c; } < inputfile
    $ echo "$b"
    mouse
}}}

It would have been extremely difficult to read the second line from a file without a command group, which allowed multiple `read` commands to read from a single open FD without rewinding to the start each time.  Contrast with this:
{{{
    $ read a < inputfile
    $ read b < inputfile
    $ read c < inputfile
    $ echo "$b"
    cat
}}}
That's not what we wanted at all!

If the command group is on a single line, as we've shown here, then there ''must'' be a semicolon before the closing `}`; otherwise, [[BASH]] would think `}` is an argument to the final command in the group.  If the command group is spread across multiple lines, then the semicolon may be replaced by a newline:
{{{
    $ {
    >  echo "Starting at $(date)"
    >  rsync -av . /backup
    >  echo "Finishing at $(date)"
    > } > backup.log 2>&1
}}}


<<Anchor(Arithmetic_evaluation)>>
=== Arithmetic Evaluation ===

[[BASH]] has several different ways to say we want to do arithmetic instead of string operations.  Let's look at them one by one.

The first way is the `let` command:
{{{
    $ unset a; a=4+5
    $ echo $a
    4+5
    $ let a=4+5
    $ echo $a
    9
}}}

You may use spaces, parentheses and so forth, if you quote the expression:
{{{
   $ let a='(5+2)*3'
}}}

For a full list of available operators, see `help let` or the manual.

Next, the actual ''arithmetic evaluation'' compound command syntax:
{{{
    $ ((a=(5+2)*3))
}}}

This is equivalent to `let`, but we can also use it as a ''command'', for example in an `if` statement:
{{{
    $ if (($a == 21)); then echo 'Blackjack!'; fi
}}}

Operators such as `==`, `<`, `>` and so on cause a comparison to be performed, inside an arithmetic evaluation.  If the comparison is "true" (for example, `10 > 2` is true in arithmetic -- but not in strings!) then the compound command exits with status 0.  If the comparison is false, it exits with status 1.  This makes it suitable for testing things in a script.

Although not a compound command, an ''arithmetic substitution'' (or ''arithmetic expression'') syntax is also available:
{{{
   $ echo "There are $(($rows * $columns)) cells"
}}}

Inside `$((...))` is an ''arithmetic context'', just like with `((...))`, meaning we do arithmetic (multiplying things) instead of string manipulations (concatenating `$rows`, space, asterisk, space, `$columns`).  `$((...))` is also portable to the POSIX shell, while `((...))` is not.

Readers who are familiar with the C programming language might wish to know that `((...))` has many C-like features.  Among them are the ternary operator:
{{{
   $ ((abs = (a >= 0) ? a : -a))
}}}

and the use of an integer value as a truth value:
{{{
   $ if ((flag)); then echo "uh oh, our flag is up"; fi
}}}

Note that we used variables inside `((...))` without prefixing them with `$`-signs.  This is a syntactic shortcut that [[BASH]] allows inside arithmetic evaluations and arithmetic expressions.  POSIX specifies the same shortcut for `$((...))`, although some older or minimal shells do not honor it, so `$(($a + 1))` remains the most portable spelling.

There is one final thing we must mention about `((flag))`.  Because the inside of `((...))` is C-like, a variable (or expression) that evaluates to ''zero'' will be considered ''false'' for the purposes of the arithmetic evaluation.  Then, because the evaluation is false, it will ''exit'' with a status of 1.  Likewise, if the expression inside `((...))` is ''non-zero'', it will be considered ''true''; and since the evaluation is true, it will ''exit'' with status 0.  This is potentially ''very'' confusing, even to experts, so you should take some time to think about this.  Nevertheless, when things are used the way they're intended, it makes sense in the end:
{{{
    $ flag=0      # no error
    $ while read line; do
    >   if [[ $line = *err* ]]; then flag=1; fi
    > done < inputfile
    $ if ((flag)); then echo "oh no"; fi
}}}
-Line 26:
+Line 147:
-Functions are very nifty in bash scripts. They are effectively no different than normal lines of code you might write, except they only get called when you decide to call it. Take this for example:
+Functions are very nifty inside bash scripts. They are blocks of commands, much like normal scripts you might write, except they don't reside in separate files.  However, they take arguments just like scripts -- and unlike scripts, they can affect variables inside your script, if you want them to.  Take this for example:
-Line 30:
+Line 151:
-    >   echo $1 + $2 = $(($1 + $2))
+    >   echo "$1 + $2 = $(($1 + $2))"
-Line 33:
+Line 154:
-This will do absolutely nothing when run. This is because it has only been stored in memory, much like a variable, but it has never been called. To run the function, you would do this:
+This will do absolutely nothing when run. This is because it has only been stored in memory, much like a variable, but it has not yet been called. To run the function, you would do this:
-Line 41:
+Line 162:
-A note on scope: if you choose to embed functions within script files, as many will find more convenient, then you need to understand that the parameters you pass to the script are not necessarily the parameters that are passed to the function. To wrap this function inside a script, we would write a file contain this:
+A note on scope: if you choose to embed functions within script files, as many will find more convenient, then you need to understand that the parameters you pass to the script are not necessarily the parameters that are passed to the function. To wrap this function inside a script, we would write a file containing this:
-Line 46:
+Line 167:
-        echo $1 + $2 = $(($1 + $2))
+        echo "$1 + $2 = $(($1 + $2))"
-Line 51:
+Line 172:
+Functions serve a few purposes in a script.  The first is to isolate a block of code that performs a specific task, so that it doesn't clutter up other code.  This helps you make things more readable, when done in moderation.  (Having to jump all over a script to track down 7 functions to figure out what a single command does has the opposite effect, so make sure you do things that make sense.)  The second is to allow a block of code to be reused with slightly different arguments.

Here's a slightly less silly example:
{{{#!nl
#!/bin/bash
open() {
    case "$1" in
        *.mp3|*.ogg|*.wav|*.flac|*.wma) xmms "$1";;
        *.jpg|*.gif|*.png)              display "$1";;
        *.avi|*.mpg|*.mp4|*.wmv)        mplayer "$1";;
    esac
}
for file; do
    open "$file"
done
}}}

Here, we define a ''function'' named `open`.  This function is a block of code that takes a single argument, and based on the ''pattern'' of that argument, it will either run `xmms`, `display` or `mplayer` with that argument.  Then, a `for` loop iterates over all of the ''script's'' positional parameters.  (Remember, `for file` is equivalent to `for file in "$@"` and both of them iterate over the full set of positional parameters.)  The `for` loop calls the `open` function for each parameter.

As you may have observed, the function's parameters are different from the script's parameters.

Functions may also have ''local variables'', declared with the `local` or `declare` builtins.  This lets you do work without potentially overwriting important variables from the caller's namespace.  For example,
{{{
    count() {
        local i
        for ((i=1; i<=$1; i++)); do echo $i; done
        echo 'Ah, ah, ah!'
    }
    for ((i=1; i<=3; i++)); do count $i; done
}}}
The `local` variable `i` is stored separately from the variable `i` in the outer script.  This allows the two loops to operate without interfering with each other's counters.

Functions may also call themselves ''recursively'', but we won't show that today.  ''Maybe later!''
-Line 56:
+Line 211:
-=== Aliases (stub) ===

''Feel free to complete this section.''
+=== Aliases ===

Aliases are superficially similar to functions, but on closer examination they behave entirely differently.
 * Aliases are not expanded in scripts by default.  [[BASH]] expands them only in interactive shells, unless the `expand_aliases` shell option is set with `shopt`.
 * Aliases cannot take arguments.
 * Aliases will not invoke themselves recursively.
 * Aliases cannot have local variables.

Aliases are essentially keyboard shortcuts intended to be used in `.bashrc` files to make your life easier.  They usually look like this:
{{{
    $ alias ls='ls --color=auto'
}}}

[[BASH]] checks the first word of every simple command to see whether it's an ''alias'', and if so, it does a simple text replacement.  Thus, if you type
{{{
    $ ls /tmp
}}}
[[BASH]] acts as though you had typed
{{{
   $ ls --color=auto /tmp
}}}

If you wanted to duplicate this functionality with a function, it would look like this:
{{{
   $ unalias ls
   $ ls() { command ls --color=auto "$@"; }
}}}

As with a ''command group'', we need a `;` before the closing `}` of a function if we write it all in one line.  The built-in command `command` makes the shell skip function lookup, so our function does '''not''' call itself recursively; instead, it calls the `ls` command that would have run if there hadn't been a function by that name.

Aliases are useful as long as you don't try to make them work like functions.  If you need complex behavior, use a function instead.

--------
 . '''In the manual: ...'''
----
 . '''In the FAQ: ...'''
--------