Introduction

All the information here is presented without any warranty or guarantee of accuracy. Use it at your own risk. When in doubt, please consult the man pages or the GNU info pages as the authoritative references.

["BASH"] is a BourneShell compatible shell, which adds many new features to its ancestor. Most of them are available in the 'KornShell', too.

TableOfContents


Anchor(about)

About This Guide

This guide aims to become a point of reference for people interested in learning to work with ["BASH"]. It aspires to teach its readers good practice techniques in developing scripts for the ["BASH"] interpreter and educate them about the internal operation of ["BASH"].

This guide is targetted at beginning users. It assumes no basic knowledge, but rather expects you to have enough common sense to put two and two together. If something is unclear to you, you should report this so that it may be clarified in this document for future readers.

You are invited to contribute to the development of this document by extending it or correcting invalid or incomplete information.

Anchor(definition)

A Definition

["BASH"] is an acronym for Bourne Again Shell. It is based on the Bourne shell and is mostly compatible with its features.

Shells are applications that provide users with the ability to give commands to their operating system interactively, or to allow them to execute batch processes quickly. In no way are they required for execution of processes; they are merely a layer between system function calls and the user.



Anchor(using)

Using Bash

Most users that think of ["BASH"] think of it as a prompt and a command line. That is ["BASH"] in interactive mode. ["BASH"] can also run in non-interactive mode through scripts. We can use scripts to automate certain logic. Scripts are basically lists of commands that you can type on the command line. When such a script is executed, all these commands are executed sequentially, one after another.

We'll start with the basics in an interactive shell. Once you're familiar with those, you can put them together in scripts.



Anchor(basics)

The Basics

Anchor(commands)

Commands And Arguments

["BASH"] reads commands from its input (which is either a terminal or a file). In interactive mode, its input is your terminal. These commands can be aliases, functions, builtin commands, or executable applications.

Each command can be followed by arguments. It is very important that you understand how this works exactly. If you don't grasp these concepts well, the quality of your code will degrade significantly and you will introduce very dangerous bugs. So, pay close attention in the next few chapters.

    $ ls
    a  b  c

ls is a command that lists files in the current directory.

    $ mkdir d
    $ cd d
    $ ls

mkdir is a command that creates a new directory. We specified the argument d to that command. This way, the application mkdir is instructed to create a directory called d. After that, we use the application cd to change the current directory to d. ls shows us that the current directory (which is now d) is empty, since it doesn't display any filenames.


    $ type rm
    rm is hashed (/bin/rm)
    $ type cd
    cd is a shell builtin



Anchor(splitting)

Commandline Argument Splitting

Commands in ["BASH"] can take multiple arguments. These arguments are used to tell the command exactly what it's supposed to do. In ["BASH"], you separate these arguments by whitespace (spaces, tabs and newlines).

    $ ls
    $ touch a b c
    $ ls
    a  b  c

touch is an application that changes the 'Last Modified'-time of a certain file to the current time. If the filename that it's given does not exist yet, it simply creates that file, as a new and empty file. In this example, we passed three arguments. touch creates a file for each argument. ls shows us that three files have been created.

    $ rm *
    $ ls
    $ touch a   b c
    $ ls
    a  b  c

rm is an application that removes all the files that it was given. * is a glob. It basically means all files in the current directory. You will read more about this later on.

Now, did you notice that there are several spaces between a and b, and only one between b and c? Also, notice that the files that were created by touch are no different than the first time. You now know that the amount of whitespace between arguments does not matter. This is important to know. For example:

    $ echo This is a test.
    This is a test.
    $ echo This    is    a    test.
    This is a test.

In this case, we provide the echo command with four arguments. 'This', 'is', 'a' and 'test.'. echo takes these arguments, and prints them out one by one with a space in between. In the second case, the exact same thing happens. The extra spaces make no difference. To protect the whitespace properly, we need to pass the sentence as one single argument. We can do this by using quotes:

    $ echo "This    is    a    test."
    This    is    a    test.

Quotes group everything together and pass it as a single argument. This argument is 'This is a test.', properly spaced. echo prints this single argument out just like it always does.

Be very careful to avoid the following:

    $ ls
    The secret voice in your head.mp3  secret
    $ rm The secret voice in your head.mp3
    rm: cannot remove `The': No such file or directory
    rm: cannot remove `voice': No such file or directory
    rm: cannot remove `in': No such file or directory
    rm: cannot remove `your': No such file or directory
    rm: cannot remove `head.mp3': No such file or directory
    $ ls
    The secret voice in your head.mp3

You need to make sure you quote filenames properly. If you don't you'll end up deleting the wrong things! rm takes filenames as arguments. If you do not quote filenames with spaces, rm thinks that each argument is another file. Since ["BASH"] splits your arguments at the spaces, rm will try to remove each word.

Please have a good look at http://bash-hackers.org/wiki/doku.php?id=syntax:words if all this isn't very clear to you yet.




Anchor(patterns)

Patterns

Patterns are strings that are used to match a whole range of strings. They have a special format depending on the pattern dialect which describes the kinds of strings that they match. Regular Expression patterns can even be used to grab certain pieces out of the strings they match.

On the command line you will mostly use Glob Patterns. They are a fairly straight-forward form of patterns that can easily be used to match a range of files.

Since version 3.0, ["BASH"] also supports Regular Expression patterns. These will be useful mainly in scripts to test user input or parse data.

=== Glob Patterns ==

Globs are a very important concept in ["BASH"]; if only for their incredible convenience. Properly understanding globs will benefit you in many ways. Globs are basically patterns that can be used to match filenames or other strings.

Globs are composed of normal characters and meta characters. Meta characters are characters that have a special meaning. These are the basic meta characters:

Here's an example of how we can use glob patterns to expand to filenames:

    $ ls
    a  abc  b  c
    $ echo *
    a abc b c
    $ echo a*
    a abc

["BASH"] sees the glob, for example a*. It expands this glob, by looking in the current directory and matching it against all files there. Any filenames that match the glob, are enumerated and replaced by the glob. As a result, the statement echo a* is replaced by the statement echo a abc, and is then executed.

["BASH"] will always make sure that whitespace and special characters are escaped properly when expanding the glob. For example:

    $ touch "a b.txt"
    $ ls
    a b.txt
    $ rm *
    $ ls

Here, rm * is expanded into rm a\ b.txt. This makes sure that the string a b.txt is passed as a single argument to rm, since it represents a single file. It is important to understand that using globs to enumerate files is nearly always a better idea than using ls for that purpose. Here's an example with some more complex syntax which we will cover later on, but it will illustrate the problem very well:

    $ ls
    a b.txt
    $ for file in `ls`; do rm "$file"; done
    rm: cannot remove `a': No such file or directory
    rm: cannot remove `b.txt': No such file or directory
    $ for file in *; do rm "$file"; done
    $ ls

Here we use the for command to go through the output of the ls command. The ls command results in a string a b.txt. The for command splits that string into arguments over which it iterates. As a result, for iterates over a and b.txt. Naturally, this is not what we want. The glob however expands in the proper form. It results in the string a\ b.txt, which for takes as a single argument.

["BASH"] also supports a feature called Extended Globs. These globs are more powerful in nature. This feature is turned off by default, but can be turned on with the shopt command, which is used to toggle shell options:

    $ shopt -s extglob

The list inside the parentheses is a list of globs separated by the | character. Here's an example:

    $ ls
    names.txt  tokyo.jpg  california.bmp
    $ echo !(*jpg|*bmp)
    names.txt

Our glob now expands to anything that does not match the *jpg or the *bmp pattern. Only the text file passes for that, so it is expanded.

Then, there is Brace Expansion. Brace Expansion technically does not fit in the category of Globs, but it is similar. Globs only expand to actual filenames, where brace expansion will expand to any permutation of the pattern. Here's how they work:

    $ echo th{e,a}n
    then than
    $ echo {/home/*,/root}/.*profile
    /home/axxo/.bash_profile /home/lhunath/.profile /root/.bash_profile /root/.profile
    $ echo {1..9}
    1 2 3 4 5 6 7 8 9
    $ echo {0,1}{0..9}
    00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19





Regular Expressions

Regular Expressions (regex) are similar to Glob Patterns but cannot be used for filename matching in ["BASH"]. Since 3.0 ["BASH"] supports the =~ operator to the [[ built-in. This operator matches the string that comes before it against the pattern that follows it. When the string matches the pattern, [[ returns with an exit code of 0. If the string does not match the pattern, an exit code of 1 is returned. In case the pattern's syntax is invalid, [[ will abort the operation and return an exit code of 2.

["BASH"] uses the Extended Regular Expression (ERE) dialect. I will not teach you about regex in this guide, but if you are interested in this concept; please read up on [http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html#tag_09_04 Extended Regular Expressions] or Google for a tutorial.

Regular Expression patterns that use capturing groups will have their captured strings assigned to the BASH_REMATCH variable for later retrieval.

Let's illustrate how regex can be used in ["BASH"]:

    $ if [[ $LANG =~ (..)_(..) ]]
    > then echo "You live in ${BASH_REMATCH[2]} and speak ${BASH_REMATCH[1]}."
    > else echo "Your locale was not recougnised"
    > fi

Be aware that regex parsing in ["BASH"] has changed between releases 3.1 and 3.2. Before 3.2 it was safe to wrap your regex pattern in quotes but this has changed in 3.2. Since then, regex should always be unquoted. You should protect any special characters by escaping it using a backslash. Since the way regex is used in 3.2 is also valid in 3.1 we highly recommend you just never quote your regex.

    $ [[ "My sentence" =~ My\ sentence ]]

Be careful to escape any characters that the shell could misinterprete; such as whitespace, dollar signs followed by text, braces, etc.

Anchor(characters)

Special Characters

There are several special characters in ["BASH"] that have a non-literal meaning. When we use these characters, ["BASH"] evaluates these characters and their meaning, but usually does not pass them on to the underlying commands.

Here are a few of those special characters, and what they do:

Some examples:

    $ echo "I am $USER"
    I am lhunath
    $ echo 'I am $USER'
    I am $USER
    $ # boo
    $ echo An open\ \ \ space
    An open   space
    $ echo "My computer is $(hostname)"
    My computer is Lyndir
    $ echo boo > file
    $ echo $(( 5 + 5 ))
    10
    $ (( 5 > 0 )) && echo "Five is bigger than zero."
    Five is bigger than zero.



Anchor(parameters)

Parameters

Parameters should be seen as a sort of named space in memory where you can store your data. Generally speaking, they will store string data, but can also be used to store integers or arrays.

Special Parameters and Variables

Let's get your vocabulary straight before we get into the real deal. There are Parameters and Variables. Variables are actually just a kind of parameters. Parameters that are denoted by a name. Parameters that aren't variables are called Special Parameters I'm sure you'll understand things better with a few examples:

    $ # Some parameters that aren't variables:
    $ echo My shell is $0, and was started with these options: $-
    My shell is -bash, and was started with these options: himB
    $ # Some parameters that ARE variables:
    $ echo I am $USER, and I live at $HOME.
    I am lhunath, and I live at /home/lhunath.

Please note: Unlike PHP/Perl/... parameters do NOT start with a $-sign. The $-sign you see in the examples merely causes the parameter that follows it to be expanded. Expansion basically means that the shell replaces it by its content. As such, USER is the parameter (variable), that contains your username. $USER will be replaced with its content; which in my case, is lhunath.

I think you've got the drift now. Here's a summary of most Special Parameters:

And here are some examples of Variables that the shell initializes for you:

Of course, you aren't restricted to only these variables. Feel free to define your own:

    $ country=Canada
    $ echo "I am $USER and I currently live in $country."
    I am lhunath and I currently live in Canada.

Notice what we did to assign the value Canada to the variable country. Remember that you are NOT allowed to have any spaces before or after that equals sign!

    $ language = PHP
    -bash: language: command not found
    $ language=PHP
    $ echo "I'm far too used to $language."
    I'm far too used to PHP.

Remember that ["BASH"] is not Perl or PHP. You need to be very well aware of how expansion works to avoid big trouble. If you don't, you'll end up creating very dangerous situations in your scripts, especially when making this mistake with rm:

    $ ls
    no secret  secret
    $ file='no secret'
    $ rm $file
    rm: cannot remove `no': No such file or directory

Imagine we have two files, no secret and secret. The first contains nothing useful, but the second contains the secret that will save the world from impending doom. Unthoughtful as you are, you forgot to quote your parameter expansion of file. ["BASH"] expands the parameter and the result is rm no secret. ["BASH"] splits the arguments up by their whitespace as it normally does, and rm is passed two arguments; 'no' and 'secret'. As a result, it fails to find the file no and it deletes the file secret. The secret is lost!





Variable Types

Although ["BASH"] is not a typed language, it does have a few different types of variables. These types define the kind of content they have. They are stored in the variable's attributes.

Attributes are settings for a variable. They define the way the variable will behave. Here are the attributes you can assign to a variable:

Arrays are basically lists of strings. They are very convenient for their ability to store different elements without relying on a delimiter. That way, you don't need to worry about the fact that this delimiter could possibly end up being part of an element's content and thus split that element up:

    $ files='one:two:three:four'

Here we try to use a string to contain a list of files. To do that, we need to rely on a delimiter to keep the files apart. We choose ':'. As a result, we cannot add any files to the list that have a ':' in their filename. That's why arrays are so convenient:

    $ files=( 'one' 'two' 'three' 'four' '5: five' )

As shown above, you can assign arrays using (...). In this case, elements are separated by whitespace; but you can protect an element's whitespace with quotes. If you want to use some form of expansion to assign values to an array, rather than literal, be aware that ["BASH"] will obviously need to perform some form of word splitting to figure out which parts of your expansion should be put in which elements of the array:

    $ files='one:two:three:four'
    $ IFS=:
    $ files=( $files )

For this word splitting, ["BASH"] looks at the first character in IFS again. There, it finds the delimiter to use for splitting the result of the expansion up in elements.

Defining variables as integers has the advantage that you can leave out some syntax when trying to assign or modify them:

    $ a=5; a+=2; echo $a; unset a
    52
    $ a=5; let a+=2; echo $a; unset a
    7
    $ declare -i a=5; a+=2; echo $a; unset a
    7
    $ a=5+2; echo $a; unset a
    5+2
    $ declare -i a=5+2; echo $a; unset a
    7

Parameter Expansion

Parameter Expansion is the term that refers to any operation that causes a parameter to be expanded (replaced by content). In its most basic appearance, the parameter expansion of a parameter is achieved by prefixing that parameter with a $ sign. In certain situations, additional curly brackets around the parameter's name are required:

    $ echo "'$USER', '$USERs', '${USER}s'"
    'lhunath', '', 'lhunaths'

This example illustrates what basic parameter expansions look like. The second PE results in an empty string. That's because the parameter USERs is empty. We did not intend to have the s be part of the parameter name. Since there's no way for ["BASH"] to determine whether you want the s appended to the name of the parameter or its value you need to use curly brackets to mark the beginning and end of the parameter name. That's what we do in the third PE in our example above.

Parameter Expansion can also be used to modify the string that will be expanded. These operations are terribly convenient:

    $ for file in *.JPG *.jpeg
    > do mv "$file" "${file%.*}.jpg"
    > done

The code above can be used to rename all JPEG files with a JPG or a jpeg extension to have a normal jpg extension. The PE (${file%.*}) cuts off everything from the end until it finds a period (.). Then, in the same quotes, a new extension is appended to the expansion result.

Here's a summary of most PEs that are available:

You will learn them all through experience. They will come in handy far more often than you think they might. Here's a few examples to kickstart you:

    $ file="$HOME/.secrets/007"; \
    > echo "File location: $file"; \
    > echo "Filename: ${file##*/}"; \
    > echo "Directory of file: ${file%/*}"; \
    > echo "Non-secret file: ${file/secrets/not_secret}"; \
    > echo; \
    > echo "Other file location: ${other:-There is no other file}"; \
    > echo "Using file if there is no other file: ${other:=$file}"; \
    > echo "Other filename: ${other##*/}"; \
    > echo "Other file location length: ${#other}"
    File location: /home/lhunath/.secrets/007
    Filename: 007
    Directory of file: /home/lhunath/.secrets
    Non-secret file: /home/lhunath/.not_secret/007
    
    Other file location: There is no other file
    Using file if there is no other file: /home/lhunath/.secrets/007
    Other filename: 007
    Other file location length: 26

Remember the difference between ${v#p} and ${v##p}. The doubling of the # character means metacharacters will become greedy. The same counts for % and /:

    $ version=1.5.9; echo "MAJOR: ${version%%.*}, MINOR: ${version#*.}."
    MAJOR: 1, MINOR: 5.9.
    $ echo "Dash: ${version/./-}, Dashes: ${version//./-}."
    Dash: 1-5.9, Dashes: 1-5-9.

Note: You cannot nest PEs together. If you need to execute multiple PEs on a parameter, you will need to use multiple statements:

    $ file=$HOME/image.jpg; file=${file##*/}; echo "${file%.*}"
    image





Anchor(conditionals)

Tests and Conditionals

Sequential execution of applications is one thing, but to achieve a sort of logic in your scripts or your command line one-liners, you'll need variables and conditionals. Conditionals are used to determine the execution flow of a script.

Anchor(exitcode)

Exit Status

Every application results in an exit code whenever it terminates. This exit code is used by whatever application started it to evaluate whether everything went OK. This exit code is like a return value from functions. It's an integer between 0 and 255 (inclusive). Convention dictates that we use 0 to denote success, and any other number to denote failure of some sort. The specific number is entirely application-specific, and is used to hint as to what exactly went wrong.

For example, the ping command sends ICMP packets over the network to a certain host. That host normally responds to this packet by sending the exact same one right back. This way, we can check whether the remote host can receive our packets. ping has a range of exit codes which can tell us what went wrong, if anything did:

From the ping manual:

The parameter ? shows us the exit code of the last foreground process that terminated. Let's play around a little with ping to see it's exit codes:

    $ ping God
    ping: unknown host God
    $ echo $?
    2

    $ ping -c 1 -W 1 1.1.1.1
    PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
    
    --- 1.1.1.1 ping statistics ---
    1 packets transmitted, 0 received, 100% packet loss, time 0ms
    $ echo $?
    1


    rm file || { echo "Could not delete file!"; exit 1; }



Anchor(operators)

Control Operators

Now that we know what exit codes are, and that an exit code of '0' means the command's execution was successful, we'll learn to use this information. The easiest way of performing a certain action depending on the success of a previous command is through the use of 'control operators'. These operators are && and ||, which respectively represent a logical AND and a logical OR. These operators are used inbetween two commands, and they are used to control whether the second command should be executed depending on the success of the first.

Let's put that theory in practice:

    $ mkdir d && cd d

This simple example has two commands, mkdir d and cd d. You could easily just use a semi-colon there to separate both commands and execute them sequentially; but we want something more. In the above example, ["BASH"] will execute mkdir d, then && will check the result of the mkdir application as it finishes. If the mkdir application resulted in a success (exit code 0), then && will execute the next command, cd d. If mkdir d failed, and returned a non-0 exit code, && will skip the next command, and we will stay in the current directory.

Another example:

    $ rm /etc/some_file.conf || echo "I couldn't remove the file!"

    rm: cannot remove `/etc/some_file.conf': No such file or directory
    I couldn't remove the file!

|| is much like &&, but it does the exact opposite. It only executes the next command if the first failed. As such, the message is only echoed if the rm command was unsuccessful.

You can make a sequence with these operators, but you have to be very careful when you do. Remember what exit code the operator is really going to be checking against! Here's an example that might cause confusion:

    $ false && true || echo "Riddle, riddle?"
    Riddle, riddle?

    $ true && false || echo "Riddle, riddle?"
    Riddle, riddle?

true is obviously always going to be successful. false is obviously always going to be unsuccessful. Can you guess why the echo statement is executed in both occasions?

The key to understanding how to sequence these operators properly is by evaluating exit codes from left to right.

In the first example, false is unsuccessful, so && does not execute the next command (which is true), but the next || gets a shot too. || still sees that the last exit code was that from false, and || executes the next command when the previous was unsuccessful. As a result, the echo statement is executed.

The same for the second statement again. true is successful, so the && executes the next statement. That is false; the last exit code now becomes unsuccessful. After that, || is evaluated, which sees the unsuccessful exit code from false and executes the echo statement.

It's all easy with trues and falses; but how about real commands?

    $ rm file && touch file || echo "File not found!"

All seems well with this piece of code, and when you test it, I'm sure you'll see that it actually does what it's supposed to. It tries to delete a file, and if it succeeds, it creates it again as a new and empty file; if something goes wrong we get the error message. What's the catch?

Perhaps you guessed, perhaps not, but here's a hint: Imagine we're in a directory where we don't have permission to create a file? It won't stop us from deleting the file if the file happens to be ours. rm will succeed in deleting our file, but touch will fail to create it anew because of permission issues. As a result, we get a strange error message saying that the file wasn't found while we were actually trying to create it. What's up with that?




Anchor(if)

If-statements

if is an application that executes the command that it receives as argument, and checks that command's exit code to see whether its execution was successful. Depending on that exit code, if executes a specific block of code.

    $ if true
    > then echo "It was true."
    > else echo "It was false!"
    > fi

    It was true.

Here you see the basic outline of an if-statement. We start by calling if with the argument true. true is a sort of built-in application, running it is like running an application that always ends successfully. if runs that application, and once the application's done, it checks the exit code. Since true always exits successfully, if continues to the then-block, and executes its code. Should the true application have failed somehow, and returned an unsuccessful exit code, the if statement would have skipped the then code, and executed the else code block instead.

There are commands that can help us a lot in doing conditional checks. They are [ (also named test) and [[. [ is a normal application that reads its arguments and does some checks with them. [[ is much like [, however, it is not an application. It's a built-in and it offers far more versatillity. Let's get practical:

    $ if [ a = b ]
    > then echo "a is the same as b."
    > else echo "a is not the same as b."
    > fi

    a is not the same as b.

if executes the command [ with the arguments 'a', '=', 'b' and ']'. [ uses these arguments to determine what must be checked. It then checks whether the string 'a' is identical to the string 'b', and if this is the case, it will exit successfully. However, since we know this is not the case, [ will not exit successfully (it's exit code will be 1). if sees that [ terminated unsuccessfully and executes the code in the else block.

Now, to see why [[ is so much more interesting and trustworthy than [, let us highlight some possible problems with [:

    $ if [ my dad = my dog ]
    > then echo "I have a problem."
    > fi

    -bash: [: too many arguments

Can you guess what caused the problem? BR [ was executed with the arguments 'my', 'dad', '=', 'my', 'dog' and ']'. [ doesn't understand what test it's supposed to execute, because it expects the second argument to be the operator. In our case, the operator is the third argument. Yet another reason why quotes are so terribly important. Whenever we type whitespace in bash that belongs together with the words before or after it, we need to quote the whole string:

    $ if [ 'my dad' = 'my dog' ]
    > then echo "I have a problem."
    > fi

This time, [ sees an operator (=) in the second argument and it can continue with its work. Now, this may be easy to see and avoid, but it gets just a little trickier when we put the strings in variables, rather than literally in the statement:

    $ dad='my dad'; dog='my dog'
    $ if [ $dad = $dog ]
    > then echo "I have a problem."
    > fi

    -bash: [: too many arguments

How did we mess up this time? BR Here's a hint: ["BASH"] takes our if-statement and expands all the parameters in it. The result is if [ my dad = my dog ]. Boom, game over.

Here's how it's supposed to look like:

    $ if [ "$dad" = "$dog" ]
    > then echo "I have a problem."
    > fi

To help us out a little, ["BASH"] introduced a new style of conditional test. Original as the ["BASH"] authors are, they called it [[. [[ was loaded with several very interesting features which are missing from [. One of them helps us in dealing with parameter expansions:

    $ if [[ $dad = $dog ]]
    > then echo "I have a problem."
    > fi
    $ if [[ I want $dad = I want $dog ]]
    > then echo "I want too much."
    > fi

    -bash: conditional binary operator expected
    -bash: syntax error near `want'

This time, $dad and $dog didn't need to be quoted. Since [[ isn't an application (while [ is), but a built-in, it has special magical powers. It parses its arguments before they are expanded by bash and does the expansion itself; taking the result as a single argument, even if that result contains whitespace. However, be aware that simple strings still have to be quoted properly. [[ can't know whether your literal whitespace in the statement is intentional or not; so it splits it up just like ["BASH"] normally would. Let's fix our last example:

    $ if [[ "I want $dad" = "I want $dog" ]]
    > then echo "I want too much."
    > fi

Now that you've got a decent understanding of quoting issues that may arise, let's have a look at some of the other features that [ and [[ were blessed with:

You want some examples? Sure:

    $ test -e /etc/X11/xorg.conf && echo "Your Xorg is configured!"
    Your Xorg is configured!
    $ test -n "$HOME" && echo "Your homedir is set!"
    Your homedir is set!
    $ [[ boar != bear ]] && echo "Boars aren't bears!"
    Boars aren't bears!
    $ [[ boar != b?ar ]] && echo "Boars don't look like bears!"
    $ [[ $DISPLAY ]] && echo "Your DISPLAY variable is not empty, you probably have Xorg running."
    Your DISPLAY variable is not empty, you probably have Xorg running.
    $ [[ ! $DISPLAY ]] && echo "Your DISPLAY variable is not not empty, you probably don't have Xorg running."





Anchor(loops)

Conditional Loops

You've learned how to code some basic logic flow for your scripts. It's important that you understand a thing or two about keeping scripts healthy first.

["BASH"] scripts, much like any other kind of scripts, should never be overrated. Although they have great potential once you fully understand its features; they aren't always the right tool for the job. At the same time, when you make scripts, you should remember to keep them light, both in length and in complexity. Very long and/or very complex scripts are most often also very bad scripts. Those that aren't yet soon will be; because they are always very difficult to maintain and adapt/extend.

A technique that we can use to try and keep code length and complexity down is loops. There are two kinds of loops. Using the correct kind of loop correctly will help you keep your scripts readable and healthy.

["BASH"] supports while loops and for loops. The for loops can appear in three different forms. Here's a summary:

Let's put that in practice; here are some examples to illustrate the differences but also the similarities between the loops:

    $ while true
    > do echo "Infinite loop!
    > done

    $ (( i=10 )); while (( i > 0 ))
    > do echo "$i empty cans of beer."
    > (( i-- ))
    > done

    $ for (( i=10; i > 0; i-- ))
    > do echo "$i empty cans of beer."
    > done

    $ for i in {10..0}
    > do echo "$i empty cans of beer."
    > done

The last three loops achieve exactly the same result; just in a different syntax. You'll encounter this many times in your shell scripting experience. There will nearly always be multiple approaches to solving a problem. The test of your skill soon won't be about solving a problem as much as about how best to solve it. You need to learn to pick the best angle of approach for the job. Usually, the main factors to keep into account will be the simplicity and flexibility of the resulting code. My personal favorite is the latter of the examples. In the example I used Brace Expansion to generate the words; but there are other ways, too.

Let's take a closer look at that last example, because although it looks the easier of both fors, it can often be the trickiest too; if you don't know exactly how it works.

As I mentioned before; it takes one word from a list of words and puts each in the variable, one at a time, then loops through the code with it. The tricky part is how ["BASH"] decides what the words are. Let me explain myself by expanding the braces from that previous example:

    $ for i in 10 9 8 7 6 5 4 3 2 1 0
    > do echo "$i empty cans of beer."
    > done

["BASH"] takes the characters between in and the end of the statement, at splits them up into words. You shouldn't confuse the splitting that happens here with the splitting that happens with Commandline Arguments; even though they look exactly the same at first sight. Commandline arguments are split at spaces, tabs and newlines; while the splitting in this for statement happens at spaces by default. This default behaviour can be changed. The way for determines what delimiter to use for the splitting is by looking into the IFS variable; and taking the first character there. IFS is an acronym for Internal Field Separator; and by default it contains a space, a tab and a newline. Since the space is the first character there, for uses it to split up the words in our sequence; and feeds each word to the variable i; one at a time.

As a result; be VERY careful not to make the following mistake:

    $ ls
    The best song in the world.mp3
    $ for file in $(ls *.mp3)
    > do rm "$file"
    > done

    rm: cannot remove `The': No such file or directory
    rm: cannot remove `best': No such file or directory
    rm: cannot remove `song': No such file or directory
    rm: cannot remove `in': No such file or directory
    rm: cannot remove `the': No such file or directory
    rm: cannot remove `world.mp3': No such file or directory

You should already know to quote the $file in the rm statement; but what's going wrong here? Right. ["BASH"] expands the command substitution ($(ls *.mp3)), replaces it by its output, and as a result executes for file in The best song in the world.mp3. ["BASH"] splits that up in words by using spaces and tries to rm each word. Boom, you are dead.

You want to quote it, you say? Let's add another song:

    $ ls
    The best song in the world.mp3  The worst song in the world.mp3
    $ for file in "$(ls *.mp3)"
    > do rm "$file"
    > done
    rm: cannot remove `The best song in the world.mp3  The worst song in the world.mp3': No such file or directory

Quotes will indeed protect the whitespace in your filenames; but it will do more than that. The quotes will protect all the whitespace from the output of ls. There is no way ["BASH"] can know which parts of the output of ls represent filenames; it's not psychic. The output of ls is a simple string, and ["BASH"] treats it as that for lack of better. The for puts the whole quoted output in i and runs the rm command with it. Damn, dead again.

So what do we do? As suggested earlier; globs are your best friend:

    $ for file in *.mp3
    > do rm "$file"
    > done

This time, ["BASH"] does know which are filenames, and it does know what the filenames are and as such it can split them up nicely. The result of expanding the glob is this: for file in "The best song in the world.mp3" "The worst song in the world.mp3". Problem resolved.

Let's talk about changing that delimiter. Say, you've got yourself a nice cooking recipe, and you want to write a script that tells you how to use it. Sure, let's get right at it:

    $ recipe='2 c. all purpose flour
    > 6 tsp. baking powder
    > 2 eggs
    > 2 c. milk
    > 1/3 c. oil'

    $ for ingredient in $recipe
    > do echo "Take $ingredient; mix well."
    > done

Can you guess what the result will look like? I recommend you run the code if you can't and ponder the reason first. It will help you understand things.

Yes, as explained earlier, for splits its stuff up in words by using the delimiter from IFS.

To read the recipe correctly, we need to split it up by newlines instead of by spaces. Here's how we do that:

    $ recipe='2 c. all purpose flour
    > 6 tsp. baking powder
    > 2 eggs
    > 2 c. milk
    > 1/3 c. oil'
    $ IFS=$'\n'

    $ for ingredient in $recipe
    > do echo "Take $ingredient; mix well."
    > done

    Take 2 c. all purpose flour; mix well.
    Take 6 tsp. baking powder; mix well.
    Take 2 eggs; mix well.
    Take 2 c. milk; mix well.
    Take 1/3 c. oil; mix well.

Beautiful.

Note: This delimiter is only used when the words consist of an expansion. Not when they're literal. Literal words are always split at spaces:

    $ PATH=/bin:/usr/bin
    $ IFS=:

    $ for i in $PATH
    > do echo "$i"
    > done
    /bin
    /usr/bin

    $ for i in $PATH:/usr/local/bin
    > do echo "$i"
    > done
    /bin
    /usr/bin
    /usr/local/bin

    $ for i in /bin:/usr/bin:/usr/local/bin
    > do echo "$i"
    > done
    /bin:/usr/bin:/usr/local/bin

Lets focus a little more on the while loop. It promises even more simplicity than this for loop; so long as you don't need any for specific features. The while loop is very interesting for its capacity of executing commands and basing the loop's progress on the result of them. Here are a few examples of how while loops are very often used:

    $ # The sweet machine; hand out sweets for a cute price.
    $ while read -p $'The sweet machine.\nInsert 20c and enter your name: ' name
    > do echo "The machine spits out three lollipops at $name."
    > done

    $ # Check your email every five minutes.
    $ while sleep 5m
    > do kmail --check
    > done

    $ # Wait for a host to come back online.
    $ while ! ping -c 1 -W 1 "$host"
    > do echo "$host is still unavailable."
    > done; echo -e "$host is available again!\a"




Anchor(io)

Input And Output

This basic principle of computer science applies just as well to applications started through ["BASH"]. ["BASH"] makes it fairly easy to play around with the input and output of commands, which gives us great flexibility and increadible opportunities for automation.

Anchor(fds)

File Descriptors

Input and output from and to processes always occurs via so called File Descriptors (in short: FDs). FDs are kind of like pointers to sources of data. When something reads from or writes to that FD, the data is being read from or written to the FD's data source. FDs can point to regular files, but they can also point to more abstract data sources, like the input and output source of a process.

By default, every new process has three FDs. They are referred to by the names Standard Input, Standard Output and Standard Error. In short, they are respectively called stdin, stdout and stderr. The Standard Input is where the characters you type on your keyboard usually come from. The Standard Output is where the program sends most of its normal information to so that the user can see it, and the Standard Error is where the program sends its error messages to. Be aware that GUI applications work in the same way; but the actual GUI doesn't work via these FDs. GUI applications can still read and write from and to the standard FDs, but they usually don't. Usually, they do all the user interaction via that GUI; making it hard to control for ["BASH"]. As a result, we'll stick to simple console applications. Those we can easily feed data on the "Standard Input" and read data from on its "Standard Output" and "Standard Error".

Let's make these definitions a little more concrete. Here's a demonstration of how "Standard Input" and "Standard Output" work:

    $ read -p "What is your name? " name; echo "Good day, $name.  Would you like some tea?"
    What is your name? lhunath
    Good day, lhunath.  Would you like some tea?

read is a command that reads information from stdin and stores it in a variable. We specified name to be that variable. Once read has read a line of information from stdin, it finished and lets echo display a message. echo uses stdout to send its output to. stdin is connected to your terminal's input device; which is probably going to be your keyboard. stdout is connected to your terminal's output device; which I assume is a computer monitor. As a result; you can type in your name and are then greeted with a friendly message on your monitor, offering you a cup of tea.

So what is stderr? Let's demonstrate:

    $ rm secrets
    rm: cannot remove `secrets': No such file or directory

Unless if you had a file called secrets in your current directory; that rm command will fail and show an error message explaining what went wrong. Error messages like these are by convention displayed on stderr. stderr is also connected to your terminal's output device, just like stdout. As a result, error messages display on your monitor just like the messages on stdout. However, this separation makes it easy to keep errors separated from the application's normal messages. Some people like to use wrappers to make all the output on stderr red, so that they can see the error messages more clearly. This is not generally advisable, but it is a simple example of the many options this separation provides us with.


    echo "Uh oh.  Something went really bad.." >&2


Anchor(redirection)

Redirection

The most basic form of input/output manipulation in ["BASH"] is Redirection. Redirection is used to change the data source or destination of an application's FDs. That way, you can send the application's output to a file instead of the terminal, or have the application read from a file instead of from the keyboard.

Redirection, too, comes in different shapes. There's File Redirection, File Descriptor manipulation, Heredocs and Herestrings.



File Redirection

File Redirection is probably the most basic form of redirection. I'll start with this so you can grasp the concept of redirection well.

    $ echo "The story of William Tell.
    >
    > It was a cold december night.  Too cold to write." > story
    $ cat story
    The story of William Tell.
    
    It was a cold december night.  Too cold to write.

As a result; the echo command will not send its output to the terminal, but the > story operation changes the destination of the stdout FD so that it now points to a file called story. Be aware that before the echo command is executed, ["BASH"] first checks to see whether that file story actually exists. If it doesn't, it is created as an empty file, so that the FD can be pointed to it. This behaviour can be toggled with Shell Options (see later).

We then use the application cat to print out the contents of that file. cat is an application that reads the contents of all the files you pass it as arguments. It then outputs each file one after another on stdout. In essence, it concatenates the contents of all the files you pass it as arguments.

Warning: Far too many code examples and shell tutorials on the Internet tell you to use cat whenever you need to read the contents of a file. This is highly ill-adviced! cat only serves well to contatenate contents of multiple files together, or as a quick tool on the shell prompt to see what's inside a file. You should NOT use cat to read from files in your scripts. There will almost always be far better ways to do this. Please keep this warning in mind. Useless usage of cat will merely result in an extra process to create, and often results in poorer read speed because cat cannot determine the context of what it's reading and the purpose for that data.

When we use cat without passing any kind of arguments, it obviously doesn't know what files to read the content for. In this case, cat will just read from stdin instead of from a file (much like read). Since stdin is normally not a regular file, starting cat without any arguments will seem to do nothing:

    $ cat

It doesn't even give you back your shell prompt! What's going on? cat is still reading from stdin, which is your keyboard. Anything you type now will be sent to cat. As soon as you hit the Enter key, cat will do what it normally does; it will display what it reads on stdout, just the same way as when it displayed our story on stdout:

    $ cat
    test?
    test?

Why does it say test? twice now? Well, as you type, your terminal shows you all the characters that you send to stdin before sending them there. That results in the first test? that you see. As soon as you hit Enter, cat has read a line from stdin, and shows it on stdout, which is also your terminal; hence, resulting in the second line: test?. You can press Ctrl+D to send cat the End of File character. That'll cause cat to think the file stdin has closed. It will stop reading from it and return you to your prompt. Let's use file redirection to attach a file to stdin, so that stdin is no longer reading from our keyboard, but instead, now reads from the file:

    $ cat < story
    The story of William Tell.
    
    It was a cold december night.  Too cold to write.

The result of this is exactly the same as the result from our previous cat story; except this time, the way it works is a little different. In our first example, cat opened an FD to the file story and read its contents through that FD. In this recent example, cat simply reads from stdin, just like it did when it was reading from our keyboard. However, this time, the < story operation has modified stdin so that its data source is the file story rather than our keyboard.

Let's summarize:

Redirection operators can take a number. That number denotes the FD that it changes. If the number is not present, the > operator uses FD 1 by default, because that is the number for stdout. < uses FD 0 by default, because that is the number for stdin. The number for the stderr FD is 2. So, let's try sending the output of stderr to a file:

    $ for homedir in /home/*
    > do rm "$homedir/secret"
    > done 2> errors

In this example, we're looping over each file in /home. We then try to delete the file secret in each of them. Some homedirs may not have a secret. As a result, the rm operation will fail and send an error message on stderr.

You may have noticed that our redirection operator isn't on rm, but it's on that done thing. Why is that? Well, this way, the redirection applies to all output to stderr made inside the whole loop.

Let's see what the result of our loop was?

    $ cat errors
    rm: cannot remove `/home/axxo/secret': No such file or directory
    rm: cannot remove `/home/lhunath/secret': No such file or directory

Two error messages in our error log file. Two people that didn't have a secret file in their home directory.

If you're writing a script, and you expect that running a certain command may fail on occasion, but don't want the script's user to be bothered by the possible error messages that command may produce, you can silence an FD. Silencing it is as easy as normal File Redirection. We're just going to send all output to that FD into the system's black hole:

    $ for homedir in /home/*
    > do rm "$homedir/secret"
    > done 2> /dev/null

The file /dev/null is always empty, no matter what you write or read from it. As such, when we write our error messages to it, they just disappear. The /dev/null file remains as empty as ever before. That's because it's not a normal file, it's a virtual device.

There is one last thing you should learn about File Redirection. It's interesting that you can make error log files like this to keep your error messages; but as I mentioned before, ["BASH"] makes sure that the file exists before trying to redirect to it. ["BASH"] also makes sure the file is empty before redirecting to it. As a result, each time we run our loop to delete secret files, our log file will be truncated empty before we fill it up again with new error messages. What if we'd like to keep a record of any error messages generated by our loop? What if we don't want that file to be truncated each time we start our loop? The solution is achieved by doubling the redirection operator. > becomes >>. >> will not empty a file, it will just append new data to the end of it!

    $ for homedir in /home/*
    > do rm "$homedir/secret"
    > done 2>> errors

Hooray!




File Descriptor Manipulation

Now that you know how to manipulate process input and output by sending it to and reading it from files; let's make this a little more interesting still.

It's possible to change the source and desination of FDs to point to or from files, that you know. It's also possible to copy one FD to another. Let's prepare a simple testbed:

    $ echo "I am a proud sentence." > file

We've made a file called file, and written a proud sentence into it. It's time I introduce a new application to you. Its name is grep, and it's increadibly powerful. grep is that one thing that you need more than anything else in your household. It basically takes a search string as its first argument and one or more files as extra arguments. Just like cat, grep also uses stdin if you don't specify any files as extra arguments. grep reads the files (or stdin if none were provided) and searches for the search string you gave it. Most versions of grep even support a -r switch, which makes it take directories as well as files as extra arguments, and then searches all the files and directories in those directories that you gave it. Here's an example of how grep can work:

    $ ls house/
    house/drawer  house/closet  house/dustbin  house/sofa
    $ grep -r socks house/
    house/sofa:socks

In this silly example we have a directory called house with several pieces of furniture in it as files. If we're looking for our socks in each of those files, we send grep to search the directory house/. grep will search everything in there, open each file and look through its contents. In our example, grep finds socks in the file house/sofa; presumably tucked away under a pillow. You want a more realistic example? Sure:

    $ grep "$HOSTNAME" /etc/*
    /etc/hosts:127.0.0.1       localhost Lyndir

Here we instruct grep to search for whatever $HOSTNAME expands to in whatever /etc/* expands to. It finds my hostname, which is Lyndir in the file /etc/hosts, and shows me the line in that file that contains the search string.

OK, now that you understand grep, let's continue with our File Descriptor Manipulation. Remeber that we created a file called file, and wrote a proud sentence to it? Let's use grep to find where that proud sentence is now:

    $ grep proud *
    file:I am a proud sentence.

Good, grep found our sentence in file. It writes the result of its operation to stdout which is shown on our terminal. Now let's see if we can make grep send an error message, too:

    $ grep proud file 'not a file'
    file:I am a proud sentence.
    grep: not a file: No such file or directory

This time, we instruct grep to search for the string proud in the files 'file' and 'not a file'. file exists, and the sentence is in there, so grep happily writes the result to stdout. It moves on to the next file to scan, which is 'not a file'. grep can't open this file to read its content, because it doesn't exist. As a result, grep emits an error message on stderr which is still connected to our terminal.

Now, how would you go about silencing this grep statement completely? We'd like to send all the output that appears on the terminal to a file instead; let's call it proud.log:

    $ grep proud file 'not a file' > proud.log 2> proud.log

Does that look about right? We first use > to send stdout to proud.log, and then use 2> to send stderr to proud.log as well. Almost, but not quite. If you run this command and then look in proud.log, you'll see there's only an error message, not the output from stdout. We've created a very bad condition here. After stdout has written its output to the log file, the error message needs to be written to the log file. The stderr redirection truncates the log file and writes its own information there. Things would go even more wrong if after that new information arrives on stdout. stdout's write operation might cancel an active write operation from stderr to continue writing its own information.

We need to prevent having two FDs working on the same destination or source. We can do this by dublicating FDs:

    $ grep proud file 'not a file' > proud.log 2>&1

You need to remember to always read file redirections from right to left. This is the order in which ["BASH"] assigns and processes them. First, stdout is changed so that it points to our proud.log. Then, we use the >& syntax to dublicate FD 1 and put this dublicate in FD 2's stead. If this is hard for you to grasp, you could read this as: stdout becomes proud.log and stderr becomes stdout (which is proud.log). As a result, stdout obviously writes its information to proud.log, but stderr writes its information to whatever stdout was when the >& redirector was used; which was proud.log as well. In this case, the handle that writes to proud.log is the same for both stdout and stderr, and no collisions occur.

Be careful not to confuse the order:

    $ grep proud file 'not a file' 2>&1 > proud.log

This would read as stderr becomes stdout (which is the terminal) and stdout becomes proud.log. As a result, stdout's messages will be logged, but the error messages will still go to the terminal. Oops.

Note: BR For compatibility reasons with other shells, ["BASH"] also makes yet another form of redirection available to you. The &> redirection operator is actually just a shorter version of what we did here; redirecting both stdout and stderr to a file:'

    $ grep proud file 'not a file' &> proud.log

This is the same as > proud.log 2>&1, just a bit shorter. You pick.

TODO: Moving FDs and Opening FDs RW.




Heredocs And Herestrings

Files aren't all that. They're boring, really. Strings are so much more interesting. They're not permanent like files on a hard disk, but they're easy to work with, easy to make an easy to manipulate.

Heredocs and Herestrings allow you to perform Redirection as you would with files, just by using strings instead. Let's try it out!

    $ grep proud <<END
    > I am a proud sentence.
    > END
    I am a proud sentence.

This is a Heredoc. Heredocs aren't really useful unless if you're trying to embody long strings of several lines inside your scripts; which is very bad practice. The way they work, is by adding the <<STRING operator at the end of a command. That'll tell that command's stdin that it has to start reading input from the script (or the command line, if you're not in a script). The input of the Heredoc stops as soon as you repeat whatever string you used to add to the end of the <<. In the example above, I used the string END; but it can really be anything (so long as you quote it if it has whitespace).

Let's check out the very similar but more interesting Herestrings:

    $ grep proud <<<"I am a proud sentence"
    I am a proud sentence.

This time, stdin reads its information straight from the string you put after the <<< operator. This is very convenient to send data that's in variables into processes:

    $ grep proud <<<"$USER sits proudly on his throne in $HOSTNAME."
    lhunath sits proudly on his throne in Lyndir.


  • Good Practice: BR Heredocs are usually a bad idea because scripts should contain logic, not data. If you have a document that your script needs; you should ship it in a separate file along with your script. Herestrings, however, come in handy quite often; especially for sending variable content to processes like grep or sed instead of files.