The Bash Parser

This page informally describes parsing, expansion, and argument handling, but blurs some important distinctions that depend upon the type of command being handled. See Bash grammar and Parsing and execution on bash-hackers for a better look at this. It is important to have a good understanding of how Bash reads your commands and parses them into executable code, but it is even more important to understand the grammar of the language than implementation-specific parser details.

Parsing

There are several stages of parsing which occur in multiple passes: on the level of entire script files; within individual commands; and line-by-line. Code undergoes several intermediary internal representations throughout the evaluation process, some of which can't be analyzed through Bash's debugging facilities.

During Bash's initial intake of code (as it reads source files or interactive input), commands are parsed both line-by-line and command-by-command. Certain aspects of parsing are closely tied to lines: HereDocument parsing, some error handling behaviours, and some details of metacharacter parsing (e.g. extglobs) all depend on newlines. The extent to which Bash deals in "lines" is unclear, and there is considerable variation across different shells. For example, some shells will accept !; cmd or ! !; cmd, whereas Bash requires a real newline and can't handle a semicolon in this case. Still other shells can't handle either type of null pipeline, even with a newline.
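As a small illustration of parsing that is tied to lines, consider a here document: the rest of the line containing the redirection is still parsed as ordinary command text, and the body only begins on the following line. The sketch below assumes tr is available; the variable name result is chosen purely for illustration.

```shell
#!/usr/bin/env bash
# The pipeline after <<EOF is on the same line as the redirection, so it is
# parsed as part of the command. The here-document body starts on the next
# line and ends at the line containing only the delimiter.
result=$(cat <<EOF | tr 'a-z' 'A-Z'
hello world
EOF
)

echo "$result"
```

The pipe into tr is applied to the output of cat even though it appears before the body in the source text, because the body is collected line-wise after the command line itself has been read.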

In other respects, Bash parses commands in chunks whose scope is roughly that of the current compound command. Most people notice this when they accidentally forget a closing fi, or the semicolon before the closing curly brace of a command group.
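The two mistakes mentioned above can be reproduced without running anything by asking a child Bash to parse a snippet with -n. This is only a sketch; the variable names are hypothetical, and the exact error message and non-zero status vary between Bash versions, so only zero versus non-zero is compared.

```shell
#!/usr/bin/env bash
# -n makes the child shell read and parse the commands without executing them.

# Missing ';' (or newline) before '}': the '}' is read as an argument to echo,
# so the group is never closed and the parser hits end of input.
bash -n -c '{ echo hi }' 2>/dev/null
missing_semicolon_status=$?

# With the ';' in place, '}' is recognised as the closing reserved word.
bash -n -c '{ echo hi; }' 2>/dev/null
with_semicolon_status=$?

# An 'if' with no closing 'fi' is likewise an incomplete compound command.
bash -n -c 'if true; then echo hi' 2>/dev/null
missing_fi_status=$?

echo "missing semicolon: $missing_semicolon_status"
echo "with semicolon:    $with_semicolon_status"
echo "missing fi:        $missing_fi_status"
```

In both failing cases Bash cannot finish parsing the compound command, so nothing inside it would run even without -n.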

Aside from syntax errors, most of the time you don't need to think about this part of parsing. It's the actual evaluation of the commands (and the intermediary parsing steps that happen at that time) that matters. Nevertheless, you may run across these considerations in some advanced cases, such as when writing portability wrappers around code that a particular shell implementation chokes on, or when Bash handles errors that are sensitive to newlines. Much of this behaviour is unspecified, some of it differs between Bash's POSIX mode and normal mode, and a few cases are likely bugs or just coincidental behaviour.

Command splitting

Command expansion and evaluation

The remaining steps are processed for each individual command.

After these steps, the next command or the next line is processed. Once the end of the input is reached (the end of the script, or the interactive Bash session is closed), Bash stops and returns the exit code of the last command it executed.
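The exit-code behaviour described above can be observed with two child shells that run the same commands in a different order. This is a minimal sketch; the variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Each child shell runs two commands and then reaches the end of its input.
# Its exit status is that of the last command it executed.

bash -c 'false; true'
ends_with_true=$?     # the earlier failure is not what gets reported

bash -c 'true; false'
ends_with_false=$?    # the failure of the last command is

echo "false; true -> $ends_with_true"
echo "true; false -> $ends_with_false"
```

An explicit exit with a status argument overrides this default.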

Graphical Example

For a simplified example of the process, see: http://stuff.lhunath.com/parser.png

Note that this graphic uses the term word-splitting (also WordSplitting), or field splitting, incorrectly and confuses it with argument splitting. Argument splitting is performed before expansions and is based upon whitespace (except in traditional Bourne shells). Field splitting occurs after expansions, just before pathname expansion, and is based upon the value of IFS.
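The difference between the two kinds of splitting can be seen by setting IFS to something other than whitespace. The following is a sketch only; count_args and the variable names are hypothetical helpers, not part of Bash.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report how many arguments it received.
count_args() { echo "$#"; }

IFS=:              # field splitting now splits on ':' only
var='a:b:c d'

# Argument splitting: the parser breaks the literal command line on
# whitespace, regardless of IFS, giving the words 'a:b:c' and 'd'.
literal_count=$(count_args a:b:c d)

# Field splitting: the result of the unquoted expansion is split on IFS,
# giving the fields 'a', 'b' and 'c d' (the space is not a separator here).
unquoted_count=$(count_args $var)
fields=$(printf '<%s>' $var)

# Quoting the expansion suppresses field splitting entirely.
quoted_count=$(count_args "$var")

echo "literal:  $literal_count"
echo "unquoted: $unquoted_count $fields"
echo "quoted:   $quoted_count"
```

The literal words are separated on whitespace no matter what IFS holds, while the expanded value is separated only on the colon.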

Common Mistakes

These steps might seem like common sense after looking at them closely, but they can often seem counter-intuitive in certain specific cases. As an example, here are a few cases where people often make mistakes because of how they expect Bash to interpret their command:

