In many languages, quotes are mainly used to specify that the enclosed text is to be interpreted as a string datatype, but in shell programming, almost everything is a string, so quoting in the shell has many different effects and purposes. There are multiple types of quotes which primarily enable different ways of interpreting their contents. Unfortunately, the rules are often hard for beginners to learn and their behavior varies by context with several special cases and exceptions to remember. Also unfortunately, quoting in shell programming is extremely important. It's something no one can avoid learning. Improper shell quoting is one of the most common sources of scripting bugs and security issues. Fortunately, it's possible to get it right most of the time by following a few guidelines, but do not guess about quoting. When in doubt, test, and read the man page for how quotes are to be interpreted in a given context.
Types of quoting
Single quotes: (apostrophes) prevent all special interpretation of the characters between them. Everything between single quotes is literal and expands to a single word. The only thing which cannot be single-quoted is a single quote itself.
Double quotes: are the most important and complex type of quoting in shell programming. Double quotes permit the shell to parse their contents for those types of expansions that are prefixed with a $ sign - parameter expansions, arithmetic expansions and command substitutions, while treating all other content as literal (except for the legacy backticks (`) described below or history expansion (!)). Most importantly, word splitting and filename expansion are not applied to anything within double quotes, but they also protect their contents from brace expansion, tilde expansion, process substitution, and interpretation as redirection syntax. Like single quotes, double quotes also allow for the inclusion of metacharacters and control operators without backslash escaping. Double quotes also have important effects on parameter expansions, where they can influence the number of words when expanding composite variables (arrays and positional parameters), and by not applying word splitting and pathname expansion to the results.
$'...' and $"...": Bash introduces two additional forms of quoting: single and double quotes preceded by a $ sign. Of these, $'...' is the most common, and acts just like single quotes except that backslash-escaped combinations are expanded as specified by the ANSI C standard. This allows a convenient way to embed nonprintable characters into strings or pass them as arguments. The second form is $"..." and is used for localization support. Neither of these is specified or mentioned by POSIX, but at least the $'...' variety is widely supported by Bash, Ksh93, and relatives.
Escaping: The escape character in Bash and POSIX sh is the backslash. In addition to some of the usual uses for escape sequences in other languages, an unquoted escape in an ordinary argument evaluation context in Bash serves mostly the same function as single-quoting. A backslash placed before a glob character, metacharacter, redirection operator, or another character with special meaning makes that character literal. Escapes have several other functions in other contexts, such as within pattern matching contexts and arguments corresponding to printf's %b format, including some undocumented functions, like changing the special-builtin lookup order in non-POSIX mode. Most commonly, escapes used as a form of quoting are a shortcut alternative to single-quoting a single character. For example, you might write \* instead of '*', or combine multiple words into a single word using foo\ bar instead of 'foo bar' (or more accurately, foo' 'bar).
Note: Although backticks (`) are a type of quotes linguistically, they don't actually "quote" anything in bash. Where quoting in bash is used to make data (partly) literal, backticks do something entirely different. They're the old Bourne syntax for command substitution. They mostly do the same thing as $(...), but there are a handful of weird exceptions. Either way, it doesn't matter that much: Their use is deprecated in favor of $(...), and you really shouldn't use these things anymore in modern or maintained bash code. Take away from this that if you see them, they do NOT void the need for quoting (just like "$(command)", you should double-quote them: "`command`"), and you should probably just replace them with $().
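As a rough illustration of the first three forms, the sketch below stores the same text using single quotes, double quotes, and backslash escapes. The variable names (name, single, double, escaped) are made up for this example.

```bash
# Hypothetical variable, used only for illustration.
name='World'

single='Hello, $name'     # single quotes: $name stays literal
double="Hello, $name"     # double quotes: $name is expanded
escaped=Hello,\ \$name    # backslashes make the space and the $ literal

printf '%s\n' "$single" "$double" "$escaped"
```

Only the double-quoted form expands $name; the other two keep it as literal text.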
Effects of Quoting
Preserve unescaped metacharacters
A shell command is parsed by the shell into words, using whitespace (regardless of the IFS variable) and other shell metacharacters. The first function of quoting is to permit words to contain these metacharacters.
echo '&'
Without quotes, the & would put the echo command into the background. With quotes, the & is simply made into a word, and passed as an argument to the echo command instead.
The quotes are not actually passed along to the command. They are removed by the shell (this process is cleverly called "quote removal"). In the example above, the echo command sees only the &, not the quotes.
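One way to see quote removal in action is to print each argument a command receives between visible delimiters. The helper show_args below is a hypothetical function written for this sketch, not a standard utility.

```bash
# Hypothetical helper: print every argument wrapped in <...>.
show_args() {
    printf '<%s>' "$@"
    printf '\n'
}

# Three differently quoted words; the quotes and backslash are removed
# by the shell before show_args ever sees the arguments.
show_args '&' "a b" c\ d
```

The function receives exactly three arguments, and none of them contains a quote character or a backslash.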
Prevent field splitting and ignore glob pattern characters
The second purpose of quoting is to prevent word splitting and globbing. The result of a double-quoted substitution does not undergo any further processing. (The result of an unquoted substitution does.)
cp -- "$filename" "$destination"
In this example, the double quotes protect the value of each parameter (variable) from undergoing word splitting or globbing should it happen to contain whitespace or wildcard characters (* or ? or [...]). Without the quotes, a filename like hot stuff.mp3 would be split into two words, and each word would be passed to the cp command as a separate argument. Or, a filename that contains * with whitespace around it would produce one word for every file in the current directory. That is not what we want.
With the quotes, every character in the value of the filename parameter is treated literally, and the whole value becomes the second argument to the cp command.
When in doubt, always double-quote your parameter expansions.
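The effect of word splitting can be checked by counting how many arguments a command receives. This is a minimal sketch that assumes the default IFS; count_args is a hypothetical helper defined only for the demonstration.

```bash
# Hypothetical helper: report how many arguments were received.
count_args() {
    echo "$#"
}

filename='hot stuff.mp3'

unquoted=$(count_args $filename)     # split at the space into two words
quoted=$(count_args "$filename")     # kept together as one word

echo "unquoted: $unquoted, quoted: $quoted"
```

With the quotes, the value reaches the command as a single argument no matter what whitespace it contains.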
You may concatenate the various types of quoting if you need to. For example, if you have one section of a string that has lots of special characters that you'd like to single-quote, and another section with a parameter expansion in it which must be double-quoted, you may mix them:
echo '!%$*&'"$foo"
Any number of quoted substrings, of any style, may be concatenated in this manner. The result (after appropriate expansions in the double-quoted sections) is a single word.
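To check that concatenated quoted pieces really form one word, the sketch below builds a string from a single-quoted part and a double-quoted expansion, then counts the resulting arguments. The value of foo and the helper count_args are assumptions made for this example.

```bash
# Hypothetical helper: report how many arguments were received.
count_args() {
    echo "$#"
}

foo='bar baz'

# A single-quoted piece glued directly to a double-quoted expansion.
word='!%$*&'"$foo"
n=$(count_args '!%$*&'"$foo")

printf '%s\n' "$word" "$n"
```

Even though the expanded value of foo contains a space, the whole concatenation stays a single argument.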
Expand argument lists
Double-quoting $@ or ${array[@]} has a special meaning. "$@" expands to a list of words, with each positional parameter's value being one word. Likewise, "${array[@]}" expands to a list of words, one per array element. When dealing with the positional parameters or with the contents of an array as a list of words, always use the double-quoted syntax.
Double-quoting $* or ${array[*]} results in one word which is the concatenation of all the positional parameters (or array elements) with the first character of IFS between them. This is similar to the join function in some other languages, although the fact that you can only have a single join character can sometimes be a crippling limitation.
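The difference between the two forms can be sketched by passing the same positional parameters through each expansion. The functions count_args and demo, and the sample arguments, are hypothetical and exist only for this illustration.

```bash
# Hypothetical helper: report how many arguments were received.
count_args() {
    echo "$#"
}

demo() {
    at_count=$(count_args "$@")     # one word per positional parameter
    star_count=$(count_args "$*")   # everything joined into one word
    joined=$(IFS=,; echo "$*")      # joined on the first character of IFS
}

demo 'one two' three
printf '%s\n' "$at_count" "$star_count" "$joined"
```

Changing IFS inside the command substitution keeps the change local to that subshell, so the rest of the script is unaffected.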
Examples
Here are some assorted examples, to show how things should be done. Some of these examples use bash/ksh syntax that won't work in strict POSIX shells.
Proper iteration over the positional parameters using a quoted "$@". Never use an unquoted $@ or $*.
for file in "$@"; do
    ...
done
As above, except an array.
for element in "${array[@]}"; do ...
Proper iteration over indexes
# bash 3.0 and higher
for index in "${!array[@]}"; do ...
All of the usual expansions apply to text within the parentheses of a compound array assignment including word splitting and pathname expansion, and must be quoted and escaped in the same way as though they were to be passed as arguments to a command.
# bash or ksh93
find_opts=('(' -iname '*.jpg' -o -iname '*.gif' -o -iname '*.png' ')')
find . "${find_opts[@]}" -print
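A smaller sketch of the same rule: each element in a compound assignment is quoted exactly as a command argument would be. The array name words and its contents are made up for this example, and the syntax requires bash or ksh93.

```bash
# Four elements, each quoted a different way (bash/ksh93 array syntax).
words=(one 'two three' four\ five "six seven")

echo "${#words[@]}"             # number of elements
printf '<%s>\n' "${words[@]}"   # one line per element
```

Without the quotes or the backslash, the elements containing spaces would each be split into two.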
Two ways to embed a single quote in a string. The first is a single-quoted string followed by an unquoted, escaped single quote, followed by another single-quoted string. The second is a non-POSIX equivalent using $'...'.
echo 'Don'\''t walk!'
echo $'Don\'t talk!'
$(...)-style command substitutions are unique in that the quoting of their contents is completely independent of their surroundings. This means you don't have to worry about nested quote escaping problems.
echo "The matching line is: $(grep foo "$filename")"

# Note that the quotes inside the $() command substitution are nested.
# This looks wrong to a C programmer, but it is correct in shells.

echo "The matching line is: $(grep foo "$filename")"
#                                      ^---------^    inner layer (quotes)
#                           ^^--------------------^   middle layer (command sub)
#    ^---------------------------------------------^  outer layer (quotes)

# Without the inner quotes, the value of $filename would be word-split and globbed
# before being handed to grep.
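Nesting can go more than one level deep without any escaping. The sketch below, which uses a made-up path and assumes the standard dirname and basename utilities are available, quotes every layer independently.

```bash
# Hypothetical path containing a space.
path='top/my dir/file.txt'

# Each $(...) starts a fresh quoting context, so the inner double quotes
# do not terminate the outer ones.
parent="$(basename "$(dirname "$path")")"

echo "$parent"
```

Dropping either pair of inner quotes would let the space in the directory name split the argument.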
# bash
ip=192.168.1.30
netmask=255.255.254.0

IFS=. read -ra ip_octets <<<"$ip"
IFS=. read -ra netmask_octets <<<"$netmask"

for i in 0 1 2 3; do
    ((ip_octets[i] &= netmask_octets[i]))
done

IFS=.; network="${ip_octets[*]}"; unset IFS
The $'...' quoting style is used to interpret backslash escapes:
IFS=$' \t\n'
# sets the IFS variable to the three-byte string containing
# a space, a tab, and a newline
# The following are all equivalent:
echo $'hi\nthere\n'
printf '%s\n' hi there ''   # the empty argument supplies the final blank line
# discouraged:
echo -e 'hi\nthere\n'
# ksh93:
print 'hi\nthere\n'
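A quick way to confirm that $'...' produces the actual control character, while plain single quotes keep the backslash sequence, is to compare string lengths. The variable names tab and literal are made up for this sketch, which requires bash or another shell that supports $'...'.

```bash
tab=$'\t'       # a single tab character
literal='\t'    # a backslash followed by the letter t

echo "${#tab}"      # length of the real tab
echo "${#literal}"  # length of the two-character sequence
```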
Exceptions
Patterns
One important situation where quotes can have an unintended effect is in the case of pattern matching contexts. Bash uses quotes to suppress the "specialness" of patterns and regular expressions. Characters interpreted specially in a pattern matching context are treated literally when quoted whether the pattern is expanded from a parameter or not.
if [[ $path = foo* ]]; then
# unquoted foo* acts as a glob

if [[ $path = "foo*" ]]; then
# quoted "foo*" is a literal string

if [[ $path =~ $some_re ]]; then
# the contents of $some_re are treated as an ERE

if [[ $path =~ "$some_re" ]]; then
# the contents of $some_re are treated as a literal string
# despite the =~ operator

# Bash 4
# the "quoted" branch is taken.
g=a*[b]
case $g in
    "$g") echo 'quoted pattern' ;;&
    $g) echo 'unquoted pattern' ;;
esac
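The four [[ tests can be turned into a runnable check. This is a sketch under assumptions: the sample value of path and the regular expression in re are invented for the example, and the literal treatment of a quoted regex assumes bash 3.2 or later without the compat31 option.

```bash
path='foobar'
re='^fo+'

# Unquoted pattern: foo* is a glob. Quoted: foo* must match literally.
[[ $path = foo* ]]    && glob_result=match      || glob_result=nomatch
[[ $path = "foo*" ]]  && literal_result=match   || literal_result=nomatch

# Unquoted regex: treated as an ERE. Quoted: treated as a plain string.
[[ $path =~ $re ]]    && re_result=match        || re_result=nomatch
[[ $path =~ "$re" ]]  && quoted_re_result=match || quoted_re_result=nomatch

printf '%s\n' "$glob_result" "$literal_result" "$re_result" "$quoted_re_result"
```

The left-hand side of these tests never needs quotes inside [[, but the right-hand side changes meaning depending on whether it is quoted.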
Here Documents
Quote removal never applies to the contents of a here document:
$ arr=(words in array); cat <<EOF
> These are the "${arr[@]}"
> EOF
These are the "words in array"
The quoting of the "delimiter" in the heredoc redirection affects whether the contents of the heredoc are parsed for expansions. If any part of the delimiter is quoted, the contents are taken literally; otherwise they undergo parameter expansion, command substitution, and arithmetic expansion. The quoted variety is somewhat special because it is the only context in all of bash in which a literal string of arbitrary content may be specified and is subject to no special interpretation whatsoever, except for a line consisting of the delimiter itself, which marks the end of the heredoc.
{
    echo before
    eval "$(</dev/stdin)"   # The heredoc is executed as though it were here.
    echo after
} <<"EOF"
# Put any bash code at all here
EOF
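The effect of quoting the delimiter can be seen by feeding the same text through both heredoc varieties. The variable names name, expanded, and literal are made up for this sketch.

```bash
# Hypothetical variable, used only for illustration.
name='World'

# Unquoted delimiter: the body is parsed for expansions.
expanded=$(cat <<EOF
Hello, $name
EOF
)

# Quoted delimiter: the body is taken literally.
literal=$(cat <<'EOF'
Hello, $name
EOF
)

printf '%s\n' "$expanded" "$literal"
```

Quoting any part of the delimiter, whether with single quotes, double quotes, or a backslash, selects the literal behavior.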