Globs
"Glob" is the common name for a set of Bash features that match or expand specific types of patterns. Some synonyms for globbing (depending on the context in which it appears) are pattern matching, pattern expansion, filename expansion, and so on. A glob may look like *.txt and, when used to match filenames, is sometimes called a "wildcard".
Traditional shell globs use a very simple syntax, which is less expressive than a regular expression. Most characters in a glob are treated literally, but a * matches 0 or more characters, a ? matches precisely one character, and [...] matches any single character in a specified set (see Ranges below). All globs are implicitly anchored at both start and end. For example:
* | Matches any string, of any length
foo* | Matches any string beginning with foo
*x* | Matches any string containing an x (beginning, middle or end)
*.tar.gz | Matches any string ending with .tar.gz
*.[ch] | Matches any string ending with .c or .h
foo? | Matches foot or foo$ but not fools
Bash expands globs which appear unquoted in commands, by matching filenames relative to the current directory. The expansion of the glob results in 1 or more words (0 or more, if certain options are set), and those words (filenames) are used in the command. For example:
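A minimal sketch of this expansion, run in a scratch directory so the result is predictable. The file names and the use of mktemp are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so we know exactly which files exist.
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1
touch file1.tar file2.tar file42.tar notes.txt

# The shell expands *.tar before any command runs; a command only ever
# sees the resulting filenames, one word per file.
words=(*.tar)
echo "the glob expanded to ${#words[@]} words:"
printf '  %s\n' "${words[@]}"
```

A command such as tar would receive the same list of words in place of the glob.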
Even if a file contains internal whitespace, the expansion of a glob that matches that file will still preserve each filename as a single word. For example,
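The following sketch contrasts two loops over the same files: the first iterates over a glob, the second parses the output of ls. The file names are made up, and the second loop is shown only to demonstrate the breakage.

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1
touch 'my backup.tar' other.tar

# First example: iterate over the glob. Each filename stays a single word.
glob_count=0
for f in *.tar; do
    glob_count=$((glob_count + 1))
    printf 'glob gives: <%s>\n' "$f"
done

# Second example: do NOT do this. The unquoted $(...) is split on
# whitespace, so "my backup.tar" arrives as two separate words.
ls_count=0
for f in $(ls | grep '\.tar$'); do
    ls_count=$((ls_count + 1))
    printf 'ls gives:   <%s>\n' "$f"
done

echo "glob loop ran $glob_count times, ls loop ran $ls_count times"
```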
In the second example above, the output of ls is filtered, and then the result of the whole pipeline is divided into words, to serve as iterative values for the loop. This word-splitting will occur at internal whitespace within each filename, which makes it useless in the general case. The first example has no such problem, because the filenames produced by the glob do not undergo any further word-splitting. For more such examples, see BashPitfalls.
Globs are also used to match patterns in a few places in Bash. The most traditional is in the case command:
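A small sketch of case matching against globs. The function name classify and the sample arguments are hypothetical.

```shell
#!/usr/bin/env bash
# classify is a hypothetical helper used only for this illustration.
classify() {
    case $1 in
        *.tar|*.tar.gz|*.tgz) echo "tar archive" ;;
        *.[ch])               echo "C source or header" ;;
        *)                    echo "something else" ;;
    esac
}

classify backup.tar.gz
classify main.c
classify README
```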
Patterns (which are separated by | characters) are matched against the first word after the case itself. The first pattern which matches, "wins", causing the corresponding commands to be executed.
Bash also allows globs to appear on the right-hand side of a comparison inside a [[ command:
if [[ $output = *[Ee]rror* ]]; then ...
Finally, globs are used during parameter expansion to indicate patterns which may be stripped out, or replaced, during a substitution. Simple examples (there are many more on the previously referenced page):
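A sketch of such expansions, using a made-up path and array:

```shell
#!/usr/bin/env bash
path=/usr/local/share/doc/readme.txt   # example value, chosen for illustration

filename=${path##*/}   # strip the longest leading match of */
dirname=${path%/*}     # strip the shortest trailing match of /*
echo "$filename"
echo "$dirname"

arr=(ok error42 fine errors)           # example array
# Remove text matching error* from every element. The elements remain in
# the array, but the ones that matched become empty strings.
cleaned=("${arr[@]/error*/}")
printf '<%s>\n' "${cleaned[@]}"
```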
(Reference: Arrays Quotes printf.)
Ranges
Globs can specify a range or class of characters, using square brackets. This gives you the ability to match against a set of characters. For example:
[abcd] | Matches a or b or c or d
[a-d] | The same as above, if globasciiranges is set or your locale is C or POSIX. Otherwise, implementation-defined.
[!aeiouAEIOU] | Matches any character except a, e, i, o, u and their uppercase counterparts
[[:alnum:]] | Matches any alphanumeric character in the current locale (letter or number)
[[:space:]] | Matches any whitespace character
[![:space:]] | Matches any character that is not whitespace
[[:digit:]_.] | Matches any digit, or _ or .
In most shell implementations, one may also use ^ as the range negation character, e.g. [^[:space:]]. However, POSIX specifies ! for this role, and therefore ! is the standard choice.
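As an illustration, the sketch below uses a few of these bracket expressions in a case command. The function name describe and its messages are hypothetical.

```shell
#!/usr/bin/env bash
# describe is a hypothetical helper; it inspects only the first character.
describe() {
    case $1 in
        [[:digit:]]*)    echo "leading digit" ;;
        [[:space:]]*)    echo "leading whitespace" ;;
        [!aeiouAEIOU]*)  echo "leading non-vowel" ;;
        *)               echo "leading vowel, or empty" ;;
    esac
}

describe 42nd
describe ' x'
describe tree
describe apple
```

The order of the branches matters: a digit is also a "non-vowel", so the more specific patterns come first.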
globasciiranges (since bash 4.3-alpha)
Interprets [a-d] as [abcd]. To match a literal -, include it as first or last character.
For older versions
Note that "implementation-defined" means it may work as you expect on one machine, but give completely different results on another machine. Do not use the m-n syntax unless you have explicitly set your locale to C first, or you may get unexpected results. The POSIX character class expressions should be preferred whenever possible.
Options which change globbing behavior
extglob
In addition to the traditional globs (supported by all Bourne-family shells) that we've seen so far, Bash (and Korn Shell) offers extended globs, which have the expressive power of regular expressions. Korn shell enables these by default; in Bash, you must run the command
shopt -s extglob
in your shell (or at the start of your script -- see note on parsing below) to use them. The pattern matching reference describes the syntax, which is reproduced here:
- ?(pattern-list): Matches zero or one occurrence of the given patterns.
- *(pattern-list): Matches zero or more occurrences of the given patterns.
- +(pattern-list): Matches one or more occurrences of the given patterns.
- @(pattern-list): Matches one of the given patterns.
- !(pattern-list): Matches anything except one of the given patterns.
Patterns in a list are separated by | characters.
Extended globs allow you to solve a number of problems which otherwise require a rather surprising amount of ugly hacking; for example,
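A sketch of one such problem: selecting every file except those with certain extensions. The file names are invented, and echo stands in for a command like rm or cp so that the example is harmless.

```shell
#!/usr/bin/env bash
shopt -s extglob   # on its own line, before any extended glob is parsed

tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1
touch a.jpg b.gif c.png notes.txt todo.txt

# Everything except *.jpg, *.gif and *.png.
others=( !(*.jpg|*.gif|*.png) )
echo "${others[@]}"
```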
To use an extglob in a parameter expansion (this can also be done in one Bash statement with read):
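One common case is trimming leading and trailing whitespace from a variable. A sketch, with an invented value for x:

```shell
#!/usr/bin/env bash
shopt -s extglob

x=$'  \t padded value \n'    # example value
x=${x##+([[:space:]])}       # remove the longest run of leading whitespace
x=${x%%+([[:space:]])}       # remove the longest run of trailing whitespace
printf '<%s>\n' "$x"
```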
Extended glob patterns can be nested, too.
[[ $fruit = @(ba*(na)|a+(p)le) ]] && echo "Nice fruit"
extglob changes the way certain characters are parsed. It is necessary to have a newline (not just a semicolon) between shopt -s extglob and any subsequent commands to use it. You cannot enable extended globs inside a group command that uses them, because the entire block is parsed before the shopt is evaluated. Note that the typical function body is a group command. An unpleasant workaround could be to use a subshell command list as the function body.
Therefore, if you use this option in a script, it is best put right under the shebang line.
If your code must be sourced and needs extglob, ensure it preserves the original setting from your shell:
# remember whether extglob was originally set, so we know whether to unset it
shopt -q extglob; extglob_set=$?
# set extglob if it wasn't originally set.
((extglob_set)) && shopt -s extglob
# Note, 0 (true) from shopt -q is "false" in a math context.

# The basic concept behind the following is to delay parsing of the globs until evaluation.
# This matters at group commands, such as functions in { } blocks

declare -a s='( !(x) )'
echo "${s[@]}"

echo "${InvalidVar:-!(x)}"

eval 'echo !(x)' # using eval if no other option.

# unset extglob if it wasn't originally set
((extglob_set)) && shopt -u extglob
This should also apply for other shell options.
nullglob
nullglob expands non-matching globs to zero arguments, rather than to themselves.
Typically, nullglob is used to count the number of files matching a pattern:
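A sketch of that count, run in an empty scratch directory (the use of mktemp is an assumption for the demonstration):

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1   # an empty directory

shopt -s nullglob
files=(*)
shopt -u nullglob   # restore the default right away

echo "There are ${#files[@]} files in this directory."
```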
Without nullglob, the glob would expand to a literal * in an empty directory, resulting in an erroneous count of 1.
Warning
Enabling nullglob on a wide scope can trigger bugs caused by bad programming practices. It "breaks" the expectations of many utilities.
Removing array elements:
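A sketch of how an unquoted subscript can silently vanish under nullglob. The word array[1] is itself a glob (it would match a file named array1), so when nothing matches, unset receives no arguments at all. The array contents are invented.

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1   # contains no file named "array1"
shopt -s nullglob

array=(zero one two)
unset array[1]          # unquoted: the glob matches nothing, so unset gets no arguments
after_unquoted=${#array[@]}

unset -v 'array[1]'     # quoted: unset receives the literal word array[1]
after_quoted=${#array[@]}
shopt -u nullglob

echo "elements after unquoted unset: $after_unquoted"
echo "elements after quoted unset:   $after_quoted"
```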
Array member assignments in compound form using subscripts:
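A sketch of the assignment in question. What ends up in the array depends on the bash version, so the example only prints the result rather than claiming a particular outcome.

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1   # an empty directory
shopt -s nullglob

array=([1]=*)
declare -p array    # inspect what was actually stored
shopt -u nullglob
```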
This was reported as a bug in 2012. The behavior changed in bash 4.3: the assignment now results in an array that contains only array[1]='*'.
Apart from a few builtins that use modified parsing under special conditions (e.g. declare), always use Quotes when arguments to simple commands could be interpreted as globs.
Enabling failglob, nullglob, or both during development and testing can help catch mistakes early. To prevent pathname expansion occurring in unintended places, you can set failglob. However, you must then guarantee all intended globs match at least one file. Also note that the result of a glob expansion does not always differ from the glob itself. failglob won't distinguish echo ? from echo '?' in a directory containing only a file named ?. nullglob will.
Portability
"null globbing" is not specified by POSIX. In portable scripts, you must explicitly check that a glob match was successful by checking that the files actually exist.
Some modern POSIX-compatible shells allow null globbing as an extension.
In ksh93, there is no toggle-able option. Instead, the "nullglob" behavior is requested inline, using the "N" option to the ~() sub-pattern syntax.
In zsh, a toggle-able option (NULL_GLOB) or a glob qualifier (N) can be used.
mksh doesn't yet support nullglob (maintainer says he'll think about it).
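The explicit existence check mentioned at the start of this section can be sketched as follows. The function name count_matches is hypothetical, and mktemp is assumed to be available; the loop itself uses only POSIX sh features.

```shell
#!/bin/sh
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1

# count_matches is a hypothetical helper: it counts files matching *.txt.
count_matches() {
    n=0
    for f in *.txt; do
        # With no match, the loop runs once with the literal string "*.txt",
        # which does not exist as a file, so it is skipped here.
        [ -e "$f" ] || continue
        n=$((n + 1))
    done
    echo "$n"
}

empty_count=$(count_matches)
touch a.txt b.txt
full_count=$(count_matches)
echo "before: $empty_count, after: $full_count"
```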
dotglob
By convention, a filename beginning with a dot is "hidden", and not shown by ls. Globbing uses the same convention -- filenames beginning with a dot are not matched by a glob, unless the glob also begins with a dot. Bash has a dotglob option that lets globs match "dot files":
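A sketch comparing the same glob with and without dotglob, using two invented file names:

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1
touch .hidden visible

without=(*)         # default behaviour: the dot file is skipped
shopt -s dotglob
with_dot=(*)        # now the dot file is matched as well
shopt -u dotglob

echo "without dotglob: ${without[*]}"
echo "with dotglob:    ${with_dot[*]}"
```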
It should be noted that when dotglob is enabled, * will match files like .bashrc but not the . or .. directories. This is orthogonal to the problem of matching "just the dot files" -- a glob of .* will match . and .., typically causing problems.
globstar (since bash 4.0-alpha)
globstar recursively repeats a pattern containing '**'.
Matching files:
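A sketch on a small invented directory tree. It assumes bash 4.0 or later, since older versions have no globstar option.

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1
mkdir -p directory1/directory3 directory2
touch directory1/directory3/file1.c directory1/file2 directory2/file3.c top.c

shopt -s globstar
all=(**)            # every file and directory below the current one
c_files=(**/*.c)    # every *.c file, at any depth (including this directory)
shopt -u globstar

printf 'all:     %s\n' "${all[@]}"
printf 'c files: %s\n' "${c_files[@]}"
```

Note that ** is only special when it forms a whole path component: a pattern such as **.c behaves like *.c and does not recurse.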
Just like '*', '**' followed by a '/' will only match directories:
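A sketch of the directory-only form, again on an invented tree and assuming bash 4.0 or later:

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1
mkdir -p directory1/directory3 directory2
touch directory1/file2 directory2/file3.c top.c

shopt -s globstar
dirs=(**/)          # the trailing slash restricts matches to directories
shopt -u globstar

printf '%s\n' "${dirs[@]}"
```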
failglob
If a pattern fails to match, bash reports an expansion error. This can be useful at the command line:
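A sketch of the effect, written as a script rather than an interactive session. The subshell is an assumption of this example: it keeps both the option and the expansion error contained.

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1   # empty, so *.foo cannot match

# With failglob set, the failed expansion is an error and touch never runs,
# so no file literally named "*.foo" is created.
if ( shopt -s failglob; touch *.foo ) 2>/dev/null; then
    result="command ran"
else
    result="expansion failed"
fi
echo "$result"
```

Without failglob, the same touch command would have created a file whose name is the literal pattern.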
GLOBIGNORE
The Bash variable (not shopt) GLOBIGNORE allows you to specify patterns a glob should not match. This lets you work around the infamous "I want to match all of my dot files, but not . or .." problem:
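A sketch with two invented dot files:

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1
touch .bashrc .vimrc

GLOBIGNORE=.:..     # colon-separated list of patterns to drop from glob results
dotfiles=(.*)
unset GLOBIGNORE    # go back to the default behaviour

printf '%s\n' "${dotfiles[@]}"
```

Be aware that setting GLOBIGNORE to a non-null value also has the effect of enabling dotglob, and unsetting it disables dotglob again.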
Unset GLOBIGNORE
nocasematch
With nocasematch set, globs inside [[ and case commands are matched case-insensitively:
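A sketch with an invented file name, showing both commands and the effect of turning the option off again:

```shell
#!/usr/bin/env bash
f=HOLIDAY.JPG       # example value

shopt -s nocasematch
if [[ $f = *.jpg ]]; then bracket_match=yes; else bracket_match=no; fi
case $f in
    *.txt|*.jpg) case_match=yes ;;
    *)           case_match=no ;;
esac
shopt -u nocasematch

# With the option off, the same test is case-sensitive again.
if [[ $f = *.jpg ]]; then default_match=yes; else default_match=no; fi

echo "[[ with nocasematch: $bracket_match"
echo "case with nocasematch: $case_match"
echo "[[ without nocasematch: $default_match"
```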
Without nocasematch, the same match is conventionally written with explicit bracket expressions:
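A sketch of the conventional form as a case command inside a loop. The file names and the kept array are invented for the demonstration.

```shell
#!/usr/bin/env bash
kept=()
for f in report.TXT photo.Jpg archive.zip; do
    case $f in
        *.[Tt][Xx][Tt]|*.[Jj][Pp][Gg]) : ;;   # wanted: carry on below
        *) continue ;;                        # anything else is skipped
    esac
    kept+=("$f")
done
printf '%s\n' "${kept[@]}"
```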
and in earlier versions of bash we'd use a similar glob:
[[ $f = *.@([Tt][Xx][Tt]|[Jj][Pp][Gg]) ]] || continue
or with no extglob:
[[ $f = *.[Tt][Xx][Tt] ]] || [[ $f = *.[Jj][Pp][Gg] ]] || continue
Here, one might keep the tests separate for maintenance; they can be easily reused and dropped, without having to concern oneself with where they fit in relation to an internal ||.
Note also:
[[ $f = *.@([Tt][Xx][Tt]|[Jj][Pp]?([Ee])[Gg]) ]]
Variants left as an exercise.
nocaseglob (since bash 2.02-alpha1)
This option makes pathname expansion case-insensitive. In contrast, nocasematch operates on matches in [[ and case commands.