"Glob" is the common name for a set of Bash features that match or expand specific types of patterns. Some synonyms for globbing (depending on the context in which it appears) are pattern matching, pattern expansion, filename expansion, and so on. A glob may look like *.txt and, when used to match filenames, is sometimes called a "wildcard".

Traditional shell globs use a very simple syntax, which is less expressive than a RegularExpression. Most characters in a glob are treated literally, but a * matches 0 or more characters, a ? matches precisely one character, and [...] matches any single character in a specified set (see Ranges below). All globs are implicitly anchored at both start and end. For example:

||`*` ||Matches any string, of any length ||
||`foo*` ||Matches any string beginning with `foo` ||
||`*x*` ||Matches any string containing an `x` (beginning, middle or end) ||
||`*.tar.gz` ||Matches any string ending with `.tar.gz` ||
||`*.[ch]` ||Matches any string ending with `.c` or `.h` ||
||`foo?` ||Matches `foot` or `foo$` but not `fools` ||

Bash expands globs which appear unquoted in commands, by matching filenames relative to the current directory. The expansion of the glob results in 1 or more words (0 or more, if certain options are set), and those words (filenames) are used in the command. For example:

    tar xvf *.tar
    # Expands to: tar xvf file1.tar file2.tar file42.tar ...
    # (which is generally not what one wants)

Even if a filename contains internal whitespace, the expansion of a glob that matches that file will still preserve each filename as a single word. For example,

    # This is safe even if a filename contains whitespace:
    for f in *.tar; do
        tar tvf "$f"
    done

    # But this one is not:
    for f in $(ls | grep '\.tar$'); do
        tar tvf "$f"
    done

In the second example above, the output of ls is filtered, and then the result of the whole pipeline is divided into words, to serve as iterative values for the loop. This word-splitting will occur at internal whitespace within each filename, which makes it useless in the general case. The first example has no such problem, because the filenames produced by the glob do not undergo any further word-splitting. For more such examples, see BashPitfalls.
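The difference can be observed directly by counting loop iterations. The following is only a sketch: the scratch directory and the two file names are made up for the demonstration, and it assumes `mktemp -d` is available.

```bash
# Scratch directory with one filename that contains a space (names are hypothetical).
demo_dir=$(mktemp -d)
touch "$demo_dir/my backup.tar" "$demo_dir/plain.tar"

# Glob: every matching filename arrives as exactly one word.
glob_count=0
for f in "$demo_dir"/*.tar; do
    glob_count=$((glob_count + 1))
done

# ls + word splitting: "my backup.tar" is split into two words.
split_count=0
for f in $(ls "$demo_dir" | grep '\.tar$'); do
    split_count=$((split_count + 1))
done

rm -rf "$demo_dir"
echo "glob iterations: $glob_count, ls iterations: $split_count"
# prints: glob iterations: 2, ls iterations: 3
```

Two files produce three iterations in the second loop, and none of the three words is guaranteed to name a real file.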

Globs are also used to match patterns in a few places in Bash. The most traditional is in the case command:

    case "$input" in
        [Yy]|'') confirm=1;;
        [Nn]*) confirm=0;;
        *) echo "I don't understand.  Please try again.";;
    esac

Patterns (which are separated by | characters) are matched against the first word after the case itself. The first pattern that matches "wins", causing the corresponding commands to be executed.

Bash also allows globs to appear on the right-hand side of a comparison inside a [[ command:

    if [[ $output = *[Ee]rror* ]]; then ...
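As a small illustration of both forms, here is a sketch with two helper functions. The names `classify_answer` and `has_error` are invented for this example. Note how the implicit anchoring shows up: `[Yy]` matches exactly one character, so the word `yes` does not match it.

```bash
# Hypothetical helper: classify a user's reply with case patterns.
classify_answer() {
    case $1 in
        [Yy]|'') echo yes ;;      # a single Y or y, or an empty reply
        [Nn]*)   echo no ;;       # anything beginning with N or n
        *)       echo unknown ;;  # everything else
    esac
}

# Hypothetical helper: succeed if the argument contains "Error" or "error".
has_error() {
    [[ $1 = *[Ee]rror* ]]
}

classify_answer Y      # prints: yes
classify_answer nope   # prints: no
classify_answer yes    # prints: unknown (the glob is anchored at both ends)
has_error "Fatal error: disk full" && echo "found an error"
```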

Finally, globs are used during parameter expansion to indicate patterns which may be stripped out, or replaced, during a substitution. Simple examples (there are many more on the previously referenced page):

    filename=${path##*/}    # strip leading pattern that matches */ (be greedy)
    dirname=${path%/*}      # strip trailing pattern matching /* (non-greedy)

    printf '%s\n' "${arr[@]}"          # dump an array, one element per line
    printf '%s\n' "${arr[@]/error*/}"  # dump array, removing error* if matched

(Reference: Arrays Quotes printf.)
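To make the greedy and non-greedy forms concrete, here is a worked sketch. The sample path and array contents are invented for the demonstration.

```bash
path=/home/user/archive.tar.gz        # sample value, chosen for illustration

filename=${path##*/}                  # archive.tar.gz  (longest leading match of */)
dirname=${path%/*}                    # /home/user      (shortest trailing match of /*)
extension=${filename##*.}             # gz              (longest leading match of *.)
stem_short=${filename%.*}             # archive.tar     (shortest trailing match of .*)
stem_long=${filename%%.*}             # archive         (longest trailing match of .*)

arr=(ok "error: disk full" done)
cleaned=("${arr[@]/error*/}")         # the matching element becomes an empty string

printf '%s\n' "$filename" "$dirname" "$extension" "$stem_short" "$stem_long"
echo "cleaned has ${#cleaned[@]} elements"
```

The array substitution does not drop the matching element. It replaces the matched text with nothing, so an empty element stays in its place.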

Ranges

Globs can specify a range or class of characters, using square brackets. This gives you the ability to match against a set of characters. For example:

||`[abcd]` ||Matches `a` or `b` or `c` or `d` ||
||`[a-d]` ||The same as above, if your locale is C or POSIX. Otherwise, implementation-defined. ||
||`[!aeiouAEIOU]` ||Matches any character except `a`, `e`, `i`, `o`, `u` and their uppercase counterparts ||
||`[[:alnum:]]` ||Matches any alphanumeric character in the current locale (letter or number) ||
||`[[:space:]]` ||Matches any whitespace character ||
||`[![:space:]]` ||Matches any character that is not whitespace ||
||`[[:digit:]_.]` ||Matches any digit, or `_` or `.` ||

Implementation-defined means it may work as you expect on one machine, but give completely different results on another machine. Do not use the m-n syntax unless you have explicitly set your locale to C first, or you may get unexpected results. The POSIX character class expressions should be preferred whenever possible.
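A sketch of the character-class approach, using a made-up helper called `classify_char`. It relies only on POSIX classes, so it does not depend on how the locale orders `a-z`.

```bash
# Hypothetical helper: report which class a single character falls into.
classify_char() {
    case $1 in
        [[:digit:]_.]) echo digit-or-punct ;;  # any digit, or _ or .
        [[:upper:]])   echo upper ;;
        [[:lower:]])   echo lower ;;
        [[:space:]])   echo space ;;
        *)             echo other ;;           # includes strings longer than one character
    esac
}

classify_char 7     # prints: digit-or-punct
classify_char _     # prints: digit-or-punct
classify_char Q     # prints: upper
classify_char ' '   # prints: space
classify_char ab    # prints: other
```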

Options which change globbing behavior

extglob

In addition to the traditional globs (supported by all Bourne-family shells) that we've seen so far, Bash (and Korn Shell) offers extended globs, which have the expressive power of regular expressions. Korn shell enables these by default; in Bash, you must run the command

    shopt -s extglob

in your shell (or at the start of your script -- see note on parsing below) to use them. The pattern matching reference describes the syntax, which is reproduced here:

?(pattern-list)
Matches zero or one occurrence of the given patterns.
*(pattern-list)
Matches zero or more occurrences of the given patterns.
+(pattern-list)
Matches one or more occurrences of the given patterns.
@(pattern-list)
Matches one of the given patterns.
!(pattern-list)
Matches anything except one of the given patterns.

Patterns in a list are separated by | characters.

Extended globs allow you to solve a number of problems which otherwise require a rather surprising amount of ugly hacking; for example,

    # To remove all the files except ones matching *.jpg:
    rm !(*.jpg)
    # All except *.jpg and *.gif and *.png:
    rm !(*.jpg|*.gif|*.png)

    # To copy all the MP3 songs except one to your device
    cp !(04*).mp3 /mnt

To use an extglob in a parameter expansion (this can also be done in one Bash statement with read):

    # To trim leading and trailing whitespace from a variable
    x=${x##+([[:space:]])}; x=${x%%+([[:space:]])}

Extended glob patterns can be nested, too.

    [[ $fruit = @(ba*(na)|a+(p)le) ]] && echo 'Nice fruit'
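The trimming idiom and the nested pattern above can be wrapped in small functions to see what they accept. This is a sketch only: `trim` and `is_nice_fruit` are names invented here, and `shopt -s extglob` sits on its own line before the functions are defined.

```bash
shopt -s extglob

# Hypothetical helper: strip leading and trailing whitespace.
trim() {
    local s=$1
    s=${s##+([[:space:]])}   # remove the longest leading run of whitespace
    s=${s%%+([[:space:]])}   # remove the longest trailing run of whitespace
    printf '%s' "$s"
}

# Hypothetical helper: "ba" followed by any number of "na", or "a", one or more "p", "le".
is_nice_fruit() {
    [[ $1 = @(ba*(na)|a+(p)le) ]]
}

printf '[%s]\n' "$(trim '   hello world  ')"   # prints: [hello world]
for fruit in banana ba apple aple ale orange; do
    if is_nice_fruit "$fruit"; then echo "$fruit: nice"; else echo "$fruit: no match"; fi
done
```

Here `banana`, `ba`, `apple` and `aple` match, while `ale` (no `p`) and `orange` do not.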

Because extglob and other glob options change the way certain characters are parsed, it is necessary to have a newline (not just a semicolon) between the shopt command and any subsequent commands to use the new globbing. As an example, you cannot put shopt -s extglob inside a group command that uses extended globs, because the entire block is parsed before the shopt is evaluated. Note that the typical function body is a group command. An unpleasant workaround could be to use a subshell command list as the function body.

Therefore, if you use this option in a script, it's best to put it right under the shebang line, or as close as you can get it while still making your boss happy.

    #!/usr/bin/env bash
    # Copyright (c) 2012 Foo Corporation
    shopt -s extglob   # and others, such as nullglob dotglob

If your code isn't a script, but is instead being sourced, and must set extglob itself:

    # --- file (sourced by myscript) ---
    shopt -q extglob || shopt -s extglob
    # enable extglob if not already set

    # The basic concept behind the following is to delay parsing of the globs until evaluation.
    # This matters at group commands, such as functions in { } blocks

    declare -a s='( !(x) )'
    echo "${s[@]}"

    echo "${InvalidVar:-!(x)}"

    eval 'echo !(x)'  # using eval if no other option.

    shopt -q extglob && shopt -u extglob
    # disable extglob if set

    # --- myscript ---
    shopt -u extglob
    # extglob was incidentally disabled
    . file

nullglob

If a glob fails to match any filenames, the shell normally leaves it alone. This means the raw glob will be passed on to the command, as in:

    $ ls *.ttx
    ls: cannot access *.ttx: No such file or directory

This allows the command to see the glob you used, and to use it in an error message. If the Bash option nullglob is set, however, a glob which matches no files will be removed entirely. This is useful in scripts, but somewhat confusing at the command line, since it "breaks" the expectations of many of the standard tools (see failglob below for a better alternative):

    # Good in scripts!
    shopt -s nullglob
    oggs=(*.ogg)
    for ogg in "${oggs[@]}"; do ...

    # Bad at the command line!
    shopt -s nullglob
    ls *.ttx
    # Runs "ls" with no arguments, and lists EVERYTHING
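The effect on an array is easy to measure. In this sketch, `count_oggs` is a made-up helper and the empty directory comes from `mktemp -d`, which is assumed to be available.

```bash
# Hypothetical helper: how many words does "$1"/*.ogg expand to?
count_oggs() {
    local files
    files=("$1"/*.ogg)
    echo "${#files[@]}"
}

empty_dir=$(mktemp -d)

shopt -u nullglob
without_nullglob=$(count_oggs "$empty_dir")   # 1: the unmatched pattern itself

shopt -s nullglob
with_nullglob=$(count_oggs "$empty_dir")      # 0: the unmatched pattern vanishes

shopt -u nullglob
rmdir "$empty_dir"
echo "without nullglob: $without_nullglob, with nullglob: $with_nullglob"
# prints: without nullglob: 1, with nullglob: 0
```

Without nullglob, a loop over the array would run once with the literal pattern as its value, which is rarely what a script wants.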

dotglob

By convention, a filename beginning with a dot is "hidden", and not shown by ls. Globbing uses the same convention -- filenames beginning with a dot are not matched by a glob, unless the glob also begins with a dot. Bash has a dotglob option that lets globs match "dot files":

    shopt -s dotglob nullglob
    files=(*)
    echo "There are ${#files[@]} files here, including dot files and subdirs"

Note that when dotglob is enabled, * will match files like .bashrc but not the . or .. directories. This is orthogonal to the problem of matching "just the dot files" -- a glob of .* will match . and .., typically causing problems.
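A quick way to see the difference is to count what `*` matches in a directory holding one hidden and one visible file. This is a sketch: the helper `count_entries` and the file names are invented for it.

```bash
# Hypothetical helper: how many entries does "$1"/* match?
count_entries() {
    local entries
    entries=("$1"/*)
    echo "${#entries[@]}"
}

dot_demo=$(mktemp -d)
touch "$dot_demo/.hidden" "$dot_demo/visible"

shopt -u dotglob
plain_count=$(count_entries "$dot_demo")   # 1: only "visible"

shopt -s dotglob
dot_count=$(count_entries "$dot_demo")     # 2: ".hidden" and "visible", but not . or ..

shopt -u dotglob
rm -rf "$dot_demo"
echo "without dotglob: $plain_count, with dotglob: $dot_count"
# prints: without dotglob: 1, with dotglob: 2
```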

globstar (since bash 4.0-alpha)

To recurse …

    shopt -s globstar
    files=(**)
    echo "There are ${#files[@]} files and directories here, including those in subdirs"
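As a sketch of recursive matching, the following builds a small made-up tree and compares `*.txt` with `**/*.txt`. It assumes a Bash new enough to have globstar; the directory layout is hypothetical.

```bash
shopt -s globstar

tree=$(mktemp -d)
mkdir -p "$tree/a/b"
touch "$tree/top.txt" "$tree/a/mid.txt" "$tree/a/b/deep.txt"

shallow=("$tree"/*.txt)        # only top.txt
recursive=("$tree"/**/*.txt)   # top.txt, a/mid.txt and a/b/deep.txt

shopt -u globstar
rm -rf "$tree"
echo "shallow: ${#shallow[@]}, recursive: ${#recursive[@]}"
# prints: shallow: 1, recursive: 3
```

With globstar set, `**/` stands for zero or more directory levels, which is why `top.txt` is included in the recursive result.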

failglob

If failglob is set and a pattern fails to match, Bash reports an expansion error and does not run the command. This can be useful at the command line:

    # Good at the command line!
    $ touch *.foo # creates file '*.foo' if glob fails to match
    $ shopt -s failglob
    $ touch *.foo # touch doesn't get executed
    -bash: no match: *.foo
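The same contrast can be checked non-interactively. This sketch enables failglob only inside a subshell so the setting does not leak into the rest of the script; the scratch directory and variable names are made up.

```bash
scratch=$(mktemp -d)

# Default behaviour: the unmatched pattern is passed through literally,
# so touch creates a file that is really named '*.bar'.
touch "$scratch"/*.bar
if [ -e "$scratch/*.bar" ]; then literal_created=yes; else literal_created=no; fi

# With failglob, the expansion error prevents touch from running at all.
if ( shopt -s failglob; touch "$scratch"/*.foo ) 2>/dev/null; then
    failglob_ran=yes
else
    failglob_ran=no
fi
if [ -e "$scratch/*.foo" ]; then foo_created=yes; else foo_created=no; fi

rm -rf "$scratch"
echo "'*.bar' created: $literal_created; ran under failglob: $failglob_ran; '*.foo' created: $foo_created"
# prints: '*.bar' created: yes; ran under failglob: no; '*.foo' created: no
```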

GLOBIGNORE

The Bash variable (not shopt) GLOBIGNORE allows you to specify patterns a glob should not match. This lets you work around the infamous "I want to match all of my dot files, but not . or .." problem:

    $ echo .*
    . .. .bash_history .bash_logout .bashrc .inputrc .vimrc
    $ GLOBIGNORE=.:..
    $ echo .*
    .bash_history .bash_logout .bashrc .inputrc .vimrc

To clear the ignore list again, assign GLOBIGNORE an empty value:

    $ GLOBIGNORE=
    $ echo .*
    . .. .bash_history .bash_logout .bashrc .inputrc .vimrc
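In a script, one way to keep the setting contained is to do the listing in a subshell. This is a sketch under that assumption: `list_dotfiles` is a made-up helper whose body runs in `( )`, so both the `cd` and the GLOBIGNORE assignment disappear when it returns.

```bash
# Hypothetical helper: print the dot files in directory $1, one per line.
# The function body is a subshell, so cd and GLOBIGNORE do not affect the caller.
list_dotfiles() (
    cd "$1" || exit 1
    GLOBIGNORE=.:..
    printf '%s\n' .*
)

home_like=$(mktemp -d)
touch "$home_like/.bashrc" "$home_like/.vimrc" "$home_like/notes.txt"

dotfiles=$(list_dotfiles "$home_like")
rm -rf "$home_like"
printf '%s\n' "$dotfiles"
# prints:
# .bashrc
# .vimrc
```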

nocasematch

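With nocasematch set, Bash ignores case when matching patterns in case and [[ commands. The following function enables the option while it runs and, if the option was not already set, disables it again before returning:

    foo() {
       local f r=0 nc=0
       shopt -q nocasematch && nc=1 || shopt -s nocasematch
       for f; do
          [[ $f = *.@(txt|jpg) ]] || continue
          cmd -on "$f" || r=1
       done
       ((nc)) || shopt -u nocasematch
       return $r
    }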

This is conventionally done with a case:

    case $f in
        *.[Tt][Xx][Tt]|*.[Jj][Pp][Gg]) : ;;
        *) continue
    esac

and in earlier versions of bash we'd use a similar glob:

    [[ $f = *.@([Tt][Xx][Tt]|[Jj][Pp][Gg]) ]] || continue

or with no extglob:

    [[ $f = *.[Tt][Xx][Tt] ]] || [[ $f = *.[Jj][Pp][Gg] ]] || continue

Here, one might keep the tests separate for maintenance; they can be easily reused and dropped, without having to concern oneself with where they fit in relation to an internal ||.

Note also:

    [[ $f = *.@([Tt][Xx][Tt]|[Jj][Pp]?([Ee])[Gg]) ]]

Variants left as an exercise.
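One possible variant, offered as a sketch rather than the canonical approach: save the current state with `shopt -p`, which prints a command that recreates it, and replay that command afterwards. The function name `is_text_or_jpeg` is invented for this example.

```bash
# Hypothetical helper: succeed if $1 ends in .txt, .jpg or .jpeg, ignoring case.
is_text_or_jpeg() {
    local restore rc=1
    # shopt -p prints "shopt -s nocasematch" or "shopt -u nocasematch";
    # it returns non-zero when the option is off, hence the "|| true".
    restore=$(shopt -p nocasematch) || true
    shopt -s nocasematch
    case $1 in
        *.txt|*.jpg|*.jpeg) rc=0 ;;
    esac
    eval "$restore"    # put the option back the way the caller had it
    return "$rc"
}

for name in NOTES.TXT photo.Jpeg archive.zip; do
    if is_text_or_jpeg "$name"; then echo "$name: accepted"; else echo "$name: skipped"; fi
done
# prints:
# NOTES.TXT: accepted
# photo.Jpeg: accepted
# archive.zip: skipped
```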

nocaseglob (since bash 2.02-alpha1)

text here …

dirspell (since bash 4.04-alpha)

text here …

direxpand (since bash 4.3-alpha)

text here …

--enable-direxpand-default

globasciiranges (since bash 4.3-alpha)

Interprets [a-d] as [abcd]. To match a literal '-', include it as first or last character.
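A short sketch of both statements, assuming a Bash that supports the option. The helper names `in_a_to_d` and `is_dash_or_digit` are made up for the demonstration.

```bash
shopt -s globasciiranges

# Hypothetical helper: with globasciiranges set, [a-d] means exactly a, b, c or d.
in_a_to_d() {
    [[ $1 == [a-d] ]]
}

# Hypothetical helper: a leading - inside the brackets is a literal dash.
is_dash_or_digit() {
    [[ $1 == [-0-9] ]]
}

for ch in c B e; do
    if in_a_to_d "$ch"; then echo "$ch: in [a-d]"; else echo "$ch: not in [a-d]"; fi
done
for ch in - 5 x; do
    if is_dash_or_digit "$ch"; then echo "$ch: dash or digit"; else echo "$ch: neither"; fi
done
```

Only `c` falls in `[a-d]`; `B` and `e` do not. Both `-` and `5` match `[-0-9]`, while `x` does not.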


CategoryShell

glob (last edited 2022-10-13 13:52:20 by emanuele6)