Differences between revisions 12 and 18 (spanning 6 versions)
Revision 12 as of 2008-10-30 13:17:38
Size: 5626
Editor: GreyCat
Comment: link to NullGlob which now I'll have to write....
Revision 18 as of 2009-02-25 17:04:13
Size: 6971
Editor: NeilMoore
Comment: fix delimiter problems with loop (2 edits, forgot to comment the previous)

How can I use array variables?

BASH and KornShell have one-dimensional arrays indexed by a numerical expression, e.g.:

  •  # Bash
     host[0]="micky"
     host[1]="minnie"
     host[2]="goofy"
     i=0
     while (( $i < ${#host[@]} ))
     do
         echo "host number $i is ${host[i++]}"
     done

The indexing always begins with 0.

The awkward expression ${#host[@]} expands to the number of elements in the array host. Also noteworthy for BASH is that inside the square brackets, i++ works as a C programmer would expect: the square brackets in an array reference force an ArithmeticExpression. (That shortcut does not work in ksh88.)
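
As a small illustration of the arithmetic evaluation inside the square brackets, here is a sketch for Bash. The variable n and the extra host name are made up for the example.

```shell
#!/usr/bin/env bash
host[0]="micky"
host[1]="minnie"
host[2]="goofy"
n=1                        # hypothetical helper variable
host[n+2]="pluto"          # subscript is evaluated as arithmetic: index 3
echo "${host[n]}"          # prints minnie
echo "${host[n*2+1]}"      # prints pluto (index 3)
echo "${#host[@]}"         # prints 4
```

Note that no $ is needed in front of n inside the subscript.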

BASH and Korn shell arrays are also sparse. Elements may be added and deleted out of sequence.

  •  # Bash/ksh
     arr[0]=0
     arr[1]=1
     arr[2]=2
     arr[42]="what was the question?"
     unset 'arr[2]'
     echo "${arr[*]}"
     # prints 0 1 what was the question?

1. Loading values into an array

It's possible to assign multiple values to an array at once, but the syntax differs across shells.

  •  # Bash
     array=(one two three four)
    
     # Korn
     set -A array -- one two three four

Bash also lets you initialize an array using a glob:

  •  # Bash
     oggs=(*.ogg)

(see also NullGlob), or a substitution of any kind:

  •  # Bash
     words=($sentence)
     set -f; O=$IFS IFS=$'\n' lines=($(< myfile)) IFS=$O; set +f
     letters=({a..z})    # Bash 3.0 or higher

When the arrname=(...) syntax is used, any substitutions inside the parentheses undergo WordSplitting according to the regular shell rules. Thus, in the second example above, if we want the lines of the input file to become individual array elements (even if they contain whitespace), we must set IFS appropriately (in this case: to a newline).

The set -f and set +f commands disable and re-enable glob expansion, respectively, so that a line like * will not be expanded into filenames. In some scripts, set -f may be in effect already, and therefore running set +f may be undesirable. This is something you must manage properly yourself; there is no easy or elegant way to "store" the glob expansion switch setting and restore it later. (And don't try to say parsing the output of set -o is easy, because it's not.)
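
The word splitting behaviour can be seen in a short, self-contained sketch. It splits a string rather than a file, and the variable names text and old_ifs are only illustrative. It assumes that IFS is set (not unset) on entry and that globbing was enabled beforehand.

```shell
#!/usr/bin/env bash
text=$'first line\nsecond * line'   # sample data containing spaces and a *

words=($text)              # default IFS: splits on spaces and newlines,
                           # and the lone * is expanded as a glob

set -f                     # keep the * from being expanded as a glob
old_ifs=$IFS
IFS=$'\n'                  # split on newlines only
lines=($text)
IFS=$old_ifs
set +f                     # assumes globbing was on before

echo "${#lines[@]}"        # prints 2
printf '%s\n' "${lines[1]}"   # prints second * line
```

With the default IFS the same data yields at least five words, and more if the current directory contains files for the * to match.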

If you're trying to populate an array with data from a stream, remember that in most shells, the subcommands of a pipeline are executed in subshells, so you might need to use something like this:

  •  # Bash
     unset arr i
     while IFS= read -r 'arr[i++]'; do :; done < <(your command)
     if [[ ${arr[i-1]} = "" ]]; then unset 'arr[--i]'; fi

If you are trying to deal with records that might have embedded newlines, you might be using the NUL character ( \0 ) to delimit the records. In that case, you'll want to use the -d argument to read as well:

  •  # Bash
     unset arr i
     while IFS= read -rd $'\0' 'arr[i++]'; do :; done < <(find . -name '*.ugly' -print0)
     if [[ ${arr[i-1]} = "" ]]; then unset 'arr[--i]'; fi

See ProcessSubstitution and FAQ #24 for more details on that syntax.

NOTE: it is necessary to quote the subscripted expression passed to read, so that the square brackets aren't interpreted as globs. This is also true for other non-keyword builtins that take a subscripted variable name, such as let and unset.
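
To see why the quoting matters, here is a sketch that can be run safely in a scratch directory. The file name arr0 and the variable names are chosen only for the demonstration, and it assumes the nullglob and failglob options are off.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so the demonstration file is harmless.
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1
touch arr0                  # a file name that the pattern arr[0] matches

arr=(x y)
unset arr[0]                # the glob expands to arr0, so this runs: unset arr0
count_unquoted=${#arr[@]}
echo "$count_unquoted"      # prints 2, nothing was removed from arr

unset 'arr[0]'              # quoted: the builtin sees arr[0] itself
count_quoted=${#arr[@]}
echo "$count_quoted"        # prints 1

cd / && rm -rf "$tmpdir"
```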

NOTE: if, as usual, there is a delimiter (newline or NUL) at the end of the input, the loops will add an empty array element (hence the subsequent unset). If the last array element was not empty, the input did not end in a delimiter. Alternatively, one can read into a temporary variable and set the array element from the temporary variable inside the loop.

greycat mentioned in IRC that he thought he recalled there being some reason not to use the 'while read arr[i++]' syntax directly, but he couldn't remember the specifics at the time. Perhaps it was one of these two issues.
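
The temporary-variable alternative mentioned in the note above could look roughly like this. The variable name line is arbitrary, and the printf command merely stands in for "your command".

```shell
#!/usr/bin/env bash
unset arr i
while IFS= read -r line; do
    arr[i++]=$line          # only runs when read got a complete line
done < <(printf '%s\n' 'first line' 'second  line')

echo "${#arr[@]}"           # prints 2, no trailing empty element to remove
```

One caveat with this form: if the input does not end in a newline, read returns failure for the final partial line, so that line is not stored unless the loop condition also checks whether line is non-empty.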

If you wish to append data to an existing array, there are several approaches. The most flexible is to keep a separate index variable:

  •  # Bash/ksh93
     arr[i++]="new item"

If you don't want to keep an index variable, but you happen to know that your array is not sparse, then you can use the number of elements as the next index:

  •  # Bash/ksh
     # This will FAIL if the array has holes (is sparse).
     arr[${#arr[*]}]="new item"
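
A short sketch of how this goes wrong with a sparse array (Bash syntax, with made-up sample values):

```shell
#!/usr/bin/env bash
arr=(a b c)
unset 'arr[1]'                # indices 0 and 2 remain, so the count is 2
arr[${#arr[*]}]="new item"    # writes to index 2 and overwrites "c"
printf '%s\n' "${arr[@]}"     # prints a, then new item
```

The element count no longer equals the highest index plus one, so an existing element is silently replaced.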

If you don't know whether your array is sparse or not, but you don't mind re-indexing the entire array (and also being very slow), then you can use:

  •  # Bash
     arr=("${arr[@]}" "new item")
    
     # Ksh
     set -A arr -- "${arr[@]}" "new item"

If you're in bash 3.1 or higher, then you can use the += operator:

  •  # Bash 3.1 or higher
     arr+=("new item")
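
Unlike the element-count trick, += appends after the highest existing index, so it does not overwrite anything in a sparse array. A minimal sketch with made-up values (it also uses the ${!arr[*]} index expansion, which needs Bash 3.0 or higher):

```shell
#!/usr/bin/env bash
arr=(a b c)
unset 'arr[1]'            # leave a hole: indices 0 and 2 remain
arr+=("new item")         # stored at index 3
echo "${!arr[*]}"         # prints 0 2 3
echo "${arr[2]}"          # prints c, still intact
```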

For examples of using arrays to hold complex shell commands, see FAQ #50 and FAQ #40.

2. Retrieving values from an array

Using array elements en masse is one of the key features of arrays. In exactly the same way that "$@" is expanded for positional parameters, "${arr[@]}" is expanded to a list of words, one array element per word. For example:

  •  # Korn/Bash
     for x in "${arr[@]}"; do
       echo "next element is '$x'"
     done

This works even if the elements contain whitespace. You always end up with the same number of words as you have array elements.
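
The difference the double quotes make can be checked with a tiny helper that reports how many arguments it received. The function count_args is hypothetical and exists only for this sketch, which assumes the default IFS.

```shell
#!/usr/bin/env bash
count_args() { echo "$#"; }    # hypothetical helper: prints its argument count

arr=("one two" "three")
count_args "${arr[@]}"         # prints 2: one word per element
count_args ${arr[@]}           # prints 3: unquoted, "one two" is split
```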

If one simply wants to dump the full array, one element per line, this is the simplest approach:

  •  # Bash/ksh
     printf "%s\n" "${arr[@]}"

For more complex array-dumping, "${arr[*]}" will cause the elements to be concatenated together, with the first character of IFS (or a space if IFS isn't set) between them. As it happens, "$*" is expanded the same way for positional parameters.

  •  # Bash
     arr=(x y z)
     IFS=/; echo "${arr[*]}"; unset IFS
     # prints x/y/z
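
If changing and then unsetting IFS in the main shell is undesirable, one possible variation is to do the join inside a command substitution, so that the IFS change stays in the subshell. The variable name joined is only illustrative.

```shell
#!/usr/bin/env bash
arr=(x y z)
joined=$(IFS=/; printf '%s\n' "${arr[*]}")   # IFS is changed only in the subshell
echo "$joined"                               # prints x/y/z
```

Be aware that command substitution strips trailing newlines from the result.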

BASH 3.0 added the ability to retrieve the list of index values in an array, rather than just iterating over the elements:

  •  # Bash 3.0 or higher
     arr=(0 1 2 3) arr[42]='what was the question?'
     unset 'arr[2]'
     echo "${!arr[*]}"
     # prints 0 1 3 42
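
The index list is handy for walking a sparse array while keeping track of each element's position. A sketch along those lines, reusing the sample data from above (the loop variable i is arbitrary):

```shell
#!/usr/bin/env bash
# Bash 3.0 or higher
arr=(0 1 2 3)
arr[42]='what was the question?'
unset 'arr[2]'
for i in "${!arr[@]}"; do
    printf '%s: %s\n' "$i" "${arr[i]}"
done
# prints:
# 0: 0
# 1: 1
# 3: 3
# 42: what was the question?
```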

Bash's Parameter Expansions may be performed on array elements en masse as well:

  •  # Bash
     arr=(abc def ghi jkl)
     echo "${arr[@]#?}"          # prints bc ef hi kl
     echo "${arr[@]/[aeiou]/}"   # prints bc df gh jkl

Parameter Expansion can also be used to extract elements from an array:

  •  # Bash
     echo "${arr[@]:1:3}"        # three elements starting at #1 (second element)
     echo "${arr[@]:(-2)}"       # last two elements
     echo "${@:(-1)}"            # last positional parameter
     echo "${@:(-2):1}"          # second-to-last positional parameter

The @ array (the array of positional parameters) can be sliced with the same syntax as a regularly named array, although its elements are numbered from 1 rather than 0.
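
For instance, with some made-up positional parameters set via set --, the slices behave as follows:

```shell
#!/usr/bin/env bash
set -- alpha beta gamma delta   # sample positional parameters
echo "${@:(-1)}"                # prints delta
echo "${@:(-2):1}"              # prints gamma
echo "${@:2:2}"                 # prints beta gamma
```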

BashFAQ/005 (last edited 2024-07-18 13:37:28 by GreyCat)