How can I use array variables?
This answer assumes you have a basic understanding of what arrays are in the first place. If you're new to this kind of programming, you may wish to start with the guide's explanation. This page is more detailed and thorough.
1. Intro
BASH and KornShell have one-dimensional arrays indexed by a numerical expression, e.g.:
{{{
# Bash
host=(mickey minnie goofy)
n=${#host[*]}
for ((i=0;i<n;i++)); do
  echo "host number $i is ${host[i]}"
done
}}}
The indexing always begins with 0, unless you specifically choose otherwise. The awkward expression ${#host[*]} or ${#host[@]} returns the number of elements for the array host. (We'll go into more detail on syntax below.)
Ksh93, Zsh and Bash 4.0 have Associative Arrays as well. These are not available in Bourne, ash, ksh88 or older bash shells and are not specified by POSIX.
POSIX and Bourne shells are not guaranteed to have arrays at all.
BASH and Korn shell arrays are sparse. Elements may be added and deleted out of sequence.
{{{
# Bash/ksh
arr[0]=0
arr[1]=1
arr[2]=2
arr[42]="what was the question?"
unset 'arr[2]'
echo "${arr[*]}"
# prints 0 1 what was the question?
}}}
You should try to write your code in such a way that it can handle sparse arrays, unless you know in advance that an array will never have holes.
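As a rough sketch of what "handling sparse arrays" means in practice, the following compares a counting loop with a loop over the indices that actually exist. The array contents and the names `fragile` and `robust` are made up for illustration, and the `${!arr[@]}` syntax it relies on is covered in more detail later on this page.

```shell
# Bash 3.1 or higher
arr=(zero one two)
arr[42]="forty-two"
unset 'arr[1]'                 # indices are now 0 2 42

# Fragile: assumes the indices are exactly 0..n-1.
n=${#arr[@]}                   # 3
fragile=()
for ((i = 0; i < n; i++)); do
  fragile+=("${arr[i]}")       # arr[1] is unset; arr[42] is never reached
done

# Robust: iterate over the indices that actually exist.
robust=()
for i in "${!arr[@]}"; do
  robust+=("$i=${arr[i]}")
done

printf '%s\n' "${robust[@]}"
# prints 0=zero, 2=two and 42=forty-two, one per line
```

The counting loop silently produces an empty value for the hole and misses the element at index 42, which is the kind of subtle failure the advice above is meant to prevent.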
2. Loading values into an array
Assigning one element at a time is simple, and portable:
{{{
# Bash/ksh
arr[0]=0
arr[42]='the answer'
}}}
It's possible to assign multiple values to an array at once, but the syntax differs across shells.
{{{
# Bash/ksh93
array=(zero one two three four)

# Korn
set -A array -- zero one two three four
}}}
When initializing in this way, the first index will be 0.
You can also initialize an array using a glob (see also NullGlob):
{{{
# Bash/ksh93
oggs=(*.ogg)

# Korn
set -A oggs -- *.ogg
}}}
or using a substitution of any kind:
{{{
# Bash
words=($sentence)
letters=({a..z})   # Bash 3.0 or higher

# Korn
set -A words -- $sentence
}}}
When the arrname=(...) syntax is used, any unquoted substitutions inside the parentheses undergo WordSplitting and glob expansion according to the regular shell rules. In the first example above, if any of the words in $sentence contain glob characters, filename expansion may occur.
set -f and set +f may be used to disable and re-enable glob expansion, respectively, so that words like * will not be expanded into filenames. In some scripts, set -f may be in effect already, and therefore running set +f may be undesirable. This is something you must manage properly yourself; there is no easy or elegant way to "store" the glob expansion switch setting and restore it later. (And don't try to say parsing the output of set -o is easy, because it's not.)
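To make the glob problem concrete, here is a small sketch. The scratch directory, the file names and the variables `n_glob` and `n_noglob` are assumptions made for the demonstration; it also assumes the default IFS and that globbing was enabled to begin with.

```shell
# Bash
# Work in a scratch directory so the glob has something predictable to match.
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1
touch a.txt b.txt

sentence='delete * now'

words=($sentence)             # the * is expanded to a.txt b.txt
n_glob=${#words[@]}

set -f                        # disable glob expansion
words=($sentence)             # the * stays literal
set +f                        # re-enable it (assumes it was enabled before)
n_noglob=${#words[@]}

echo "$n_glob $n_noglob"      # prints 4 3
```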
2.1. Loading lines from a file or stream
In bash 4, the mapfile command (also known as readarray) accomplishes this:
{{{
# Bash 4
mapfile -t lines < myfile

# or
mapfile -t lines < <(some command)
}}}
See ProcessSubstitution and FAQ #24 for more details on the <() syntax.
mapfile handles blank lines (it inserts them as empty array elements), and it also handles missing final newlines from the input stream. Both those things become problematic when reading data in other ways, as we shall see momentarily.
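mapfile does have one serious drawback: in bash 4.0 through 4.3 it can only handle newlines as line terminators. It can't, for example, handle NUL-delimited files from find -print0. (Bash 4.4 added a -d option to mapfile for choosing a different delimiter.)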
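A quick way to see both properties at once is to feed mapfile some input that contains a blank line and lacks a final newline. The input here is invented for the demonstration:

```shell
# Bash 4
# Input: "first", a blank line, then "third" with no trailing newline.
mapfile -t lines < <(printf 'first\n\nthird')

echo "${#lines[@]}"              # prints 3
printf '<%s>\n' "${lines[@]}"
# prints <first>, <> and <third>, one per line
```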
When mapfile is not available, we have to work very hard to try to duplicate it. There are a great number of ways to almost get it right, but fail in subtle ways.
These examples will duplicate most of mapfile's basic functionality:
{{{
# Bash 2.04+, Ksh93
unset lines i
while IFS= read -r; do lines[i++]=$REPLY; done < <(your command)   # or < file
[[ $REPLY ]] && lines[i++]=$REPLY
}}}

{{{
# Ksh88
unset lines; i=0
while IFS= read -r; do lines[i]=$REPLY; i=$((i+1)); done < file
[ "$REPLY" ] && lines[i]=$REPLY i=$((i+1))
}}}
Now let's look at some simpler cases that fail, so you can see why we used such a complicated solution.
Some people might start out like this:
{{{
# These examples only work with certain kinds of input files.

# Bash
set -f; IFS=$'\n' lines=($(< myfile)); unset IFS; set +f

# Ksh
set -f; IFS='
'; set -A lines -- $(< myfile); unset IFS; set +f
}}}
That's a literal newline (and nothing else) between the single quotes in the Korn shell example.
We use IFS (setting it to a newline) because we want each line of input to become an array element, not each word.
However, relying on IFS WordSplitting causes issues if you have repeated whitespace delimiters, because they will be consolidated. E.g., a file with blank lines will have repeated newline characters. If you wanted the blank lines to be stored as empty array elements, IFS's behavior will backfire on you; the blank lines will disappear. There is no clean workaround for this other than to scrap the whole approach.
{{{
# bash
# \v is a vertical tab and rarely/never used, so we can use it to mark empty lines
# additionally bash won't collapse multiple \v .
# now empty lines are preserved in array as empty elements
# (An unclean workaround: sed -r and \v are GNU sed extensions.
#  Test it on your own data before relying on it.)
set -f; IFS=$'\n\v' eval lines='( $(sed -re 's/^$/\v/' myfile) )'; set +f
}}}
A second approach would be to read the elements one by one, using a loop. This one does not work (with normal input; ironically, it works with some degenerate inputs):
{{{
# Does not work!
unset arr i
while IFS= read -r 'arr[i++]'; do :; done < file
}}}
Why doesn't it work? It puts a blank element at the end of the array, because the read -r arr[i++] is executed one extra time after the end of file. However, we'll revisit this approach later.
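The extra element is easy to observe with a two-line, properly terminated input. This is only a demonstration with invented input, not a recommended technique:

```shell
# Bash
unset arr i
while IFS= read -r 'arr[i++]'; do :; done < <(printf 'one\ntwo\n')

# The failing read at end of file still assigned an empty string to arr[2].
echo "${#arr[@]}"        # prints 3, not 2
```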
This one gets us much closer:
{{{
# Bash
unset arr i
while read -r; do arr[i++]=$REPLY; done < yourfile

# or
while read -r; do arr[i++]=$REPLY; done < <(your command)
}}}
The square brackets create a math context. Inside them, i++ works as a C programmer would expect. (That shortcut works in ksh93, but not in ksh88.)
This approach handles blank lines, but it fails if your file or stream is missing its final newline. So we need to handle that case specially:
{{{
# Bash
unset arr i
while read -r; do arr[i++]=$REPLY; done < <(your command)

# Append unterminated data line if there was one.
[[ $REPLY ]] && arr[i++]=$REPLY
}}}
This is the "final solution" we gave earlier, handling both blank lines inside the file, and an unterminated final line.
Our second try above (the read -r 'arr[i++]' one) works great if there's an unterminated line (since the array element is populated with the partial data before the exit status of read is checked). Unfortunately, it puts an empty element on the end of the array if the data stream is correctly terminated. So to fix that one, we need to remove the empty element after the loop:
{{{
# Bash
unset arr i
while IFS= read -r 'arr[i++]'; do :; done < <(your command)

# Remove trailing empty element, if any.
if [[ ${arr[i-1]} = "" ]]; then unset 'arr[--i]'; fi
}}}
This is also a working solution. Whether you prefer to read too many and then have to remove one, or read too few and then have to add one, is a personal choice.
NOTE: it is necessary to quote the 'arr[i++]' passed to read, so that the square brackets aren't interpreted as globs. This is also true for other non-keyword builtins that take a subscripted variable name, such as let and unset.
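The following sketch shows why the quotes matter. It assumes a scratch directory containing a file whose name happens to match the pattern; the file name `arri` is invented purely to trigger the match.

```shell
# Bash
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1
touch arri                    # hypothetical file name that matches the pattern

unquoted=$(echo arr[i++])     # glob: "arr" followed by one of i or +
quoted=$(echo 'arr[i++]')     # literal text

echo "$unquoted"              # prints arri
echo "$quoted"                # prints arr[i++]
```

Without the quotes, read, let or unset would receive `arri` instead of the subscripted name whenever such a file exists in the current directory.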
2.2. Reading NUL-delimited streams
If you are trying to deal with records that might have embedded newlines, you will be using an alternative delimiter such as the NUL character ( \0 ) to separate the records. In that case, you'll need to use the -d argument to read as well:
{{{
# Bash
unset arr i
while IFS= read -rd '' 'arr[i++]'; do :; done < <(find . -name '*.ugly' -print0)
if [[ ${arr[i-1]} = "" ]]; then unset 'arr[--i]'; fi

# or
while read -rd ''; do arr[i++]=$REPLY; done < <(find . -name '*.ugly' -print0)
[[ $REPLY ]] && arr[i++]=$REPLY
}}}
read -d '' tells Bash to keep reading until a NUL byte; normally it reads until a newline. There is no equivalent in Korn shell as far as we're aware.
2.3. Appending to an existing array
If you wish to append data to an existing array, there are several approaches. The most flexible is to keep a separate index variable:
{{{
# Bash/ksh93
arr[i++]="new item"
}}}
If you don't want to keep an index variable, but you happen to know that your array is not sparse, then you can use the number of elements as the next index (in a non-sparse array it is one more than the highest existing index):
{{{
# Bash/ksh
# This will FAIL if the array has holes (is sparse).
arr[${#arr[*]}]="new item"
}}}
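To see how this goes wrong, consider a small sketch with a hole at the front of the array (the contents are invented for illustration):

```shell
# Bash
arr=(a b c)
unset 'arr[0]'                 # indices are now 1 2; two elements remain

arr[${#arr[*]}]="new item"     # writes to index 2, clobbering "c"

echo "${arr[*]}"               # prints b new item
```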
If you don't know whether your array is sparse or not, but you don't mind re-indexing the entire array (which is also very slow), then you can use:
{{{
# Bash
arr=("${arr[@]}" "new item")

# Ksh
set -A arr -- "${arr[@]}" "new item"
}}}
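A short sketch of the re-indexing effect, again with made-up contents: the values survive, but the original indices do not.

```shell
# Bash
arr=(a b c)
unset 'arr[0]'                 # indices are now 1 2

arr=("${arr[@]}" "new item")   # rebuild the array from its values

echo "${!arr[*]}"              # prints 0 1 2
echo "${arr[*]}"               # prints b c new item
```

If other code depends on the old index values (parallel arrays, for example), this approach is not suitable.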
If you're in bash 3.1 or higher, then you can use the += operator:
{{{
# Bash 3.1
arr+=("new item")
}}}
NOTE: the parentheses are required, just as when assigning to an array. Without them you will end up appending to ${arr[0]}, for which $arr is a synonym.
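The difference between the two forms can be sketched like this (the element values are arbitrary):

```shell
# Bash 3.1 or higher
arr=(alpha beta)

arr+=("gamma")       # with parentheses: appends a new element
arr+="delta"         # without: string-appends to arr[0]

printf '%s\n' "${arr[@]}"
# prints alphadelta, beta and gamma, one per line
```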
For examples of using arrays to hold complex shell commands, see FAQ #50 and FAQ #40.
3. Retrieving values from an array
${#arr[*]} or ${#arr[@]} gives the number of elements in an array:
{{{
# Bash
shopt -s nullglob
oggs=(*.ogg)
echo "There are ${#oggs[*]} Ogg files."
}}}
* was reported to be quicker than @ in tests on Bash 3. On Bash 4.1, * and @ both appear to run at the same speed.
Single elements are retrieved by index:
{{{
echo "${foo[0]} - ${bar[j+1]}"
}}}
The square brackets are a math context. Arithmetic can be done there, and variables are expanded even without a leading $.
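A few more subscript expressions, as a sketch using an invented array `nums` and index variable `j`:

```shell
# Bash/ksh93
nums=(10 20 30 40)
j=1

echo "${nums[j]}"                 # prints 20
echo "${nums[j+1]}"               # prints 30
echo "${nums[j*2+1]}"             # prints 40
echo "${nums[${#nums[@]}-1]}"     # prints 40 (last element, if the array is not sparse)
```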
Using array elements en masse is one of the key features of shell arrays. In exactly the same way that "$@" is expanded for positional parameters, "${arr[@]}" is expanded to a list of words, one array element per word. For example,
{{{
# Korn/Bash
for x in "${arr[@]}"; do
  echo "next element is '$x'"
done
}}}
This works even if the elements contain whitespace. You always end up with the same number of words as you have array elements.
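One way to convince yourself of this is to count the words each form produces. The helper function `count` below is hypothetical, and the default IFS is assumed:

```shell
# Bash
arr=("two words" "three more words" single)

# Hypothetical helper: prints how many arguments it received.
count() { echo "$#"; }

count "${arr[@]}"    # prints 3: one word per element
count "${arr[*]}"    # prints 1: everything joined into a single word
count ${arr[@]}      # prints 6: unquoted, so each element is split again
```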
If one simply wants to dump the full array, one element per line, this is the simplest approach:
{{{
# Bash/ksh
printf "%s\n" "${arr[@]}"
}}}
For slightly more complex array-dumping, "${arr[*]}" will cause the elements to be concatenated together, with the first character of IFS (or a space if IFS isn't set) between them. As it happens, "$*" is expanded the same way for positional parameters.
{{{
# Bash
arr=(x y z)
IFS=/; echo "${arr[*]}"; unset IFS
# prints x/y/z
}}}
Unfortunately, you can't put multiple characters in between array elements using that syntax. You would have to do something like this instead:
{{{
# Bash/ksh
arr=(x y z)
tmp=$(printf "%s<=>" "${arr[@]}")
echo "${tmp%<=>}"    # Remove the extra <=> from the end.
# prints x<=>y<=>z
}}}
BASH 3.0 added the ability to retrieve the list of index values in an array, rather than just iterating over the elements:
{{{
# Bash 3.0 or higher
arr=(0 1 2 3)
arr[42]='what was the question?'
unset 'arr[2]'
echo "${!arr[@]}"
# prints 0 1 3 42
}}}
Retrieving the indices is extremely important in certain kinds of tasks, such as maintaining parallel arrays with the same indices (a cheap way to mimic having an array of structs in a language with no struct):
{{{
# Bash 3.0 or higher
unset file title artist i
for f in ./*.mp3; do
  file[i]=$f
  title[i]=$(mp3info -p %t "$f")
  artist[i++]=$(mp3info -p %a "$f")
done

# Later, iterate over every song.
# This works even if the arrays are sparse, just so long as they all have
# the SAME holes.
for i in "${!file[@]}"; do
  echo "${file[i]} is ${title[i]} by ${artist[i]}"
done
}}}
3.1. Retrieving with modifications
Bash's Parameter Expansions may be performed on array elements en masse:
{{{
# Bash
arr=(abc def ghi jkl)
echo "${arr[@]#?}"          # prints bc ef hi kl
echo "${arr[@]/[aeiou]/}"   # prints bc df gh jkl
}}}
Parameter Expansion can also be used to extract elements from an array. Some people call this slicing:
{{{
# Bash
echo "${arr[@]:1:3}"    # three elements starting at #1 (second element)
echo "${arr[@]:(-2)}"   # last two elements

echo "${@:(-1)}"        # last positional parameter
echo "${@:(-2):1}"      # second-to-last positional parameter
}}}
4. Using @ as a pseudo-array
As we see above, the @ array (the array of positional parameters) can be used almost like a regularly named array. This is the only array available for use in POSIX or Bourne shells. It has certain limitations: you cannot individually set or unset single elements, and it cannot be sparse. Nevertheless, it still makes certain POSIX shell tasks possible that would otherwise require external tools:
{{{
# POSIX
set -- *.mp3
if [ -e "$1" ]; then
  echo "there are $# MP3 files"
else
  echo "there are 0 MP3 files"
fi
}}}

{{{
# POSIX
...
# Add an option to our dynamically generated list of options
set -- "$@" -f "$somefile"
...
foocommand "$@"
}}}
(Compare to FAQ #50's dynamically generated commands using named arrays.)