Differences between revisions 45 and 77 (spanning 32 versions)
Revision 45 as of 2012-11-30 17:13:10
Size: 18502
Editor: ormaaj
Comment: read -d '' finally fixed.
Revision 77 as of 2023-03-25 22:39:06
Size: 20177
Editor: emanuele6
Comment: fix POSIX [ -e "$firstglobresult" ] check; it should also check [ -L ]

How can I use array variables?

This answer assumes you have a basic understanding of what arrays are. If you're new to this kind of programming, you may wish to start with the guide's explanation. This page is more thorough. See links at the bottom for more resources.

1. Intro

One-dimensional integer-indexed arrays are implemented by Bash, Zsh, and most KornShell varieties including AT&T ksh88 or later, mksh, and pdksh. Arrays are not specified by POSIX and not available in legacy or minimalist shells such as BourneShell and Dash. The POSIX-compatible shells that do feature arrays mostly agree on their basic principles, but there are some significant differences in the details. Advanced users of multiple shells should be sure to research the specifics. Ksh93, Zsh, and Bash 4.0 additionally have Associative Arrays (see also FAQ 6). This article focuses on indexed arrays as they are the most common type.

Basic syntax summary (for Bash indexed arrays, whose subscripts are evaluated in a math context):

a=(word1 word2 "$word3" ...)
    Initialize an array from a word list, indexed starting with 0 unless otherwise specified.

a=(*.png *.jpg)
    Initialize an array with filenames.

a[i]=word
    Set one element to word, evaluating the value of i in a math context to determine the index.

a[i+1]=word
    Set one element, demonstrating that the index is also a math context.

a[i]+=suffix
    Append suffix to the previous value of a[i] (bash 3.1).

a+=(word ...)    # append
a+=([3]=word3 word4 [i]+=word_i_suffix)    # modify (ormaaj example)
    Modify an existing array without unsetting it, indexed starting at one greater than the highest indexed element unless otherwise specified (bash 3.1).

unset 'a[i]'
    Unset one element. Note the mandatory quotes (a[i] is a valid glob).

"${a[i]}"
    Reference one element.

"$(( a[i] + 5 ))"
    Reference one element, in a math context.

"${a[@]}"
    Expand all elements as a list of words.

"${!a[@]}"
    Expand all indices as a list of words (bash 3.0).

"${a[*]}"
    Expand all elements as a single word, with the first char of IFS as separator.

"${#a[@]}"
    Number of elements (size, length).

"${a[@]:start:len}"
    Expand a range of elements as a list of words, cf. string range.

"${a[@]#trimstart}" "${a[@]%trimend}" "${a[@]//search/repl}" etc.
    Expand all elements as a list of words, with modifications applied to each element separately.

declare -p a
    Show/dump the array, in a bash-reusable form.

mapfile -t a < stream
    Initialize an array from a stream (bash 4.0).

readarray -t a < stream
    Same as mapfile.

"$a"
    Same as "${a[0]}". Does NOT expand to the entire array. This usage is considered confusing at best, but is usually a bug.

Here is a typical usage pattern featuring an array named host:

    # Bash

    # Assign the values "mickey", "minnie", and "goofy" to sequential indexes starting with zero.
    host=(mickey minnie goofy)

    # Iterate over the indexes of "host".
    for idx in "${!host[@]}"; do
        printf 'Host number %d is %s\n' "$idx" "${host[idx]}"
    done

"${!host[@]}" expands to the indices of the host array, each as a separate word.

Indexed arrays are sparse, and elements may be inserted and deleted out of sequence.

    # Bash/ksh

    # Simple assignment syntax.
    arr[0]=0
    arr[2]=2
    arr[1]=1
    arr[42]='what was the question?'

    # Unset the element at index 2 of "arr"
    unset -v 'arr[2]'

    # Concatenate the values, to a single argument separated by spaces, and echo the result.
    echo "${arr[*]}"
    # outputs: "0 1 what was the question?"

It is good practice to write your code in such a way that it can handle sparse arrays, even if you think you can guarantee that there will never be any "holes". Only treat arrays as "lists" if you're certain they have no holes and the savings in complexity are significant enough to justify it.
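
A minimal sketch of why this matters, using a hypothetical array named queue (the name and values are illustrative only). A counting loop quietly assumes that the indices run from 0 up to the element count minus one, while iterating over "${!queue[@]}" makes no such assumption:

```bash
# Bash 3.1 or higher
queue=(alpha beta gamma delta)
unset -v 'queue[1]'              # leave a hole at index 1

# Fragile: assumes the indices are exactly 0 .. ${#queue[@]}-1.
fragile=()
for ((n = 0; n < ${#queue[@]}; n++)); do
    fragile+=("${queue[n]}")
done

# Robust: asks the array which indices actually exist.
robust=()
for n in "${!queue[@]}"; do
    robust+=("${queue[n]}")
done

declare -p fragile robust
```

After the unset, the counting loop reads an empty value from the hole at index 1 and never reaches delta, whereas the index-based loop collects alpha, gamma and delta.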

2. Loading values into an array

Assigning one element at a time is simple, and portable:

    # Bash/ksh
    arr[0]=0
    arr[42]='the answer'

It's possible to assign multiple values to an array at once, but the syntax differs across shells. Bash supports only the arrName=(args...) syntax. ksh88 supports only the set -A arrName -- args... syntax. ksh93, mksh, and zsh support both. There are subtle differences in both methods between all of these shells if you look closely.

    # Bash, ksh93, mksh, zsh
    array=(zero one two three four)

    # ksh88/93, mksh, zsh
    set -A array -- zero one two three four

When initializing in this way, the first index will be 0 unless a different index is specified.

With compound assignment, the space between the parentheses is evaluated in the same way as the arguments to a command, including pathname expansion and WordSplitting. Any type of expansion or substitution may be used. All the usual quoting rules apply within.

    # Bash/ksh93
    oggs=(*.ogg)
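
As a small illustration of those quoting rules (the variable names here are made up for the example), each quoted expansion produces exactly one element, while an unquoted expansion is split into several:

```bash
# Bash
name='two words'
list='x y z'

quoted=("$name" "$list")    # 2 elements: "two words" and "x y z"
mixed=("$name" $list)       # 4 elements: "two words", x, y, z
literal=('*.ogg')           # 1 element: the quotes suppress pathname expansion

echo "${#quoted[@]} ${#mixed[@]} ${#literal[@]}"
# prints 2 4 1
```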

With ksh88-style assignment using set, the arguments are just ordinary arguments to a command.

    # Korn
    set -A oggs -- *.ogg

    # Bash (brace expansion requires 3.0 or higher)
    homeDirs=(~{,root}) # brace expansion occurs in a different order in ksh, so this is bash-only.
    letters=({a..z})    # Not all shells with sequence-expansion can use letters.

    # Korn
    set -A args -- "$@"

2.1. Loading lines from a file or stream

In bash 4, the mapfile command (also known as readarray) accomplishes this:

    # Bash 4
    mapfile -t lines <myfile

    # or
    mapfile -t lines < <(some command)

See ProcessSubstitution and FAQ #24 for more details on the <(...) syntax.

mapfile handles blank lines by inserting them as empty array elements, and it also keeps a final line that lacks a terminating newline (with -t, the terminators are stripped, so such a line looks like any other). Both cases can be problematic when reading data in other ways (see the next section). mapfile in bash 4.0 through 4.3 does have one serious drawback: it can only handle newlines as line terminators. Bash 4.4 adds the -d option to supply a different line delimiter.
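
A quick way to see how mapfile treats a blank line and an unterminated last line, assuming bash 4 or later and input supplied by printf purely for illustration:

```bash
# Bash 4
# The input has a blank second line and no newline after the last line.
mapfile -t lines < <(printf 'first\n\nthird\nlast, no newline')

echo "${#lines[@]}"    # prints 4
declare -p lines
```

The blank line is stored as an empty element at index 1, and the unterminated last line is stored at index 3.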

When mapfile isn't available, we have to work very hard to try to duplicate it. There are a great number of ways to almost get it right, but many of them fail in subtle ways.

The following examples will duplicate most of mapfile's basic functionality in older shells. You can skip all of these alternative examples if you have bash 4.

    # Alternative: Bash 3.1, Ksh93, mksh
    unset -v lines
    while IFS= read -r; do
        lines+=("$REPLY")
    done <file
    [[ $REPLY ]] && lines+=("$REPLY")

The += operator, when used together with parentheses, appends the element to one greater than the current highest numbered index in the array.

    # Alternative: ksh88
    # Ksh88 doesn't support pre/post increment/decrement. mksh and others do.
    i=0
    unset -v lines
    while IFS= read -r; do
        lines[i+=1,$i]=$REPLY     # Mimics lines[i++]=$REPLY
    done <file
    [[ $REPLY ]] && lines[i]=$REPLY

The square brackets create a math context. The result of the expression is the index used for assignment.

2.1.1. Handling newlines (or lack thereof) at the end of a file

read returns false when it reaches the end of the file before finding a line terminator. This presents a problem: if the file ends with a trailing newline, then read returns false on one extra read that finds no data at all; otherwise, it returns false while reading the last line of data itself, even though that data is still assigned. Without a special check for these cases, no matter what logic is used, you will always end up either with an extra blank element in the resulting array, or a missing final element.
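
The two cases can be observed directly. This is a small sketch with made-up input; the variable names terminated and unterminated are only for the demonstration. In both loops the body never sees the value from the failing read, so we inspect it after the loop ends:

```bash
# Bash
# Input ends with a newline: the failing read finds nothing and leaves "line" empty.
while IFS= read -r line; do :; done < <(printf 'a\nb\n')
terminated=$line

# Input lacks the final newline: the failing read still stores "b" in "line".
while IFS= read -r line; do :; done < <(printf 'a\nb')
unterminated=$line

printf 'terminated=[%s] unterminated=[%s]\n' "$terminated" "$unterminated"
# prints terminated=[] unterminated=[b]
```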

To be clear - text files should contain a newline as the last character in the file. Newlines are added to the ends of files by most text editors, and also by Here documents and Here strings. Most of the time, this is only an issue when reading output from pipes or process substitutions, or from "broken" text files created with broken or misconfigured tools. Let's look at some examples.

This approach reads the elements one by one, using a loop.

    # Doesn't work correctly!
    unset -v arr i
    while IFS= read -r 'arr[i++]'; do
        :
    done < <(printf '%s\n' {a..d})

Unfortunately, if the file or input stream contains a trailing newline, a blank element is added at the end of the array, because read -r 'arr[i++]' runs one extra time after the last line of text, assigning an empty string, before it returns false.

    # Still doesn't work correctly!
    unset -v arr i
    while read -r; do
        arr[i++]=$REPLY
    done < <(printf %s {a..c}$'\n' d)

The square brackets create a math context. Inside them, i++ works as a C programmer would expect (in all but ksh88).

This approach fails in the reverse case: it correctly handles blank lines and inputs terminated with a newline, but fails to record the last line of input if the file or stream is missing its final newline. So we need to handle that case specially:

    # Alternative: Bash, ksh93, mksh
    unset -v arr i
    while IFS= read -r; do
        arr[i++]=$REPLY
    done <file
    [[ $REPLY ]] && arr[i++]=$REPLY # Append unterminated data line, if there was one.

This is very close to the "final solution" we gave earlier -- handling both blank lines inside the file, and an unterminated final line. The null IFS is used to prevent read from stripping possible whitespace from the beginning and end of lines, in the event you wish to preserve them.

Another workaround is to remove the empty element after the loop:

    # Alternative: Bash
    unset -v arr i
    while IFS= read -r 'arr[i++]'; do
        :
    done <file

    # Remove trailing empty element, if any.
    [[ ${arr[i-1]} ]] || unset -v 'arr[--i]'

Whether you prefer to read too many and then have to remove one, or read too few and then have to add one, is a personal choice.

NOTE: it is necessary to quote the 'arr[i++]' passed to read, so that the square brackets aren't interpreted as globs. This is also true for other non-keyword builtins that take a subscripted variable name, such as let and unset.
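
The following sketch shows what can go wrong without the quotes. It assumes a scratch directory created with mktemp and a file that happens to be named arr1; both are invented for the demonstration:

```bash
# Bash
tmpdir=$(mktemp -d) || exit
cd "$tmpdir" || exit

arr=(a b c)
: > arr1                       # a file whose name matches the glob arr[1]

unset -v arr[1]                # unquoted: the glob expands, so unset receives "arr1"
unquoted_count=${#arr[@]}      # still 3, the array was not touched

unset -v 'arr[1]'              # quoted: unset receives arr[1]
quoted_count=${#arr[@]}        # now 2

cd - >/dev/null && rm -rf -- "$tmpdir"
echo "$unquoted_count $quoted_count"
# prints 3 2
```

The unquoted version fails silently here: unset is asked to remove a variable named arr1, which doesn't exist, and the array keeps all three elements.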

2.1.2. Other methods

Sometimes stripping blank lines actually is desirable, or you may know that the input will always be newline delimited, such as input generated internally by your script. It is possible in some shells to use the -d flag to set read's line delimiter to null, then abuse the -a or -A (depending on the shell) flag normally used for reading the fields of a line into an array for reading lines. Effectively, the entire input is treated as a single line, and the fields are newline-delimited.

    # Bash 4
    IFS=$'\n' read -rd '' -a lines <file

    # mksh, zsh
    IFS=$'\n' read -rd '' -A lines <file
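
To see the blank-line stripping in action, here is a small sketch with made-up input. In this usage read returns a non-zero status, because it reaches end-of-file without ever finding a NUL delimiter; the array is filled anyway, and the `|| :` only keeps a script running under `set -e` from aborting:

```bash
# Bash
# Four lines of input, two of them blank.
IFS=$'\n' read -rd '' -a lines < <(printf 'one\n\n\nfour\n') || :

echo "${#lines[@]}"    # prints 2
declare -p lines
```

Because newline is an IFS whitespace character, the consecutive newlines are treated as a single field separator and the blank lines vanish.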

2.1.3. Don't read lines with for!

Never read lines using for..in loops! Relying on IFS WordSplitting causes issues if you have repeated whitespace delimiters, because they will be consolidated. It is not possible to preserve blank lines by having them stored as empty array elements this way. Even worse, special globbing characters will be expanded unless you go to the trouble of disabling globbing and then re-enabling it afterwards. Just never use this approach: it is problematic, the workarounds are all ugly, and not all problems are solvable.
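
A short demonstration of both failures, for illustration only. It assumes a scratch directory from mktemp containing two files named file1 and file2, and a four-line input string; all of these are invented for the example:

```bash
# Bash
# WARNING: the for loop below shows the failure. Don't copy it.
tmpdir=$(mktemp -d) || exit
cd "$tmpdir" || exit
: > file1; : > file2

input=$'first\n\n*\nlast'      # second line is blank, third line is a lone *

bad=()
for line in $input; do         # unquoted on purpose: word splitting plus globbing
    bad+=("$line")
done

good=()
while IFS= read -r line; do
    good+=("$line")
done <<< "$input"

cd - >/dev/null && rm -rf -- "$tmpdir"
declare -p bad good
```

The for loop loses the blank line and replaces the * with the names of the files in the current directory, giving first, file1, file2, last. The while read loop keeps all four lines exactly as written.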

2.2. Reading NUL-delimited streams

If you are trying to deal with records that might have embedded newlines, you will be using an alternative delimiter such as the NUL character ( \0 ) to separate the records. In bash 4.4, you can simply use mapfile -t -d '':

    # Bash 4.4
    mapfile -t -d '' files < <(find . -name '*.ugly' -print0)

Otherwise, you'll need to use the -d argument to read inside a loop:

    # Bash
    while read -rd ''; do
        arr[i++]=$REPLY
    done < <(find . -name '*.ugly' -print0)

    # or (bash 3.1 and up)
    while read -rd ''; do
        arr+=("$REPLY")
    done < <(find . -name '*.ugly' -print0)

read -d '' tells Bash to keep reading until a NUL byte instead of until a newline. This isn't certain to work in all shells with a -d feature.

If you choose to give a variable name to read instead of using REPLY then also be sure to set IFS= for the read command, to avoid trimming leading/trailing IFS whitespace.
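
The trimming is easy to observe with a single NUL-terminated record that has spaces on both sides. This is a sketch with made-up data, and the variable names trimmed, kept and plain are only for the example:

```bash
# Bash
read -rd '' trimmed < <(printf '  padded  \0')          # named variable, default IFS
IFS= read -rd '' kept < <(printf '  padded  \0')        # named variable, empty IFS
read -rd '' < <(printf '  padded  \0'); plain=$REPLY    # no name: REPLY is used

printf '[%s] [%s] [%s]\n' "$trimmed" "$kept" "$plain"
# prints [padded] [  padded  ] [  padded  ]
```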

2.3. Appending to an existing array

As previously mentioned, arrays are sparse - that is, numerically adjacent indexes are not guaranteed to be occupied by a value. This confuses what it means to "append" to an existing array. There are several approaches.

If you've been keeping track of the highest-numbered index with a variable (for example, as a side-effect of populating an array in a loop), and can guarantee it's correct, you can just use it and continue to ensure it remains in-sync.

    # Bash/ksh93
    arr[++i]="new item"

If you don't want to keep an index variable, but happen to know that your array is not sparse, then you can use the number of elements to calculate the offset (not recommended):

    # Bash/ksh
    # This will FAIL if the array has holes (is sparse).
    arr[${#arr[@]}]="new item"
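
To make the failure concrete, here is a small sketch with an invented four-element array that has one hole. The element count is then lower than the highest index plus one, so the "append" lands on an existing element:

```bash
# Bash
arr=(a b c d)
unset -v 'arr[1]'             # indices are now 0 2 3, and ${#arr[@]} is 3

arr[${#arr[@]}]='new item'    # writes to index 3, overwriting "d"

declare -p arr
echo "${!arr[@]}"             # prints 0 2 3
```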

If you don't know whether your array is sparse or not, but don't mind re-indexing the entire array (very inefficient), then you can use:

    # Bash
    arr=("${arr[@]}" "new item")

    # Ksh
    set -A arr -- "${arr[@]}" "new item"

If you're in bash 3.1 or higher, then you can use the += operator:

    # Bash 3.1, ksh93, mksh, zsh
    arr+=(item 'another item')

NOTE: the parentheses are required, just as when assigning to an array. Without them, you end up appending to ${arr[0]}, for which $arr is a synonym. If your shell supports this type of appending, it is the preferred method.
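
The difference between the two forms shows up clearly on a sparse array. A minimal sketch in bash, with invented values:

```bash
# Bash 3.1 or higher
arr=([0]=a [7]=b)

arr+=(c)          # array append: new element at index 8, one past the highest index
arr+=d            # string append: changes arr[0] from "a" to "ad"

echo "${!arr[@]}"    # prints 0 7 8
declare -p arr
```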

For examples of using arrays to hold complex shell commands, see FAQ #50 and FAQ #40.

3. Retrieving values from an array

${#arr[@]} or ${#arr[*]} expand to the number of elements in an array:

    # Bash
    shopt -s nullglob
    oggs=(*.ogg)
    echo "There are ${#oggs[@]} Ogg files."

Single elements are retrieved by index:

    echo "${foo[0]} - ${bar[j+1]}"

The square brackets are a math context. Within an arithmetic context, variables, including arrays, can be referenced by name. For example, in the expansion:

    ${arr[x[3+arr[2]]]}

arr's index will be the value from the array x whose index is 3 plus the value of arr[2].
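
Worked through with concrete values (chosen only for this example), the evaluation proceeds from the inside out: arr[2] is 1, so the inner index is 3+1 = 4; x[4] is 3; and arr[3] is the final result.

```bash
# Bash
arr=(10 20 1 40 50)
x=([4]=3)

echo "${arr[x[3+arr[2]]]}"
# prints 40
```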

Using array elements en masse is one of the key features of shell arrays. In exactly the same way that "$@" is expanded for positional parameters, "${arr[@]}" is expanded to a list of words, one array element per word. For example,

    # Korn/Bash
    for x in "${arr[@]}"; do
      echo "next element is '$x'"
    done

This works even if the elements contain whitespace. You always end up with the same number of words as you have array elements.
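
A quick comparison of the quoted and unquoted forms, using a hypothetical helper function named count that simply reports how many arguments it received:

```bash
# Bash
count() { echo "$#"; }    # hypothetical helper: prints its argument count

arr=('one word' 'two  spaces' '')

n_quoted=$(count "${arr[@]}")    # 3: one word per element, empty element included
n_unquoted=$(count ${arr[@]})    # 4: elements are split, the empty one disappears
n_star=$(count "${arr[*]}")      # 1: everything joined into a single word

echo "$n_quoted $n_unquoted $n_star"
# prints 3 4 1
```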

If one simply wants to dump the full array, one element per line, this is the simplest approach:

    # Bash/ksh
    printf "%s\n" "${arr[@]}"

For slightly more complex array-dumping, "${arr[*]}" will cause the elements to be concatenated together, with the first character of IFS (or a space if IFS isn't set) between them. As it happens, "$*" is expanded the same way for positional parameters.

    # Bash
    arr=(x y z)
    IFS=/; echo "${arr[*]}"; unset -v IFS
    # prints x/y/z
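
One way to avoid changing IFS for the rest of the script is to confine the change to a function. This is only a sketch: join_by is a hypothetical helper name, not a builtin, and it relies on "$*" joining the positional parameters with the first character of IFS in the same way.

```bash
# Bash
join_by() {
    local IFS=$1    # only the first character is used as the separator
    shift
    printf '%s\n' "$*"
}

arr=(x y z)
join_by / "${arr[@]}"
# prints x/y/z
```

Because IFS is declared local, the caller's IFS is left exactly as it was when the function returns.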

Unfortunately, you can't put multiple characters in between array elements using that syntax. You would have to do something like this instead:

    # Bash/ksh
    arr=(x y z)
    tmp=$(printf "%s<=>" "${arr[@]}")
    echo "${tmp%<=>}"    # Remove the extra <=> from the end.
    # prints x<=>y<=>z

Or use array slicing, described in the next section:

    # Bash/ksh
    typeset -a a=([0]=x [5]=y [10]=z)
    printf '%s<=>' "${a[@]::${#a[@]}-1}"
    printf '%s\n' "${a[@]:(-1)}"

This also shows how sparse arrays can be assigned multiple elements at once. Note that the arr=([key]=value ...) notation differs between shells. In ksh93, this syntax gives you an associative array by default unless you specify otherwise, and using it requires that every value be explicitly given an index, unlike bash, where an element without an explicit index is stored at one greater than the previous index. This example was written in a way that's compatible between the two.

Bash 3.0 added the ability to retrieve the list of index values in an array:

    # Bash 3.0 or higher
    arr=(0 1 2 3) arr[42]='what was the question?'
    unset -v 'arr[2]'
    echo "${!arr[@]}"
    # prints 0 1 3 42

Retrieving the indices is extremely important for certain kinds of tasks, such as maintaining parallel arrays with the same indices (a cheap way to mimic having an array of structs in a language with no struct):

    # Bash 3.0 or higher
    unset -v file title artist i
    for f in ./*.mp3; do
      file[i]=$f
      title[i]=$(mp3info -p %t "$f")
      artist[i++]=$(mp3info -p %a "$f")
    done

    # Later, iterate over every song.
    # This works even if the arrays are sparse, just so long as they all have
    # the SAME holes.
    for i in "${!file[@]}"; do
      echo "${file[i]} is ${title[i]} by ${artist[i]}"
    done

3.1. Retrieving with modifications

Bash's Parameter Expansions may be performed on array elements en masse:

    # Bash
    arr=(abc def ghi jkl)
    echo "${arr[@]#?}"          # prints bc ef hi kl
    echo "${arr[@]/[aeiou]/}"   # prints bc df gh jkl

Parameter Expansion can also be used to extract sub-lists of elements from an array. Some people call this slicing:

    # Bash
    echo "${arr[@]:1:3}"        # three elements starting at #1 (second element)
    echo "${arr[@]:(-2)}"       # last two elements

The same goes for positional parameters:

    set -- foo bar baz
    echo "${@:(-1)}"            # last positional parameter: baz
    echo "${@:(-2):1}"          # second-to-last positional parameter: bar

4. Using @ as a pseudo-array

As we see above, the @ array (the array of positional parameters) can be used almost like a regularly named array. This is the only array available for use in POSIX or Bourne shells. It has certain limitations: you cannot individually set or unset single elements, and it cannot be sparse. Nevertheless, it still makes certain POSIX shell tasks possible that would otherwise require external tools:

    # POSIX
    set -- *.mp3
    if [ -e "$1" ] || [ -L "$1" ]; then
      echo "there are $# MP3 files"
    else
      echo "there are 0 MP3 files"
    fi

    # POSIX
    ...
    # Add an option to our dynamically generated list of options
    set -- "$@" -f "$somefile"
    ...
    foocommand "$@"

(Compare to FAQ #50's dynamically generated commands using named arrays.)
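
Although single elements cannot be unset, the whole list can be rebuilt in one pass. The following sketch drops empty arguments using only POSIX features; the sample values and the variable names count, first and second are invented for the demonstration. It works because the word list of `for arg do` is fixed when the loop starts, so each iteration can shift the oldest parameter off the front and append the ones worth keeping to the end:

```sh
# POSIX
set -- one '' 'two words' ''

for arg do
    shift                      # drop the parameter we are currently looking at
    if [ -n "$arg" ]; then
        set -- "$@" "$arg"     # keep it by appending it to the end
    fi
done

count=$# first=$1 second=$2
echo "$count"
# prints 2
```

After the loop, the positional parameters are "one" and "two words", in their original order.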

See Also



BashFAQ/005 (last edited 2023-03-25 22:39:06 by emanuele6)