Arrays
As mentioned earlier, BASH provides three types of parameters: Strings, Integers and Arrays.
Strings are without a doubt the most used parameter type. But they are also the most misused parameter type. It is important to remember that a string holds just one element. Capturing the output of a command, for instance, and putting it in a string parameter means that parameter holds just one string of characters, regardless of whether that string represents twenty filenames, twenty numbers or twenty names of people.
And as is always the case when you put multiple items in a single string, these multiple items must be somehow delimited from each other. We, as humans, can usually decipher what the different filenames are when looking at a string. We assume that, perhaps, each line in the string represents a filename, or each word represents a filename. While this assumption is understandable, it is also inherently flawed. Each single filename can contain every character you might want to use to separate the filenames from each other in a string. That means there's technically no telling where the first filename in the string ends, because there's no character that can say: "I denote the end of this filename" because that character itself could be part of the filename.
Often, people make this mistake:
# This does NOT work in the general case
$ files=$(ls); cp $files /backups/
This attempt at backing up our files in the current directory is flawed. We put the output of ls in a string called files and then use the unquoted $files parameter expansion to cut that string into arguments (relying on Word Splitting). As mentioned before, argument and word splitting cuts a string into pieces wherever there is whitespace. Relying on it means we assume that none of our filenames will contain any whitespace. If they do, the filename will be cut in half or more. Conclusion: bad.
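To make the problem concrete, here is a minimal sketch of what word splitting does to such a string. The `count_args` helper and the two filenames are made up for illustration; they are not part of the original example.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# One string that we imagine holds two filenames, separated by a newline.
files=$'holiday photo.jpg\nnotes.txt'

# Unquoted expansion: word splitting cuts at every space and newline.
split_count=$(count_args $files)

# Quoted expansion: the whole string stays one single argument.
quoted_count=$(count_args "$files")

echo "unquoted: $split_count arguments, quoted: $quoted_count argument"
```

Neither form yields the two filenames we had in mind: unquoted we get three pieces, quoted we get one big string.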
The only safe way to represent multiple string elements in bash is through the use of arrays. Arrays are types that map integers to strings. That basically means that they hold a numbered list of strings. Since each of these strings is a separate entity (element), it can safely contain any character.
For the best results and the least headaches, remember that if you have a list of things, you should always put it in an array.
Unlike some other programming languages, bash does not offer lists, tuples, and so on. It has just arrays and associative arrays (the latter are new in bash 4).
Array: An array is a numbered list of strings: It maps integers to strings.
Creating Arrays
There are several ways you can create or fill your array with data. There is no one single true way: the method you'll need depends on where your data comes from and what it is.
The easiest way to create a simple array with data is by using the =() syntax:
$ names=("Bob" "Peter" "$USER" "Big Bad John")
This syntax is great for creating arrays with static data or a known set of string parameters, but it gives us very little flexibility for adding lots of array elements. If you need more flexibility, you can also specify explicit indexes:
$ names=([0]="Bob" [1]="Peter" [20]="$USER" [21]="Big Bad John")
# or...
$ names[0]="Bob"
Notice that there is a gap between indices 1 and 20 in this example. An array with holes in it is called a sparse array. Bash allows this, and it can often be quite useful.
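As a small sketch of what a sparse array looks like from the inside, the following counts the elements and lists the indices in use. It uses a fixed name, `alice`, in place of `$USER` so the result is predictable; that substitution is an assumption of this example.

```shell
#!/usr/bin/env bash
names=([0]="Bob" [1]="Peter" [20]="alice" [21]="Big Bad John")

count=${#names[@]}        # number of elements that are actually set
indices="${!names[*]}"    # the indices in use, joined by spaces

echo "count: $count"
echo "indices: $indices"
```

The count is 4, not 22: the gap between 1 and 20 holds no elements at all.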
If you want to fill an array with filenames, then you'll probably want to use Globs in there:
$ photos=(~/"My Photos"/*.jpg)
Notice here that we quoted the My Photos part because it contains a space. If we hadn't quoted it, bash would have split it up into photos=('~/My' 'Photos/'*.jpg ) which is obviously not what we want. Also notice that we quoted only the part that contained the space. That's because we cannot quote the ~ or the *; if we do, they'll become literal and bash won't treat them as special characters anymore.
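The effect of quoting the glob can be checked with a quick experiment. This sketch works in a throwaway directory created with `mktemp -d` rather than in `~`, and the file names `a.jpg` and `b.jpg` are invented for the example.

```shell
#!/usr/bin/env bash
# Work in a temporary directory so no real files are touched.
dir=$(mktemp -d) || exit 1
mkdir "$dir/My Photos"
touch "$dir/My Photos/a.jpg" "$dir/My Photos/b.jpg"

photos=("$dir/My Photos"/*.jpg)    # glob left unquoted: expands to both files
literal=("$dir/My Photos/*.jpg")   # glob quoted too: stays a literal asterisk

photo_count=${#photos[@]}
literal_count=${#literal[@]}
echo "unquoted glob: $photo_count elements"
echo "quoted glob:   $literal_count element (${literal[0]##*/})"

rm -rf -- "$dir"
```

With the asterisk inside the quotes, the array holds a single element whose name literally ends in `*.jpg`.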
Creating arrays with a bunch of filenames becomes really easy like this. So remember to never use ls:
$ files=$(ls)     # BAD, BAD, BAD!
$ files=($(ls))   # STILL BAD!
$ files=(*)       # Good!
The first would create a string with the output of ls. That string cannot possibly be used safely for reasons mentioned in the Arrays introduction. The second is closer, but it still splits up filenames with whitespace. The third statement gives us an array where each filename is a separate element. Perfect.
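A rough way to see the difference is to count the elements each approach produces. The sketch below assumes a scratch directory with two invented files, one of which has a space in its name.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so the example does not touch real files.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
touch "holiday photo.jpg" "notes.txt"

from_ls=($(ls))   # STILL BAD: word splitting cuts "holiday photo.jpg" in two
from_glob=(*)     # Good: one element per filename

ls_count=${#from_ls[@]}
glob_count=${#from_glob[@]}
echo "from ls: $ls_count elements, from glob: $glob_count elements"

cd / && rm -rf -- "$dir"
```

Two files go in, but the `ls` version reports three elements; only the glob gets it right.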
Now, sometimes we want to build an array from a string or the output of a command. Commands (generally) just output strings: for instance, running a find command will enumerate filenames, and separate these filenames with newlines (putting each filename on a separate line). So to parse that one big string into an array we need to tell bash what character delimits the parts of the string that we want to put in separate array elements. (Note, this is a bad example, because filenames can contain a newline, so it is not safe to delimit them with newlines! But see below.)
Breaking up a string is what IFS is used for:
$ IFS=. read -a ip_elements <<< "127.0.0.1"
Here we use IFS with the value . to cut the given ip address into array elements wherever there's a ., resulting in an array with the elements 127, 0, 0 and 1.
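To verify the result, we can print each element with its index. This sketch adds `-r` to `read`, which is not in the example above but is harmless here and generally a good habit.

```shell
#!/usr/bin/env bash
# IFS is set only for the duration of this one read command.
IFS=. read -r -a ip_elements <<< "127.0.0.1"

echo "number of elements: ${#ip_elements[@]}"
for i in "${!ip_elements[@]}"; do
    echo "ip_elements[$i] = ${ip_elements[i]}"
done
```

Because the assignment is a prefix to `read`, the shell's own `IFS` keeps its previous value afterwards.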
Now, is there any way to get a list of filenames from an external program (like find) into a bash array? The answer to this problem is NULL bytes. The main difference between strings and the output of commands is that the latter is a stream, not a string. Streams are like strings with two big differences: they are read sequentially (you can't jump around), and they can contain NULL bytes. A NULL byte is a byte which is just all zeros: 00000000. The reason that they can't be used in strings is an artifact of the C programming language: NULL bytes are used in C to mark the end of a string. Since bash is written in C and uses C's native strings, it inherits that behavior.
Streams can contain NULL bytes, and we will use them to delimit our data. Filenames cannot contain NULL bytes (since they're implemented as C strings by Unix), and neither can the vast majority of things we would want to store in a program (people's names, IP addresses, etc.). That makes NULL a great candidate for separating elements in a stream. Quite often, the command you want to read the output of has an option that makes it output its data separated by NULL bytes rather than newlines or something else. find (on GNU and BSD, anyway) has the option -print0, which we'll use in this example:
files=()
while read -r -d $'\0'; do
    files+=("$REPLY")
done < <(find /foo -print0)
This is a safe way of parsing a command's output into strings. Understandably, it looks a little confusing and convoluted at first. So let's take it apart:
The first line files=() creates an empty array named files.
We're using a while loop that runs a read command each time. The read command uses the -d $'\0' option, which means that instead of reading a line (up to a newline), we're reading up to a NULL byte (\0). It also uses -r to prevent it from treating backslashes specially.
Once read has read some data and encountered a NULL byte, the while loop's body is executed. We put what we read (which is in the parameter REPLY) into our array.
To do this, we use the +=() syntax. This syntax adds one or more element(s) to the end of our array.
Finally, the < <(..) syntax is a combination of File Redirection (<) and Process Substitution (<(..)) which is used to redirect the output of the find command into our while loop.
The find command itself uses the -print0 option as mentioned before to tell it to separate the filenames it finds with a NULL byte.
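The loop described above can be tried out on awkward filenames. This is only a sketch: it assumes a `find` that supports `-print0`, replaces `/foo` with a temporary directory, adds `-type f` so the directory itself is not listed, and the three file names are invented.

```shell
#!/usr/bin/env bash
# Create a scratch directory with a plain name, a name containing a space,
# and a name containing a newline.
dir=$(mktemp -d) || exit 1
touch "$dir/plain.txt" "$dir/with space.txt" "$dir/"$'with\nnewline.txt'

files=()
while read -r -d $'\0'; do
    files+=("$REPLY")     # append exactly what was read, as one element
done < <(find "$dir" -type f -print0)

file_count=${#files[@]}
echo "found $file_count files"

rm -rf -- "$dir"
```

All three files end up as one element each, including the one whose name contains a newline.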
Good Practice:
Arrays are a safe list of strings. They are perfect for representing multiple filenames.
Make sure to always use the NULL-byte method of parsing command output whenever you're parsing output that can contain any string character within an element.
In The Manual: Arrays
In the FAQ:
How can I use array variables?
How can I use variable variables (indirect variables, pointers, references) or associative arrays?
Using Arrays
Walking over array elements is really easy. Because an array is such a safe medium of storage, we can simply use a for loop to iterate over its elements:
$ for file in "${myfiles[@]}"; do
>     cp "$file" /backups/
> done
Notice the syntax used to expand the array here. We use the quoted form: "${arrayname[@]}". That causes bash to replace it with every single element in the array, properly quoted. For instance, these are identical:
$ names=("Bob" "Peter" "$USER" "Big Bad John")
$ for name in "${names[@]}"; do :; done
$ for name in "Bob" "Peter" "$USER" "Big Bad John"; do :; done
Remember to quote the ${arrayname[@]} properly. If you don't, you lose all benefit of using an array: you're telling bash it's OK to word-split your array elements to pieces and break everything again.
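A short sketch shows what is lost when the quotes are left off. The `count_args` helper is hypothetical and simply reports how many arguments it was given.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

names=("Bob" "Peter" "Big Bad John")

quoted=$(count_args "${names[@]}")    # one argument per element
unquoted=$(count_args ${names[@]})    # "Big Bad John" is split into three words

echo "quoted: $quoted arguments, unquoted: $unquoted arguments"
```

The array has three elements, yet the unquoted expansion hands five arguments to the command.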
Another use of "${arrayname[@]}" is simplifying the above loop, for instance:
cp -- "${myfiles[@]}" /backups/
This runs the cp command, replaces the "${myfiles[@]}" part by each filename in the myfiles array, properly quoted, causing cp to safely copy them to your backups. (We use -- to tell cp that there are no options after that point. This is just in case one of our filenames begins with a - character, which could confuse cp into thinking it is an option.)
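The following sketch exercises that `cp --` call with a filename that starts with a dash. The directories are temporary ones created with `mktemp -d` instead of `/backups/`, and the file names `-v` and `report.txt` are made up.

```shell
#!/usr/bin/env bash
src=$(mktemp -d) || exit 1
dest=$(mktemp -d) || exit 1
cd "$src" || exit 1

# Create a file literally named "-v"; the -- keeps touch from reading it as an option.
touch -- "-v" "report.txt"

myfiles=(*)
cp -- "${myfiles[@]}" "$dest"/    # without --, cp would take "-v" as an option

copied=("$dest"/*)
copied_count=${#copied[@]}
echo "copied $copied_count files"

cd / && rm -rf -- "$src" "$dest"
```

Both files arrive in the destination. Without the `--`, the file named `-v` would have been read as an option and silently left behind.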
You can also expand single array elements by referencing their element number (called index). Remember, though, that arrays are zero-based, which means that their first element has the index zero:
$ echo "The first name is: ${names[0]}"
$ echo "The second name is: ${names[1]}"
There is also a second form of expanding all array elements, which is "${arrayname[*]}". This form is ONLY useful for converting arrays back into a single string. The main purpose for this is outputting the array to humans:
$ names=("Bob" "Peter" "$USER" "Big Bad John")
$ echo "Today's contestants are: ${names[*]}"
Today's contestants are: Bob Peter lhunath Big Bad John
Remember to still keep everything nicely quoted! If you don't keep ${arrayname[*]} quoted, once again bash's Wordsplitting will cut it into bits.
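One way to see what each form expands to is to count the arguments a command actually receives. In this sketch, count_args is a hypothetical helper written just for the demonstration, and IFS is assumed to have its default value:

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper that prints how many arguments it received.
count_args() { echo "$#"; }

names=("Bob" "Peter" "Big Bad John")

at_quoted=$(count_args "${names[@]}")    # one argument per element
star_quoted=$(count_args "${names[*]}")  # a single merged string
star_unquoted=$(count_args ${names[*]})  # deliberately WRONG: wordsplit into pieces

echo "@ quoted: $at_quoted, * quoted: $star_quoted, * unquoted: $star_unquoted"
```

The quoted @ form passes three arguments and the quoted * form passes one. The unquoted * form passes five, since "Big Bad John" falls apart into three words.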
You can combine IFS with "${arrayname[*]}" to indicate the character to use to delimit your array elements as you merge them into a single string. This is handy, for example, when you want to comma delimit names:
$ names=("Bob" "Peter" "$USER" "Big Bad John")
$ ( IFS=,; echo "Today's contestants are: ${names[*]}" )
Today's contestants are: Bob,Peter,lhunath,Big Bad John
Notice how in this example we put the IFS=,; echo ... statement in a Subshell by wrapping ( and ) around it. We do this because we don't want to change the value of IFS in the main shell. The assignment only affects the subshell, so once the subshell exits, IFS in the main shell still has its default value rather than just a comma. This is important because IFS is used for a lot of things, and changing its value to something non-default will result in very odd behavior if you don't expect it!
Alas, the "${array[*]}" expansion only uses the first character of IFS to join the elements together. If we wanted to separate the names in the previous example with a comma and a space, we would have to use some other technique (for example, a for loop).
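A minimal sketch of the for loop approach mentioned above might look like this. The variable name joined is made up for the example, and the sketch assumes the first element is not an empty string (otherwise the separator logic would need adjusting):

```shell
#!/usr/bin/env bash
names=("Bob" "Peter" "Big Bad John")

joined=""
for name in "${names[@]}"; do
    # ${joined:+, } expands to ", " only once joined already holds something,
    # so no separator is placed before the first name.
    joined+="${joined:+, }$name"
done

echo "Today's contestants are: $joined"
```

Because IFS is never touched here, no subshell is needed to protect it.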
Good Practice:
Always quote your array expansions properly, just like you would your normal parameter expansions.
Use "${myarray[@]}" to expand all your array elements and ONLY use "${myarray[*]}" when you want to merge all your array elements into a single string.
Associative Arrays
Until recently, BASH could only use numbers (more specifically, non-negative integers) as keys of arrays. This means you could not "map" or "translate" one string to another. This is something a lot of people missed. People began to (ab)use variable indirection as a means to address the issue.
Since BASH 4 was released, there is no longer any excuse to use indirection (or worse, eval) for this purpose. You can now use full-featured associative arrays.
To create an associative array, you need to declare it as such (using declare -A). This is to guarantee backward compatibility with the standard indexed arrays. Here's how you do that:
$ declare -A fullNames
$ fullNames=( ["lhunath"]="Maarten Billemont" ["greycat"]="Greg Wooledge" )
$ echo "Current user is: $USER. Full name: ${fullNames[$USER]}."
Current user is: lhunath. Full name: Maarten Billemont.
With the same syntax as for indexed arrays, you can iterate over the keys of associative arrays:
$ for user in "${!fullNames[@]}"
> do echo "User: $user, full name: ${fullNames[$user]}."; done
User: lhunath, full name: Maarten Billemont.
User: greycat, full name: Greg Wooledge.
Two things to remember, here: First, the order of the keys you get back from an associative array using the ${!array[@]} syntax is unpredictable; it won't necessarily be the order in which you assigned elements, or any kind of sorted order.
Second, you cannot omit the $ if you're using a parameter as the key of an associative array. With standard indexed arrays, the [...] part is actually an arithmetic context (really, you can do math there without an explicit $((...)) markup). In an arithmetic context, a Name can't possibly be a valid number, and so BASH assumes it's a parameter and that you want to use its content. This doesn't work with associative arrays, since a Name could just as well be a valid associative array key.
Let's demonstrate with examples:
$ indexedArray=( "one" "two" )
$ declare -A associativeArray=( ["foo"]="bar" ["alpha"]="omega" )
$ index=0 key="foo"
$ echo "${indexedArray[$index]}"
one
$ echo "${indexedArray[index]}"
one
$ echo "${indexedArray[index + 1]}"
two
$ echo "${associativeArray[$key]}"
bar
$ echo "${associativeArray[key]}"

$ echo "${associativeArray[key + 1]}"

As you can see, both $index and index work fine with indexed arrays. They both evaluate to 0. You can even do math on it to increase it to 1 and get the second value. No go with associative arrays, though. Here, we need to use $key; the others are taken as the literal keys "key" and "key + 1", which don't exist in the array, so they expand to nothing.
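The same point can be shown from the other direction. In this sketch (which needs BASH 4 or later), the array deliberately contains an element stored under the literal key "key". That element and the variable names withDollar and withoutDollar are made up for the illustration:

```shell
#!/usr/bin/env bash
declare -A associativeArray=( ["foo"]="bar" ["key"]="stored under the literal name key" )
key="foo"

withDollar=${associativeArray[$key]}    # looks up the key "foo"
withoutDollar=${associativeArray[key]}  # looks up the key "key", not "foo"

echo "With \$:    $withDollar"
echo "Without \$: $withoutDollar"
```

Omitting the $ therefore does not always give an empty result. If the bare name happens to exist as a key, you get that element's value, which can be harder to notice than an empty string.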