Differences between revisions 3 and 15 (spanning 12 versions)
Revision 3 as of 2007-11-08 14:52:05
Size: 4066
Editor: GreyCat
Comment: remove all eval examples. give better examples.
Revision 15 as of 2010-04-20 20:54:03
Size: 7723
Editor: GreyCat
Comment: fix reads without -r, put IFS in appropriate places, mention limitations of each approach
Line 1: Line 1:
[[Anchor(faq6)]] <<Anchor(faq6)>>
Line 3: Line 3:
Before starting to use dynamically created variables, think again of a simpler approach. If it still seems to be the best thing to do, have a look at the following disadvantages: There are two halves to this: evaluating variables, and assigning values. We'll take each half separately:
Line 5: Line 5:
 1. It's hard to read and to maintain.
 1. The variable names must match the regular expression {{{^[a-zA-Z_][a-zA-Z_0-9]*}}} -- i.e., a variable name cannot contain arbitrary characters but only letters, digits, and underscores. We cannot have a variable's name contain Unix usernames; consider a user named {{{hong-hu}}}. A dash '-' cannot be a valid part of a variable name.
 1. Quoting is hard to get right. If content strings (not variable name) can contain whitespace characters and quotes, it's hard to quote it right to preserve it.
 1. If the program handles unsanitized user input, it can be [#faq48 VERY dangerous]!
=== Evaluating indirect/reference variables ===
[[BASH]] allows you to expand a parameter ''indirectly'' -- that is, one variable may contain the name of another variable:
 {{{
 # Bash
 realvariable=contents
 ref=realvariable
 echo "${!ref}" # prints the contents of the real variable
 }}}
Line 10: Line 14:
Bash (but not Korn shell, POSIX or Bourne shell) allows you to expand a parameter ''indirectly'' -- that is, one variable may contain the name of another variable:
  {{{
  realvariable=contents
  ref=realvariable
  echo "${!ref}" # prints the contents of the real variable}}}
KornShell (ksh93) has a completely different, more powerful syntax -- the `nameref` command (also known as `typeset -n`):
 {{{
 # ksh93
 realvariable=contents
 nameref ref=realvariable
 echo "$ref" # prints the contents of the real variable
 }}}
Line 16: Line 22:
This works for evaluating, but not for assigning a value. In order to assign a value "through" a reference (or pointer, or indirect variable, or whatever you want to call it -- I'm going to use "ref" from now on), you have to resort to tricks. ksh93's `nameref` allows us to work with references to [[BashFAQ/005|arrays]], as well as regular scalar variables. For example,
 {{{
 # ksh93
 myfunc() {
   nameref ref=$1
   echo "array $1 has ${#ref[*]} elements"
 }
 realarray=(...)
 myfunc realarray
 }}}
Line 18: Line 33:
One such trick is to use {{{read}}} and Bash's ''here string'' syntax:
  {{{
  ref=realvariable
  read $ref <<< "contents"
  # realvariable now contains the string "contents"}}}
We are not aware of any trick that can duplicate that functionality in Bash, POSIX or Bourne shells (short of using [[BashFAQ/048|eval]], which is extremely difficult to do securely).
Line 24: Line 35:
This works equally well with Bash array variables too:
  {{{
  aref=realarray
  read -a $aref <<< "words go into array elements"
  echo "${realarray[1]}" # prints "go"}}}
Unfortunately, for shells other than Bash and ksh93, there is no syntax for ''evaluating'' a referenced variable. You would have to use [[BashFAQ/048|eval]], which means you would have to undergo extreme measures to sanitize your data to avoid catastrophe.
Line 30: Line 37:
Another is to use Bash's {{{printf -v}}} (only available in [#faq61 recent versions]):
  {{{
  ref=realvariable
  printf -v $ref "contents"}}}
=== Assigning indirect/reference variables ===
Assigning a value "through" a reference (or pointer, or indirect variable, or whatever you want to call it -- I'm going to use "ref" from now on) is more widely possible, but the means of doing so are extremely shell-specific.
Line 35: Line 40:
The {{{printf -v}}} trick is handy if your contents aren't a constant string, but rather, something dynamically generated. You can use all of {{{printf}}}'s formatting capabilities. In ksh93, we can just use `nameref` again:
 {{{
 # ksh93
 nameref ref=realvariable
 ref="contents"
 # realvariable now contains the string "contents"
 }}}
Line 37: Line 48:
Yet another is Korn shell's {{{typeset}}} or Bash's {{{declare}}}. These are roughly equivalent to each other. Both of them cause a variable to become ''locally scoped'' to a function, if used inside a function; but if used outside a function, they can substitute for {{{read}}} in this case: In Bash, we can use {{{read}}} and Bash's ''here string'' syntax:
 {{{
 # Bash
 ref=realvariable
 IFS= read -r $ref <<< "contents"
 # realvariable now contains the string "contents"
 }}}
However, this only works if there are no newlines in the content. If you need to assign multiline values, keep reading.
Line 39: Line 57:
  {{{
  # Korn shell:
  typeset $ref="contents"
A similar trick works for Bash array variables too:
 {{{
 # Bash
 aref=realarray
 read -r -a $aref <<< "words go into array elements"
 echo "${realarray[1]}" # prints "go"
 }}}
(Again, newlines in the input will break this trick. [[IFS]] is used to delimit words, so you may or may not need to set that.)
Line 43: Line 66:
  # Bash:
  declare $ref="contents"}}}
Another trick is to use Bash's {{{printf -v}}} (only available in [[BashFAQ/061|recent versions]]):
 {{{
 # Bash 3.1 or higher
 ref=realvariable
 printf -v $ref %s "contents"
 }}}
Line 46: Line 73:
If you aren't using Bash or Korn shell, you can still do assignments to referenced variables using ''here document'' syntax:
  {{{
  # Portable code.
  ref=realvariable
  read $ref <<EOF
  contents
  EOF}}}
The {{{printf -v}}} trick is handy if your contents aren't a constant string, but rather, something dynamically generated. You can use all of {{{printf}}}'s formatting capabilities. This trick also permits any string content, including embedded newlines (but not NUL bytes - no force in the universe can put NUL bytes into shell strings usefully). This is the best trick to use if you're in bash 3.1 or higher.

Yet another trick is Korn shell's {{{typeset}}} or Bash's {{{declare}}}. These are roughly equivalent to each other. Both of them cause a variable to become ''locally scoped'' to a function, if used inside a function; but if used outside a function, they can operate on global variables.

 {{{
 # Korn shell (all versions):
 typeset $ref="contents"

 # Bash:
 declare $ref="contents"
 }}}

The advantage of using `typeset` or `declare` over `eval` is that the right hand side of the assignment is ''not'' parsed by the shell. If you used `eval` here, you would have to sanitize/escape the entire right hand side first. This trick also preserves the contents exactly, including newlines, so this is the best trick to use if you're in bash older than 3.1 (or ksh88) and don't need to worry about accidentally changing your variable's scope (i.e., you're not using it inside a function).

If you aren't using Bash or Korn shell, you can do assignments to referenced variables using ''here document'' syntax:
 {{{
 # Bourne
 ref=realvariable
 read $ref <<EOF
 contents
 EOF
 }}}
(Alas, `read` means we're back to only getting at most one line of content. This is the most portable trick, but it's limited to single-line content.)
Line 56: Line 99:
Unfortunately, for shells other than Bash, there is no syntax for ''evaluating'' a referenced variable. You would have to use [#faq48 eval], which means you would have to undergo extreme measures to sanitize your data to avoid catastrophe.

Sometimes it's convenient to have associative arrays, arrays indexed by a string. Awk has associative arrays. Perl calls them "hashes", while Tcl simply calls them "arrays". KornShell93 supports this kind of array:
Finally, some people just ''cannot'' resist throwing `eval` into the picture:
Line 61: Line 102:
 # KornShell93 script - does not work with BASH
 typeset -A homedir # Declare KornShell93 associative array
 # Bourne
 ref=myVar
 eval "$ref=\$value"
 }}}

This expands to the statement that is executed:

 {{{
 myVar=$value
 }}}

The right-hand side is not parsed by the shell, so there is no danger of unwanted side effects. The drawback, here, is that every single shell metacharacter on the right hand side of the `=` must be escaped carefully. In the example shown here, there was only one. In a more complex situation, there could be dozens.

The good news is that if you can sanitize the right hand side correctly, this trick is fully portable, has no variable scope issues, and allows all content including newlines. The bad news is that if you fail to sanitize the right hand side correctly, you have a massive security hole. Use `eval` at your own risk.

=== Associative Arrays ===
Sometimes it's convenient to have associative arrays, arrays indexed by a string. Awk has associative arrays. Perl calls them "hashes", while Tcl simply calls them "arrays". [[KornShell|ksh93]] supports this kind of array:

 {{{
 # ksh93
 typeset -A homedir # Declare ksh93 associative array
Line 70: Line 130:
 done}}}  done
 
}}}
Line 72: Line 133:
BASH (including version 3.x) does not support them, unfortunately. Either use [#faq48 eval] after sanitizing your data, or switch to awk, perl, ksh93, tcl, etc. BASH version 4.0 finally supports them, though older versions do not.

 {{{
 # bash 4.0
 declare -A homedir
 homedir[jim]=/home/jim
 ... (same as the ksh93 example, other than declare vs. typeset)
 }}}

If you can't use ksh93 or bash 4.0, consider switching to awk, perl, ksh93, tcl, etc. if you need this type of data structure to solve your problem.

Before you think of using `eval` to mimic this behavior in an older shell (probably by creating a set of variable names like `homedir_alex`), try to think of a simpler approach that you could use instead. If this hack still seems to be the best thing to do, have a look at the following disadvantages:

 1. It's hard to read and to maintain.
 1. The variable names must match the RegularExpression {{{^[a-zA-Z_][a-zA-Z_0-9]*}}} -- i.e., a variable name cannot contain arbitrary characters but only letters, digits, and underscores. We cannot have a variable's name contain Unix usernames, for instance -- consider a user named {{{hong-hu}}}. A dash '-' cannot be part of a variable name, so the entire attempt to make a variable named `homedir_hong-hu` is doomed from the start.
 1. Quoting is hard to get right. If content strings (not variable names) can contain whitespace characters and quotes, it's hard to quote it right to preserve it through both shell parsings. And that's just for ''constants'', known at the time you write the program.
 1. If the program handles unsanitized user input, it can be [[BashFAQ/048|VERY dangerous]]!

----
CategoryShell

How can I use variable variables (indirect variables, pointers, references) or associative arrays?

There are two halves to this: evaluating variables, and assigning values. We'll take each half separately:

Evaluating indirect/reference variables

BASH allows you to expand a parameter indirectly -- that is, one variable may contain the name of another variable:

  •  # Bash
     realvariable=contents
     ref=realvariable
     echo "${!ref}"   # prints the contents of the real variable
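
As a small illustration of the indirect expansion above, the sketch below prints several variables given only their names. The helper name show_vars and the variables editor and pager are hypothetical, chosen only for this example.

```bash
# Bash
# show_vars is a hypothetical helper: it prints "name=value" for each
# variable name passed as an argument, using ${!name} to read the value.
show_vars() {
    local name
    for name in "$@"; do
        printf '%s=%s\n' "$name" "${!name}"
    done
}

editor=vi
pager=less
show_vars editor pager
```

Note that the loop variable is itself called name, so this sketch should not be asked to print a variable called name.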

KornShell (ksh93) has a completely different, more powerful syntax -- the nameref command (also known as typeset -n):

  •  # ksh93
     realvariable=contents
     nameref ref=realvariable
     echo "$ref"      # prints the contents of the real variable

ksh93's nameref allows us to work with references to arrays, as well as regular scalar variables. For example,

  •  # ksh93
     myfunc() {
       nameref ref=$1
       echo "array $1 has ${#ref[*]} elements"
     }
     realarray=(...)
     myfunc realarray

We are not aware of any trick that can duplicate that functionality in Bash, POSIX or Bourne shells (short of using eval, which is extremely difficult to do securely).

Unfortunately, for shells other than Bash and ksh93, there is no syntax for evaluating a referenced variable. You would have to use eval, which means you would have to undergo extreme measures to sanitize your data to avoid catastrophe.
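
If eval really is the only option, one common precaution is to validate the variable name before it ever reaches eval. The following is a minimal sketch under that assumption, not a complete sanitizer; deref is a hypothetical helper name, and the character ranges assume the C/POSIX locale.

```sh
# POSIX sh
# deref NAME: print the value of the variable called NAME.
# Hypothetical helper; it refuses anything that is not a plain variable name.
deref() {
    case $1 in
        ''|[!a-zA-Z_]*|*[!a-zA-Z_0-9]*)
            echo "deref: invalid variable name: $1" >&2
            return 1 ;;
    esac
    # After validation, eval sees: printf '%s\n' "${NAME}"
    eval "printf '%s\n' \"\${$1}\""
}

realvariable='contents with spaces'
deref realvariable
```

Only the name is checked here. The value is never passed through eval as text, so it needs no escaping.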

Assigning indirect/reference variables

Assigning a value "through" a reference (or pointer, or indirect variable, or whatever you want to call it -- I'm going to use "ref" from now on) is more widely possible, but the means of doing so are extremely shell-specific.

In ksh93, we can just use nameref again:

  •  # ksh93
     nameref ref=realvariable
     ref="contents"
     # realvariable now contains the string "contents"

In Bash, we can use read and Bash's here string syntax:

  •  # Bash
     ref=realvariable
     IFS= read -r "$ref" <<< "contents"
     # realvariable now contains the string "contents"

However, this only works if there are no newlines in the content. If you need to assign multiline values, keep reading.

A similar trick works for Bash array variables too:

  •  # Bash
     aref=realarray
     read -r -a "$aref" <<< "words go into array elements"
     echo "${realarray[1]}"   # prints "go"

(Again, newlines in the input will break this trick. IFS is used to delimit words, so you may or may not need to set that.)
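
To illustrate the remark about IFS, here is a sketch that splits on commas instead of whitespace. The array name fields and the input string are made up for the example.

```bash
# Bash
aref=fields
# IFS is set only for this one read command, so the comma is the
# word delimiter here and the shell's global IFS is left unchanged.
IFS=, read -r -a "$aref" <<< "alpha,beta gamma,delta"
echo "${#fields[@]}"    # number of elements
echo "${fields[1]}"     # second element, which contains a space
```

With the default IFS, the same input would be split at the space instead, and the commas would stay inside the elements.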

Another trick is to use Bash's printf -v (only available in recent versions):

  •  # Bash 3.1 or higher
     ref=realvariable
     printf -v "$ref" %s "contents"

The printf -v trick is handy if your contents aren't a constant string, but rather, something dynamically generated. You can use all of printf's formatting capabilities. This trick also permits any string content, including embedded newlines (but not NUL bytes - no force in the universe can put NUL bytes into shell strings usefully). This is the best trick to use if you're in bash 3.1 or higher.
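
A short sketch of both points, formatting and embedded newlines. The variable names logname_for_today and two_lines are hypothetical.

```bash
# Bash 3.1 or higher
ref=logname_for_today
# Build the value with a format string; nothing is printed to stdout.
printf -v "$ref" '%s-%03d.log' backup 7
echo "$logname_for_today"    # backup-007.log

ref=two_lines
# The \n in the format becomes a real newline inside the variable.
printf -v "$ref" '%s\n%s' "first line" "second line"
echo "$two_lines"
```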

Yet another trick is Korn shell's typeset or Bash's declare. These are roughly equivalent to each other. Both of them cause a variable to become locally scoped to a function, if used inside a function; but if used outside a function, they can operate on global variables.

  •  # Korn shell (all versions):
     typeset "$ref=contents"
    
     # Bash:
     declare "$ref=contents"

The advantage of using typeset or declare over eval is that the right hand side of the assignment is not parsed by the shell. If you used eval here, you would have to sanitize/escape the entire right hand side first. This trick also preserves the contents exactly, including newlines, so this is the best trick to use if you're in bash older than 3.1 (or ksh88) and don't need to worry about accidentally changing your variable's scope (i.e., you're not using it inside a function).
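
The scope caveat is easy to trip over, so here is a sketch of it in Bash. The function name set_in_function and the variable greeting are hypothetical.

```bash
# Bash
set_in_function() {
    # declare inside a function creates a variable local to that function.
    declare "$1=$2"
    printf 'inside: %s\n' "${!1}"
}

ref=greeting
set_in_function "$ref" "hello"
# The assignment did not survive the function call.
printf 'outside: %s\n' "${greeting-<unset>}"

# At top level, the same command assigns the global variable.
declare "$ref=hello again"
printf 'top level: %s\n' "$greeting"
```

Run as a script, this should report the value inside the function, report the variable as unset outside it, and then show the top-level assignment taking effect.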

If you aren't using Bash or Korn shell, you can do assignments to referenced variables using here document syntax:

  •  # Bourne
     ref=realvariable
     read $ref <<EOF
     contents
     EOF

(Alas, read means we're back to only getting at most one line of content. This is the most portable trick, but it's limited to single-line content.)

Remember that, when using a here document, if the sentinel word (EOF in our example) is unquoted, then parameter expansions will be performed inside the body. If the sentinel is quoted, then parameter expansions are not performed. Use whichever is more convenient for your task.
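
The difference between the two sentinel forms can be sketched as follows. This assumes a POSIX shell, where read -r is available; the variable names name, greeting and literal are hypothetical.

```sh
# POSIX sh
name=world

# Unquoted sentinel: $name is expanded inside the body.
ref=greeting
read -r "$ref" <<EOF
hello $name
EOF

# Quoted sentinel: the body is taken literally.
ref=literal
read -r "$ref" <<'EOF'
hello $name
EOF

printf '%s\n' "$greeting"   # hello world
printf '%s\n' "$literal"    # hello $name
```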

Finally, some people just cannot resist throwing eval into the picture:

  •  # Bourne
     ref=myVar
     eval "$ref=\$value"

This expands to the statement that is executed:

  •  myVar=$value

Because eval sees only the literal text $value, the contents of value are expanded but never parsed as shell code, so there is no danger of unwanted side effects from the data. The drawback here is that every single shell metacharacter on the right hand side of the = must be escaped carefully so that it survives the first round of parsing. In the example shown here, there was only one (the $). In a more complex situation, there could be dozens.

The good news is that if you can sanitize the right hand side correctly, this trick is fully portable, has no variable scope issues, and allows all content including newlines. The bad news is that if you fail to sanitize the right hand side correctly, you have a massive security hole. Use eval at your own risk.
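
One way to reduce that risk is to wrap the eval form shown above in a function that also checks the left-hand side, since the name in ref is parsed by eval too. This is only a sketch under those assumptions; assign is a hypothetical helper name, and the character ranges assume the C/POSIX locale.

```sh
# POSIX sh
# assign NAME VALUE: store VALUE in the variable called NAME.
# Hypothetical helper; it rejects anything that is not a plain variable name.
assign() {
    case $1 in
        ''|[!a-zA-Z_]*|*[!a-zA-Z_0-9]*)
            echo "assign: invalid variable name: $1" >&2
            return 1 ;;
    esac
    # eval sees NAME=$2, so the value itself is never parsed as code.
    eval "$1=\$2"
}

assign myVar 'two words; $(echo not run) "quotes"'
printf '%s\n' "$myVar"
```

The value reaches the variable unchanged because eval only ever sees the reference $2, never the data.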

Associative Arrays

Sometimes it's convenient to have associative arrays, arrays indexed by a string. Awk has associative arrays. Perl calls them "hashes", while Tcl simply calls them "arrays". ksh93 supports this kind of array:

  •  # ksh93
     typeset -A homedir             # Declare ksh93 associative array
     homedir[jim]=/home/jim
     homedir[silvia]=/home/silvia
     homedir[alex]=/home/alex
     
     for user in "${!homedir[@]}"   # Enumerate all indices (user names)
     do
         echo "Home directory of user $user is ${homedir[$user]}"
     done

BASH version 4.0 finally supports them, though older versions do not.

  •  # bash 4.0
     declare -A homedir
     homedir[jim]=/home/jim
     ... (same as the ksh93 example, other than declare vs. typeset)
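
For reference, a complete Bash version might look like the sketch below. The user names are taken from the examples on this page; everything else follows the ksh93 example.

```bash
# Bash 4.0 or higher
declare -A homedir
homedir[jim]=/home/jim
homedir[silvia]=/home/silvia
homedir[hong-hu]=/home/hong-hu   # keys are strings, so a dash is fine

# Quote the key expansion so keys containing whitespace stay intact.
# The iteration order of the keys is not specified.
for user in "${!homedir[@]}"; do
    echo "Home directory of user $user is ${homedir[$user]}"
done
```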

If you can't use ksh93 or bash 4.0, consider switching to awk, perl, tcl, etc. if you need this type of data structure to solve your problem.

Before you think of using eval to mimic this behavior in an older shell (probably by creating a set of variable names like homedir_alex), try to think of a simpler approach that you could use instead. If this hack still seems to be the best thing to do, have a look at the following disadvantages:

  1. It's hard to read and to maintain.
  2. The variable names must match the regular expression ^[a-zA-Z_][a-zA-Z_0-9]*$ -- i.e., a variable name cannot contain arbitrary characters but only letters, digits, and underscores. We cannot have a variable's name contain Unix usernames, for instance -- consider a user named hong-hu. A dash '-' cannot be part of a variable name, so the entire attempt to make a variable named homedir_hong-hu is doomed from the start.

  3. Quoting is hard to get right. If content strings (not variable names) can contain whitespace characters and quotes, it's hard to quote them correctly so that they survive both shell parsings. And that's just for constants, known at the time you write the program.

  4. If the program handles unsanitized user input, it can be VERY dangerous!
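
As one example of a simpler approach, a small lookup table can be kept as plain text and searched with a loop, with no eval and no generated variable names. This is only a sketch for small tables; the names homedirs and lookup_homedir are hypothetical, and it assumes keys contain no whitespace.

```sh
# POSIX sh
# One "user directory" pair per line.
homedirs='jim /home/jim
silvia /home/silvia
hong-hu /home/hong-hu'

# lookup_homedir USER: print the directory for USER, or return 1 if absent.
lookup_homedir() {
    while read -r user dir; do
        if [ "$user" = "$1" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done <<EOF
$homedirs
EOF
    return 1
}

lookup_homedir hong-hu
```

Because the keys are ordinary data rather than parts of variable names, a user name such as hong-hu causes no trouble here.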


CategoryShell

BashFAQ/006 (last edited 2023-04-14 06:52:11 by ormaaj)