Differences between revisions 2 and 20 (spanning 18 versions)
Revision 2 as of 2008-10-03 16:15:38
Size: 3515
Editor: GreyCat
Comment: next number in sequence
Revision 20 as of 2017-05-18 19:13:44
Size: 6073
Editor: 202
Comment: Fix syntax error
Line 1: Line 1:
## page was renamed from BashFAQ/100
## Please edit system and help pages ONLY in the moinmaster wiki! For more
## information, please see MoinMaster:MoinPagesEditorGroup.
##acl MoinPagesEditorGroup:read,write,delete,revert All:read
##master-page:HelpTemplate
##master-date:Unknown-Date
#format wiki
Line 9: Line 2:
== "Self modifying code OR using Bash with a cgi script ==
It is a well-known and widely accepted principle that self modifying code is dangerous, and should never be done.
<<Anchor(faq92)>>
== How do I write a CGI script that accepts parameters? ==
There are always circumstances beyond our control that drive us to do things that we would never choose to do on our own. This FAQ entry describes one of those situations.
Line 12: Line 6:
Many years ago, when computers had fewer resources and code had to be more compact, very bright individuals would often use self modifying code to create elegant, if somewhat bizarre, solutions to the problem of creating programs to fit in very small spaces.
A [[http://hoohoo.ncsa.uiuc.edu/cgi/|CGI]] program can be invoked with parameters, sent by the web browser (user agent). There are (at least) two ways to invoke a CGI program: the "GET" method and the "POST" method. In the "GET" method, parameters are provided to the CGI program in an environment variable called `QUERY_STRING`. The parameters take the form of KEY=VALUE definitions (e.g. `user=george`), with some characters encoded in hexadecimal, spaces encoded as plus signs, all joined together with ampersands. In the "POST" method, the parameters are provided on standard input instead.
Line 14: Line 8:
We no longer have that issue. Computers today have, in most cases, abundant resources.
Now of course we know you would never write a CGI script in Bash. So for the purposes of this entry we will assume that terrorists have kidnapped your spouse and children and will torture, maim, kill, "or worse" them if you do not comply with their demands to write such a script.
Line 16: Line 10:
However there are always circumstances beyond our control that drive us to do things that we would never choose to do on our own.
(The "or worse" situation would clearly be something like being forced to use Microsoft based software.)
Line 18: Line 12:
This FAQ entry describes one of those situations.
So, given a `QUERY_STRING` variable, we would like to extract the keys (variables) and their values, so that we can use them in the script.
Line 20: Line 14:
The problem encountered is a situation where a Web based form invokes a CGI-bin script that is written in BASH.
=== Associative Arrays ===
Line 22: Line 16:
Now of course we know you would never write a CGI-bin script in BASH. So for the purposes of this entry we will assume that terrorists have kidnapped your wife and children and will torture, maim, kill, "or worse" them if you do not comply with their demands to write such a script.
The best approach is to place the key/value pairs into an associative array. Associative arrays are available in ksh93 and in bash 4.0, but not in POSIX or Bourne shells. They are ''designed'' to hold key/value pairs where the keys can be arbitrary strings, so they seem appropriate for this job.
Line 24: Line 18:
(The or worse situation would clearly be something like being forced to use Microsoft based software)
{{{
# Bash 4+
Line 26: Line 21:
The quick and easy way to process the string of variable assignments that are passed in to a CGI script, is to use the eval command to process those assignments. However as we all know the use of eval is "STRONGLY DISCOURAGED". That is to say we always avoid using eval if there is any way around it.
# Read in the cgi input string
if [[ $QUERY_STRING ]]; then
  query=$QUERY_STRING
else
  read -r query
fi
Line 28: Line 28:
This is the old way, which is remarkably unsafe:
# Set up an associative array to hold the query parameters.
declare -A params
Line 30: Line 31:
# read in the cgi input string
read foo
# Iterate through the key=value+%41%42%43 elements.
# Separate key and value, and perform decoding on the value.
while IFS='=' read -r -d '&' key value; do
Line 33: Line 35:
#convert some of the encoded strings and things like "&" (left as an exercise for the reader)
    # Decoding steps:
    # 1. turn \ to \\. Step 4 will change them back to \
    # 2. plus signs become spaces.
    # 3. percent signs become \x.
    # 4. run it through printf %b which will expand the \-escapes
Line 35: Line 41:
#run eval on the string
eval $foo
    value=${value//\\/\\\\}
    value=${value//+/ }
    value=${value//'%'/\\x}
    printf -v 'params[$key]' %b "$value"
done <<< "$query&"
Line 38: Line 47:
#sit back and discover that the user had put "/bin/rm -rf /" in one of the web form fields, which even if not root will do damage to some part of the file system. Another dangerous string would be a fork bomb.
# Now we can use the parameters from the associative array named params.
# If we need a list of the keys, it's "${!params[@]}".
}}}
Line 40: Line 51:
the safer way: The `printf -v varname` option is available in every [[BashFAQ/061|version of bash]] that supports associative arrays, so we may use it here. It's much more efficient than calling a SubShell. We've also avoided the potential problems with `echo -e` if the `value` happens to be something like `-n`.
Line 42: Line 53:
# read in the cgi input string
read foo
Technically, the CGI specification allows multiple instances of the same key in a single query. For example, `group=managers&member=Alice&member=Charlie` is a perfectly legitimate query string. None of the approaches on this page handle this case (at least not in what we'd probably consider the "correct" way). Fortunately, it's not often that you'd write a CGI like this; and in any case, you're not being forced to use bash for this task.
The quick, easy and dangerous way to process the `QUERY_STRING` is to convert the `&`s to `;`s and then use the `eval` command to run those assignments. However, the use of `eval` is [[BashFAQ/048|STRONGLY DISCOURAGED]]. That is to say we always avoid using `eval` if there is any way around it.
Line 45: Line 56:
#convert some of the encoded strings and things like "&" (left as an exercise for the reader)
=== Older Bash Shells ===
Line 47: Line 58:
# in this case the variable foo below is being given the string after conversion of the encoded string
# so you can have an example that really works.
If you don't have associative arrays, don't just leap to `eval`. A better approach is to extract each variable/value pair, and assign them to shell variables, one by one, without executing them. This requires an [[BashFAQ/006|indirect variable assignment]], which means using some shell-specific trickery. We'll write this using Bash syntax; converting to ksh or Bourne shell is left as an exercise.
Line 50: Line 60:
foo='uname=John+smith;email=john.smith@johnsmith.com;phone=999-999-9999;asst=John+smith;aemail=john.smith@ohnsmith.com;aphone=999-999-9999;teamclass=BU14;c1day1=M;c1T1=5:00;c1day2=W;c1T2=5:00;c2day1=T;c2T1=6:30;c2day2=W;c2T2=6:30;c3day1=W;c3T1=6:30;c3day2=F;c3T2=5:00;ADDBOX=;'
{{{
# Bash 3.1 +
Line 52: Line 63:
IFS=';'
read -a arr <<< "$foo";
for i in "${arr[@]}"; do
    declare "${i}";
done;
# Read in the cgi input string
if [[ $QUERY_STRING ]]; then
  query=$QUERY_STRING
else
  read -r query
fi
Line 58: Line 70:
echo $uname
# Variable names in bash are limited to ASCII alphanumerics and underscores
sanitize() {
    local LC_ALL=C # to only consider ASCII letters
    printf %s "${1//[![:alnum:]_]/_}"
}
Line 60: Line 76:
# query contains something like name=Fred+Flintstone&city=Bedrock
# Treat this as a list of key=value expressions joined with &.
# Iterate through the list and perform each assignment.
Line 61: Line 80:
While this might be a little less clear, it avoids this huge security problem that eval has, that of executing any arbitrary command the user might care to enter into the Web form. Clearly more desirable to do it this way.
while IFS='=' read -r -d '&' var value; do
    # To be sure the resulting variable name is valid, add "get_"
    # in front, and replace any invalid characters with underscores.
    # 1foo-bar => get_1foo_bar
    var=$(sanitize "get_$var")
Line 63: Line 86:
NOTE- this example specifically relies on the ";" being used to separate the variable assignments in the CGI input string - In order for that to happen, YOU MUST convert the "&" chars into ";" chars.
 
    value=${value//\\/\\\\}
    value=${value//+/ }
    value=${value//'%'/\\x}
    printf -v "$var" %b "$value"
done <<< "$query&"
Line 66: Line 92:
This solution was published in the channel by trash, and was pointed out by lhunath
# Now you can do whatever you wanted to do with "get_name".
# If we need a list of the keys, it's "${!get_@}".
}}}

While this might be a little less clear, it avoids this huge security problem that `eval` has: executing any arbitrary command the user might care to enter into the web form. Clearly this is an improvement.

=== The Wrong Way ===

{{{
# DO NOT DO THIS!
#
# Read in the cgi input string
if [ "$QUERY_STRING" ]; then
  query=$QUERY_STRING
else
  read query
fi

# Convert some of the encoded strings and things like "&" (left as an exercise for the reader)

# Run eval on the string
eval "$query"

# Sit back and discover that the user has put "/bin/rm -rf /" in one of the web form fields,
# which even if not root will do damage to some part of the file system.
# Another dangerous string would be a fork bomb.
}}}

The only reason this example is still on this page is that whenever we delete bad examples, someone rewrites them. So, this is your bad example, and your multiple layers of warnings '''not''' to use it.

How do I write a CGI script that accepts parameters?

There are always circumstances beyond our control that drive us to do things that we would never choose to do on our own. This FAQ entry describes one of those situations.

A CGI program can be invoked with parameters, sent by the web browser (user agent). There are (at least) two ways to invoke a CGI program: the "GET" method and the "POST" method. In the "GET" method, parameters are provided to the CGI program in an environment variable called QUERY_STRING. The parameters take the form of KEY=VALUE definitions (e.g. user=george), with some characters encoded in hexadecimal, spaces encoded as plus signs, all joined together with ampersands. In the "POST" method, the parameters are provided on standard input instead.
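
As a rough sketch of the two invocation methods described above, the helper below picks its input source from the standard CGI variables `REQUEST_METHOD` and `CONTENT_LENGTH`. Both are defined by the CGI specification but are not used elsewhere in this entry, so treat their use here as an assumption about your web server. The function name `read_cgi_input` is made up for illustration, and `read -N` assumes bash 4.1 or later.

```shell
#!/usr/bin/env bash
# Sketch only: read_cgi_input is a hypothetical helper, not part of this FAQ.
# It leaves the raw, still-encoded parameter string in the variable "query".
read_cgi_input() {
    query=
    if [[ ${REQUEST_METHOD:-GET} == POST ]]; then
        # POST: parameters arrive on standard input, CONTENT_LENGTH gives the size.
        local len=${CONTENT_LENGTH:-0}
        if [[ $len =~ ^[0-9]+$ ]] && (( 10#$len > 0 )); then
            # -N reads exactly that many characters (bash 4.1+). A URL-encoded
            # body should be plain ASCII, so characters and bytes coincide.
            IFS= read -r -N "$((10#$len))" query
        fi
    else
        # GET: parameters arrive in the environment.
        query=${QUERY_STRING:-}
    fi
}

# Simulate a GET request...
REQUEST_METHOD=GET QUERY_STRING='user=george&city=Bedrock'
read_cgi_input
printf 'GET:  %s\n' "$query"

# ...and a POST request whose 9-character body arrives on stdin.
REQUEST_METHOD=POST CONTENT_LENGTH=9
read_cgi_input <<< 'a=1&b=two'
printf 'POST: %s\n' "$query"
```

Either way the result is the same kind of string, which is why the examples in this entry can treat `query` uniformly once it has been read.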

Now of course we know you would never write a CGI script in Bash. So for the purposes of this entry we will assume that terrorists have kidnapped your spouse and children and will torture, maim, kill, "or worse" them if you do not comply with their demands to write such a script.

(The "or worse" situation would clearly be something like being forced to use Microsoft based software.)

So, given a QUERY_STRING variable, we would like to extract the keys (variables) and their values, so that we can use them in the script.

Associative Arrays

The best approach is to place the key/value pairs into an associative array. Associative arrays are available in ksh93 and in bash 4.0, but not in POSIX or Bourne shells. They are designed to hold key/value pairs where the keys can be arbitrary strings, so they seem appropriate for this job.

# Bash 4+

# Read in the cgi input string
if [[ $QUERY_STRING ]]; then
  query=$QUERY_STRING
else
  read -r query
fi

# Set up an associative array to hold the query parameters.
declare -A params

# Iterate through the key=value+%41%42%43 elements.
# Separate key and value, and perform decoding on the value.
while IFS='=' read -r -d '&' key value; do

    # Decoding steps: 
    # 1. turn \ to \\. Step 4 will change them back to \
    # 2. plus signs become spaces.
    # 3. percent signs become \x.
    # 4. run it through printf %b which will expand the \-escapes

    value=${value//\\/\\\\}
    value=${value//+/ }
    value=${value//'%'/\\x}
    printf -v 'params[$key]' %b "$value"
done <<< "$query&"

# Now we can use the parameters from the associative array named params.
# If we need a list of the keys, it's "${!params[@]}".

The printf -v varname option is available in every version of bash that supports associative arrays, so we may use it here. It's much more efficient than calling a SubShell. We've also avoided the potential problems with echo -e if the value happens to be something like -n.
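
To illustrate the point about subshells and `echo -e`, here is a small comparison using a made-up encoded value. Besides costing a fork, command substitution strips trailing newlines from the result, while `printf -v` stores the decoded text unchanged. All variable names below are illustrative only.

```shell
#!/usr/bin/env bash
# Hypothetical encoded value: two lines, each terminated by %0A (newline).
encoded='two%0Alines%0A'

# Step 3 of the decoding: percent signs become \x.
escaped=${encoded//'%'/\\x}

# Decode straight into a variable: no subshell, nothing stripped.
printf -v direct %b "$escaped"

# Decode via command substitution: forks, and drops the trailing newline.
viasub=$(printf %b "$escaped")

printf 'printf -v kept %d characters\n' "${#direct}"
printf 'subshell kept  %d characters\n' "${#viasub}"

# The echo -e pitfall: a value of "-n" is swallowed as an option.
value=-n
echo_out=$(echo -e "$value")
printf_out=$(printf %b "$value")
printf 'echo -e gave [%s], printf %%b gave [%s]\n' "$echo_out" "$printf_out"
```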

Technically, the CGI specification allows multiple instances of the same key in a single query. For example, group=managers&member=Alice&member=Charlie is a perfectly legitimate query string. None of the approaches on this page handle this case (at least not in what we'd probably consider the "correct" way). Fortunately, it's not often that you'd write a CGI like this; and in any case, you're not being forced to use bash for this task.

The quick, easy and dangerous way to process the QUERY_STRING is to convert the &s to ;s and then use the eval command to run those assignments. However, the use of eval is STRONGLY DISCOURAGED. That is to say we always avoid using eval if there is any way around it.
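
For the repeated-key case mentioned above, one possible workaround (a sketch, not something this FAQ prescribes) is to append each additional value to the existing array element, separated by a newline. This assumes that no decoded value itself contains a newline. The sample query is the one quoted above; the variable `decoded` and the array `members` are illustrative names. Empty keys are skipped because bash rejects an empty associative-array subscript.

```shell
#!/usr/bin/env bash
# Bash 4+ sketch: keep every value of a repeated key, one per line.
query='group=managers&member=Alice&member=Charlie'

declare -A params

while IFS='=' read -r -d '&' key value; do
    [[ $key ]] || continue     # an empty subscript is an error in bash

    # Same decoding steps as in the loop shown earlier.
    value=${value//\\/\\\\}
    value=${value//+/ }
    value=${value//'%'/\\x}
    printf -v decoded %b "$value"

    if [[ ${params[$key]+set} ]]; then
        # Key seen before: append on a new line instead of overwriting.
        params[$key]+=$'\n'$decoded
    else
        params[$key]=$decoded
    fi
done <<< "$query&"

# Split the collected values back into an indexed array when needed.
mapfile -t members <<< "${params[member]}"
printf 'member: %s\n' "${members[@]}"
```

A different separator, or one indexed array per key, would work just as well; the newline is only a convenient choice for `mapfile`.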

Older Bash Shells

If you don't have associative arrays, don't just leap to eval. A better approach is to extract each variable/value pair, and assign them to shell variables, one by one, without executing them. This requires an indirect variable assignment, which means using some shell-specific trickery. We'll write this using Bash syntax; converting to ksh or Bourne shell is left as an exercise.

# Bash 3.1 +

# Read in the cgi input string
if [[ $QUERY_STRING ]]; then
  query=$QUERY_STRING
else
  read -r query
fi

# Variable names in bash are limited to ASCII alphanumerics and underscores
sanitize() {
    local LC_ALL=C  # to only consider ASCII letters
    printf %s "${1//[![:alnum:]_]/_}"
}

# query contains something like name=Fred+Flintstone&city=Bedrock
# Treat this as a list of key=value expressions joined with &.
# Iterate through the list and perform each assignment.

while IFS='=' read -r -d '&' var value; do
    # To be sure the resulting variable name is valid, add "get_" 
    # in front, and replace any invalid characters with underscores.
    # 1foo-bar => get_1foo_bar
    var=$(sanitize "get_$var")

    value=${value//\\/\\\\}
    value=${value//+/ }
    value=${value//'%'/\\x}
    printf -v "$var" %b "$value"
done <<< "$query&"

# Now you can do whatever you wanted to do with "get_name".
# If we need a list of the keys, it's "${!get_@}".

While this might be a little less clear, it avoids this huge security problem that eval has: executing any arbitrary command the user might care to enter into the web form. Clearly this is an improvement.
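
The security claim can be checked with a deliberately hostile query. The sketch below reuses the loop from this section on a made-up query string whose values decode to shell syntax. Because the values are only ever assigned, never evaluated, they end up stored as literal text. The sample query and the loop variable `name` are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Bash 3.1+ sketch: hostile input is stored, not executed.
sanitize() {
    local LC_ALL=C  # to only consider ASCII letters
    printf %s "${1//[![:alnum:]_]/_}"
}

# %24%28...%29 decodes to $(...), and %3B decodes to a semicolon.
query='name=%24%28echo+hacked%29&rm-rf=x%3B+echo+oops'

while IFS='=' read -r -d '&' var value; do
    var=$(sanitize "get_$var")      # rm-rf => get_rm_rf

    value=${value//\\/\\\\}
    value=${value//+/ }
    value=${value//'%'/\\x}
    printf -v "$var" %b "$value"    # plain assignment, no evaluation
done <<< "$query&"

# List every variable the loop created, with its literal contents.
for name in "${!get_@}"; do
    printf '%s=[%s]\n' "$name" "${!name}"
done
```

Nothing in the decoded values is run: the command substitution and the extra command after the semicolon are just characters inside `get_name` and `get_rm_rf`. They become dangerous again only if the script later passes them unquoted to something that evaluates them.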

The Wrong Way

# DO NOT DO THIS!
#
# Read in the cgi input string
if [ "$QUERY_STRING" ]; then
  query=$QUERY_STRING
else
  read query
fi

# Convert some of the encoded strings and things like "&" (left as an exercise for the reader)

# Run eval on the string
eval "$query"

# Sit back and discover that the user has put "/bin/rm -rf /" in one of the web form fields,
# which even if not root will do damage to some part of the file system.
# Another dangerous string would be a fork bomb.

The only reason this example is still on this page is that whenever we delete bad examples, someone rewrites them. So, this is your bad example, and your multiple layers of warnings not to use it.

BashFAQ/092 (last edited 2017-05-18 19:13:44 by 202)