2537
Comment: screw it. put the egrep one FIRST, because it actually works. insert warnings. leave the bash ones in, after the warnings.
|
4659
bashfaq is brittle about empty lines
|
Deletions are marked like this. | Additions are marked like this. |
Line 1: | Line 1: |
[[Anchor(faq54)]] | <<Anchor(faq54)>> |
Line 3: | Line 3: |
First, you have to define what you mean by "number". The most common case when people ask this seems to be "a non-negative integer, with no leading + sign". Or in other words, a string of all digits. Other times, people want to validate a floating-point input, with optional sign and optional decimal point. | |
Line 4: | Line 5: |
First, you have to define what you mean by "number". The most common case seems to be that, when people ask this, they actually mean "a non-negative integer, with no leading + sign". | === Hand parsing === If you're validating a simple "string of digits", you can do it with a [[glob]]: |
Line 7: | Line 9: |
if [[ $foo = *[^0-9]* ]]; then echo "'$foo' has a non-digit somewhere in it" |
# Bash if [[ $foo != *[!0-9]* ]]; then echo "'$foo' is strictly numeric" |
Line 10: | Line 13: |
echo "'$foo' is strictly numeric" | echo "'$foo' has a non-digit somewhere in it" |
Line 13: | Line 16: |
This can be done in Korn and legacy Bourne shells as well, using {{{case}}}: |
The same thing can be done in Korn and POSIX shells as well, using {{{case}}}: |
Line 17: | Line 19: |
case "$foo" in *[!0-9]*) echo "'$foo' has a non-digit somewhere in it" ;; *) echo "'$foo' is strictly numeric" ;; |
# POSIX case $var in '') printf 'var is empty\n';; *[!0-9]*) printf '%s has a non-digit somewhere in it\n' "$var";; *) printf '%s is strictly numeric\n' "$var";; esac >&2 }}} Of course, if all you care about is vald vs. invalid, you can combine cases: {{{ # POSIX case $var in '' | *[!0-9]*) echo "$0: $var: invalid digit" >&2; exit 1;; |
Line 22: | Line 38: |
If what you actually mean is "a valid floating-point number" or something else more complex, then you might prefer to use a regular expression. Here is a portable version, using {{{egrep}}}: |
If you need to allow a leading negative sign, or if want a valid floating-point number or something else more complex, then there are a few possible ways. Standard globs aren't expressive enough to do this, but you can trim off any sign and then compare: |
Line 26: | Line 41: |
if test "$foo" && echo "$foo" | egrep '^[-+]?[0-9]*(\.[0-9]+)?$' >/dev/null; then echo "'$foo' might be a number" else echo "'$foo' might not be a number" |
# POSIX case ${var#[-+]} in # notice ${var#prefix} substitution to trim sign '') printf 'var is empty\n';; *.*.*) printf '%s has more than one decimal point in it\n' "$var";; *[!0-9.]*) printf '%s has a non-digit somewhere in it\n' "$var";; *) printf '%s looks like a valid float\n' "$var";; esac >&2 }}} Or in Bash, we can use [[glob|extended globs]]: {{{ # Bash -- extended globs must be enabled explicitly in versions prior to 4.1. # Check whether the variable is all digits. shopt -s extglob [[ $var = +([0-9]) ]] }}} A more complex case: {{{ # Bash / ksh shopt -s extglob if [[ $foo = @(*[0-9]*|!([+-]|)) && $foo = ?([+-])*([0-9])?(.*([0-9])) ]]; then echo 'foo is a floating-point number' |
Line 32: | Line 71: |
Optionally, `case..esac` may have been used in shells with extended pattern matching. The leading test of {{{$foo}}} is to ensure that it contains at least one digit, isn't empty, and contains more than just + or - by itself. | |
Line 33: | Line 73: |
The leading test of {{{"$foo"}}} is to ensure that it is not an empty string. (An empty string would satisfy the regex, and changing the regex to avoid that is not worth the effort.) Bash version 3 and above have regular expression support in the [[ command. However, due to serious bugs and syntax changes in Bash's [[ regex support, we '''do not recommend''' using it. Nevertheless, if I simply omit all Bash regex answers here, someone will come along and fill them in -- and they probably won't work, or won't contain all the caveats necessary. So, in the interest of preventing disasters, here are the Bash regex answers that you should not use. |
If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use an external tool's [[RegularExpression|regular expression]] syntax. Here is a portable version (explained in detail [[http://www.wplug.org/wiki/Meeting-20100612#EXERCISE_TWO|here]]), using {{{egrep}}}: |
Line 38: | Line 76: |
if [[ $foo && $foo =~ ^[-+]?[0-9]*\(\.[0-9]+\)?$ ]]; then # Bash 3.1 only! | # Bourne if echo "$foo" | grep -qE '^[-+]?([0-9]+\.?|[0-9]*\.[0-9]+)$'; then echo "'$foo' is a number" else echo "'$foo' is not a number" fi }}} Bash version 3 and above have regular expression support in the [[ command. {{{ # Bash # The regexp must be stored in a var and expanded for backward compatibility with versions < 3.2 regexp='^[-+]?[0-9]*(\.[0-9]*)?$' if [[ $foo = *[0-9]* && $foo =~ $regexp ]]; then |
Line 44: | Line 96: |
=== Using the parsing done by [ and printf (or "using eq") === {{{ # fails with ksh if [ "$foo" -eq "$foo" ] 2>/dev/null; then echo "$foo is an integer" fi }}} `[` parses the variable and interprets it as an integer because of the `-eq`. If the parsing succeeds the test is trivially true; if it fails `[` prints an error message that `2>/dev/null` hides and sets a status different from 0. However this method fails if the shell is ksh, because ksh evaluates the variable as an arithmetic expression. |
|
Line 45: | Line 105: |
Unfortunately, bash changed the syntax of its regular expression support after version 3.1, so the following ''may'' work in some patched versions of bash 3.2: | Be careful: the following trick with `printf` (no supported by all shells) |
Line 48: | Line 108: |
if [[ $foo && $foo =~ ^[-+]?[0-9]*(\.[0-9]+)?$ ]]; then # **PATCHED** Bash 3.2 only! echo "'$foo' looks rather like a number" else echo "'$foo' doesn't look particularly numeric to me" |
# BASH if printf %f "$foo" >/dev/null 2>&1; then echo "$foo is a float" |
Line 54: | Line 113: |
is broken: about the arguments of the{{{ a}}}, {{{A}}}, {{{e}}}, {{{E}}}, {{{f}}}, {{{F}}}, {{{g}}}, or {{{G}}} format modifiers, POSIX specifies that ''if the leading character is a single-quote or double-quote, the value shall be the numeric value in the underlying codeset of the character following the single-quote or double-quote.'' Hence this fails when {{{foo}}} expands to a string with a leading single-quote or double-quote: the previous command will happily validate the string as a float. | |
Line 55: | Line 115: |
It fails rather spectacularly in bash 3.1 and in bash 3.2 without patches. Note that the parentheses in the {{{egrep}}} regular expression and the bash 3.2.patched regular expression don't require backslashes in front of them, whereas the ones in the bash 3.1 command do. |
You can use `%d` to parse an integer. Take care that the parsing might be (is supposed to be?) [[locale]]-dependent. |
How can I tell whether a variable contains a valid number?
First, you have to define what you mean by "number". The most common case when people ask this seems to be "a non-negative integer, with no leading + sign". Or in other words, a string of all digits. Other times, people want to validate a floating-point input, with optional sign and optional decimal point.
Hand parsing
If you're validating a simple "string of digits", you can do it with a glob:
# Bash if [[ $foo != *[!0-9]* ]]; then echo "'$foo' is strictly numeric" else echo "'$foo' has a non-digit somewhere in it" fi
The same thing can be done in Korn and POSIX shells as well, using case:
# POSIX case $var in '') printf 'var is empty\n';; *[!0-9]*) printf '%s has a non-digit somewhere in it\n' "$var";; *) printf '%s is strictly numeric\n' "$var";; esac >&2
Of course, if all you care about is vald vs. invalid, you can combine cases:
# POSIX case $var in '' | *[!0-9]*) echo "$0: $var: invalid digit" >&2; exit 1;; esac
If you need to allow a leading negative sign, or if want a valid floating-point number or something else more complex, then there are a few possible ways. Standard globs aren't expressive enough to do this, but you can trim off any sign and then compare:
# POSIX case ${var#[-+]} in # notice ${var#prefix} substitution to trim sign '') printf 'var is empty\n';; *.*.*) printf '%s has more than one decimal point in it\n' "$var";; *[!0-9.]*) printf '%s has a non-digit somewhere in it\n' "$var";; *) printf '%s looks like a valid float\n' "$var";; esac >&2
Or in Bash, we can use extended globs:
# Bash -- extended globs must be enabled explicitly in versions prior to 4.1. # Check whether the variable is all digits. shopt -s extglob [[ $var = +([0-9]) ]]
A more complex case:
# Bash / ksh shopt -s extglob if [[ $foo = @(*[0-9]*|!([+-]|)) && $foo = ?([+-])*([0-9])?(.*([0-9])) ]]; then echo 'foo is a floating-point number' fi
Optionally, case..esac may have been used in shells with extended pattern matching. The leading test of $foo is to ensure that it contains at least one digit, isn't empty, and contains more than just + or - by itself.
If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use an external tool's regular expression syntax. Here is a portable version (explained in detail here), using egrep:
# Bourne if echo "$foo" | grep -qE '^[-+]?([0-9]+\.?|[0-9]*\.[0-9]+)$'; then echo "'$foo' is a number" else echo "'$foo' is not a number" fi
Bash version 3 and above have regular expression support in the [[ command.
# Bash # The regexp must be stored in a var and expanded for backward compatibility with versions < 3.2 regexp='^[-+]?[0-9]*(\.[0-9]*)?$' if [[ $foo = *[0-9]* && $foo =~ $regexp ]]; then echo "'$foo' looks rather like a number" else echo "'$foo' doesn't look particularly numeric to me" fi
Using the parsing done by [ and printf (or "using eq")
# fails with ksh if [ "$foo" -eq "$foo" ] 2>/dev/null; then echo "$foo is an integer" fi
[ parses the variable and interprets it as an integer because of the -eq. If the parsing succeeds the test is trivially true; if it fails [ prints an error message that 2>/dev/null hides and sets a status different from 0. However this method fails if the shell is ksh, because ksh evaluates the variable as an arithmetic expression.
Be careful: the following trick with printf (no supported by all shells)
# BASH if printf %f "$foo" >/dev/null 2>&1; then echo "$foo is a float" fi
is broken: about the arguments of the a, A, e, E, f, F, g, or G format modifiers, POSIX specifies that if the leading character is a single-quote or double-quote, the value shall be the numeric value in the underlying codeset of the character following the single-quote or double-quote. Hence this fails when foo expands to a string with a leading single-quote or double-quote: the previous command will happily validate the string as a float.
You can use %d to parse an integer. Take care that the parsing might be (is supposed to be?) locale-dependent.