Differences between revisions 11 and 13 (spanning 2 versions)
Revision 11 as of 2009-10-16 07:57:22
Size: 2397
Editor: pgas
Comment: the echo -en version seems to have some bugs
Revision 13 as of 2009-10-16 08:28:39
Size: 3432
Editor: pgas
Comment: add some more versions without subshells
How do I convert an ASCII character to its decimal (or hexadecimal) value and back?

If you have a known octal or hexadecimal value (at script-writing time), you can just use printf:

   # \047 (octal) is POSIX; \x27 (hex) is an extension supported by bash, ksh, zsh and others
   printf '\x27\047\n'

This prints two literal ' characters (27 is the hexadecimal ASCII value of the character, and 47 is the octal value) and a newline.

If you need to convert characters (or numeric ASCII values) that are not known in advance (i.e., in variables), you can use something a little more complicated:

   # POSIX (except unhex, see below)
   # chr() - converts decimal value to its ASCII character representation
   # ord() - converts ASCII character to its decimal value

   chr() {
     printf "\\$(printf '%03o' "$1")"
   }

   ord() {
     printf '%d' "'$1"
   }

   # hex() - converts ASCII character to a hexadecimal value
   # unhex() - converts a hexadecimal value to an ASCII character
   #           (relies on \x escapes, which POSIX printf does not require)

   hex() {
      printf '%x' "'$1"
   }

   unhex() {
      printf "\\x$1"
   }

   # examples:

   chr $(ord A)    # -> A
   ord $(chr 65)   # -> 65

The ord function above is quite tricky.

  • Tricky? Rather, it's using a feature that I can't find documented anywhere -- putting a single quote in front of an integer. Neat effect, but how on earth did you find out about it? Source diving? -- GreyCat

    • It is documented in The Single Unix Specification: "If the leading character is a single-quote or double-quote, the value shall be the numeric value in the underlying codeset of the character following the single-quote or double-quote." (see the printf() specification at http://www.opengroup.org/onlinepubs/009695399/utilities/printf.html to know more) -- mjf
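The leading-quote rule quoted above applies to any numeric conversion, not only %d. A minimal sketch, assuming a single ASCII character as input; char_code is a hypothetical helper name used only for illustration:

```shell
# Sketch: the leading-quote form of a printf numeric argument.
# char_code is a hypothetical name, not part of the FAQ's own functions.
char_code() {
  # $1 = printf format (for example '%d\n', '%o\n' or '%x\n')
  # $2 = a single ASCII character
  printf "$1" "'$2"
}

char_code '%d\n' A     # 65
char_code '%o\n' A     # 101
char_code '%x\n' A     # 41
char_code '%d\n' ' '   # 32
```

The same character yields its decimal, octal or hexadecimal code depending only on the conversion specifier, which is why ord and hex above differ by a single letter.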

This version of chr executes much faster than the printf version above (about 1/40 to less than 1/150 the time when run in a loop):

   newchr() { echo -en "\0$(( $1 % 8 + 10 * ( $1 / 8 ) + 20 ))"; }

   for p in chr newchr; do time for i in {1..4000}; do $p 65 >/dev/null; done; done

   System1                     System2
   chr:
   real    0m46.824s           real    1m33.814s
   user    0m4.624s            user    0m8.540s
   sys     0m33.290s           sys     1m23.978s

   newchr:
   real    0m1.340s            real    0m0.512s
   user    0m1.096s            user    0m0.389s
   sys     0m0.124s            sys     0m0.096s
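The echo-based version builds its escape digits with decimal arithmetic rather than a real octal conversion, so it is worth checking which inputs it gets right. A rough sketch that compares those digits with the output of printf '%o'; echo_digits and formula_ok are hypothetical helper names:

```shell
# Sketch: compare the digits produced by the arithmetic in the echo-based
# version with the real octal representation of the same value.
# echo_digits and formula_ok are hypothetical helper names.
echo_digits() {
  printf '%s' "$(( $1 % 8 + 10 * ( $1 / 8 ) + 20 ))"
}

formula_ok() {
  [ "$(echo_digits "$1")" = "$(printf '%o' "$1")" ]
}

for n in 63 64 65 127 128; do
  if formula_ok "$n"; then
    echo "$n: ok"
  else
    echo "$n: mismatch"
  fi
done
# 63: mismatch
# 64: ok
# 65: ok
# 127: ok
# 128: mismatch
```

For a value written as 64 + 8k + r, with k and r between 0 and 7, the formula gives 100 + 10k + r, which matches the octal digits 1 k r. Outside that range the digits no longer form the right octal number.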

The echo-based version is faster because it executes without a subshell, but it appears to work only for a subset of ASCII, namely values from 64 to 127 decimal, while the printf version is happy with values up to 255.

Some versions avoiding a subshell:

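   # original version: one subshell per call
   oldchr () { printf "\\$(printf '%03o' "$1")" ;}

   # POSIX: compute the three octal digits with arithmetic expansion
   chr () {
       set -- $(($1 / 64)) $(($1 % 64))
       set -- $1 $(($2 / 8)) $(($2 % 8))
       printf \\"${1}${2}${3}"
   }

   # bash only: printf -v stores the octal digits in a variable
   chr_bash () {
       local temp
       printf -v temp '%03o' "$1"
       printf "\\$temp"
   }

   # test (bash): print every value for which two versions disagree
   for i in {1..255}; do [[ "$(oldchr $i)" = "$(chr $i)" ]] || echo $i; done
   for i in {1..255}; do [[ "$(oldchr $i)" = "$(chr_bash $i)" ]] || echo $i; done
   # timing: 4000 calls of each version
   for p in oldchr chr chr_bash; do echo $p:; time for i in {1..4000}; do $p 65 >/dev/null; done; done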

The timings:

   $ bash chr
   oldchr:

   real    0m14.350s
   user    0m5.004s
   sys     0m9.248s
   chr:

   real    0m0.422s
   user    0m0.059s
   sys     0m0.216s
   chr_bash:

   real    0m0.400s
   user    0m0.042s
   sys     0m0.189s
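Because the arithmetic-based chr avoids a subshell, it is cheap enough to call once per character. A possible sketch that wraps it with input validation and uses it to build a string from a list of codes; chr_checked and codes_to_string are hypothetical names, and the accepted range of 1 to 255 is an assumption taken from the test loop above:

```shell
# Sketch: the arithmetic-based chr with a range check added.
# chr_checked and codes_to_string are hypothetical names.
chr_checked() {
  case $1 in
    ''|0*|*[!0-9]*) return 1 ;;       # reject empty, zero-led or non-numeric input
  esac
  [ "$1" -le 255 ] 2>/dev/null || return 1
  # Split the value into three octal digits using arithmetic expansion only.
  set -- "$(( $1 / 64 ))" "$(( $1 % 64 ))"
  set -- "$1" "$(( $2 / 8 ))" "$(( $2 % 8 ))"
  printf "\\$1$2$3"
}

# Print the characters for a list of decimal codes, stopping at the first bad one.
codes_to_string() {
  for code in "$@"; do
    chr_checked "$code" || return 1
  done
}

codes_to_string 72 101 108 108 111; echo   # Hello
```

Rejecting values with a leading zero sidesteps the shell treating them as octal inside the arithmetic expansion.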

BashFAQ/071 (last edited 2021-02-08 16:03:51 by GreyCat)