Differences between revisions 1 and 14 (spanning 13 versions)
Revision 1 as of 2007-05-03 00:06:43
Size: 1714
Editor: redondos
Comment:
Revision 14 as of 2009-10-16 10:13:35
Size: 3544
Editor: pgas
Comment: yet another chr
How do I convert an ASCII character to its decimal (or hexadecimal) value and back?

If you have a known octal or hexadecimal value (at script-writing time), you can just use printf:

   # \047 (octal) is POSIX; \x27 (hex) is an extension supported by bash, ksh and zsh
   printf '\x27\047\n'

This prints two literal ' characters (27 is the hexadecimal ASCII value of the character, and 47 is the octal value) and a newline.

If you need to convert characters (or numeric ASCII values) that are not known in advance (i.e., in variables), you can use something a little more complicated:

   # POSIX (except unhex, which relies on the non-POSIX \x escape)
   # chr() - converts decimal value to its ASCII character representation
   # ord() - converts ASCII character to its decimal value

   chr() {
     printf "\\$(printf '%03o' "$1")"
   }

   ord() {
     printf '%d' "'$1"
   }

   # hex() - converts ASCII character to a hexadecimal value
   # unhex() - converts a hexadecimal value to an ASCII character

   hex() {
      printf '%x' "'$1"
   }

   unhex() {
      printf \\x"$1"
   }

   # examples:

   chr $(ord A)    # -> A
   ord $(chr 65)   # -> 65
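
The hex direction can be exercised the same way. The sketch below repeats hex() so that it runs on its own, and adds a variant of unhex that avoids the \x escape by converting the hex digits to octal first. The name unhex_posix is hypothetical (it is not part of the original page), and the sketch assumes a printf that accepts C-style 0x constants for numeric arguments, as POSIX describes.

```shell
# hex() as defined above: character -> hexadecimal value
hex() {
    printf '%x' "'$1"
}

# unhex_posix() - hypothetical helper: hexadecimal value -> character,
# going through an octal escape instead of the non-POSIX \x escape
unhex_posix() {
    printf "\\$(printf '%03o' "0x$1")"
}

hex A; echo                      # -> 41
unhex_posix 41; echo             # -> A
unhex_posix "$(hex z)"; echo     # round trip, -> z
```

The extra printf costs a subshell, so this trades speed for portability.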

The ord function above is quite tricky.

  • Tricky? Rather, it's using a feature that I can't find documented anywhere -- putting a single quote in front of an integer. Neat effect, but how on earth did you find out about it? Source diving? -- GreyCat

    • It conforms to the Single Unix Specification: "If the leading character is a single-quote or double-quote, the value shall be the numeric value in the underlying codeset of the character following the single-quote or double-quote." (see the printf utility specification for details) -- mjf
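
The rule quoted above can be tried directly at a prompt. A minimal sketch, assuming an ASCII-compatible codeset (the values shown are the ASCII ones):

```shell
printf '%d\n' "'A"     # single quote before the character -> 65
printf '%d\n' '"A'     # a leading double quote behaves the same -> 65
printf '%x\n' "'a"     # any integer conversion can be used -> 61
printf '%o\n' "'a"     # -> 141
```

Only the character right after the quote is used, which is why ord() and hex() pass "'$1" rather than a fully quoted '$1'.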

This version of chr executes much faster than the printf version above (about 1/40 to less than 1/150 the time when run in a loop):

   newchr() { echo -en "\0$(( $1 % 8 + 10 * ( $1 / 8 ) + 20 ))"; }

   for p in chr newchr; do time for i in {1..4000}; do $p 65 >/dev/null; done; done

   # chr (printf version)
   System1                     System2
   real    0m46.824s           real    1m33.814s
   user    0m4.624s            user    0m8.540s
   sys     0m33.290s           sys     1m23.978s

   # newchr (echo version)
   real    0m1.340s            real    0m0.512s
   user    0m1.096s            user    0m0.389s
   sys     0m0.124s            sys     0m0.096s

This version is faster because it runs without a subshell. However, it seems to work only for a subset of ASCII (decimal values 64 through 127), while the printf version is happy with values up to 255. Some versions that avoid the subshell:

oldchr () {  printf \\$(printf '%03o' $1) ;}

#posix
chr () {
    set -- $(($1 / 64)) $(($1 % 64))
    set -- $1  $(($2 / 8)) $(($2 % 8))
    printf \\"${1}${2}${3}"
}

#bash only
chr_bash () {
    local temp
    printf -v temp  '%03o' $1
    printf \\$temp
}

#test
for i in {1..255} ;do [[ "$(oldchr $i)" = "$(chr $i)" ]] || echo $i;done
for i in {1..255} ;do [[ "$(oldchr $i)" = "$(chr_bash $i)" ]] || echo $i;done
for p in oldchr chr chr_bash; do echo $p:;time for i in {1..4000}; do $p 65 >/dev/null; done; done

The timings (4000 calls each):

$ bash  chr
oldchr:

real    0m14.350s
user    0m5.004s
sys     0m9.248s
chr:

real    0m0.422s
user    0m0.059s
sys     0m0.216s
chr_bash:

real    0m0.400s
user    0m0.042s
sys     0m0.189s

Yet another version, probably faster:

chr () {
    printf \\$(($1/64*100+$1%64/8*10+$1%8))
}
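
The arithmetic in this last function builds a decimal number whose digits are the octal digits of the argument, and printf then reads that number as an octal escape. The sketch below prints the intermediate value next to printf's own octal conversion. The helper name octal_digits is hypothetical and used only for illustration, and chr is repeated (with quoting added) so the snippet runs on its own.

```shell
# octal_digits N - hypothetical helper: print N's octal digits as a decimal number
octal_digits() {
    echo $(( $1 / 64 * 100 + $1 % 64 / 8 * 10 + $1 % 8 ))
}

# chr N - same arithmetic, fed to printf as an octal escape
chr() {
    printf "\\$(( $1 / 64 * 100 + $1 % 64 / 8 * 10 + $1 % 8 ))"
}

octal_digits 65        # -> 101  (65 = 1*64 + 0*8 + 1)
printf '%o\n' 65       # -> 101
octal_digits 255       # -> 377  (255 = 3*64 + 7*8 + 7)
chr 65; echo           # -> A
```

Because everything happens in shell arithmetic, no subshell or temporary variable is needed, which is presumably where the speed comes from.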

BashFAQ/071 (last edited 2021-02-08 16:03:51 by GreyCat)