Diff for "BashFAQ/028"

Differences between revisions 4 and 23 (spanning 19 versions)

How do I determine the location of my script? I want to read some config files from the same place.

There are two prime reasons why this issue comes up: you either want to externalize data or configuration of your script and need a way to find these external resources, or your script is intended to act upon a bundle of some sort (eg. a build script), and needs to find the resources to act upon.

It is important to realize that in the general case, this problem has no solution. Any approach you might have heard of, and any approach detailed below, has flaws and only works in specific cases. So pay attention, but first and foremost, try to avoid the problem entirely by not depending on the location of your script!

Before we dive into solutions, let's clear up some misunderstandings. It is important to understand that:

- Your script does not actually have a location! Wherever the bytes end up coming from, there is no "one canonical path" for it. Never.
- $0 is NOT the answer to your problem. If you think it is, you can either stop reading and write more bugs, or you can accept this and read on.

Accessing data/config files

Very often, people want to make their scripts configurable. And the separation principle teaches us that it's a good idea to keep configuration and code separate. The problem then ends up being: how does my script know where to find the user's configuration file for it?

Too often, people believe the configuration of a script should reside in the same directory as where they put their script. This is the root of the problem.

Interestingly, a UNIX paradigm exists to solve this problem for you: Configuration artifacts of your scripts should exist in either the user's HOME directory or /etc. That gives your script an absolute path to look for the file, solving your problem instantly: you no longer depend on the "location" of your script:

  if [[ -e ~/.myscript.conf ]]; then source ~/.myscript.conf
  elif [[ -e /etc/myscript.conf ]]; then source /etc/myscript.conf
  fi
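
The same lookup can be wrapped in a small function so that the search order lives in one place. This is only a sketch: the function name load_first_config is a hypothetical helper, and the file names passed to it are whatever your script chooses.

```bash
#!/usr/bin/env bash
# Sketch: source the first readable config file from a list of candidates.
# load_first_config is a hypothetical helper name, not a standard command.
load_first_config() {
    local candidate
    for candidate in "$@"; do
        if [[ -r $candidate ]]; then
            source "$candidate"
            return 0
        fi
    done
    return 1    # nothing found; the caller decides how to react
}

# Typical call: the per-user file wins over the system-wide one.
# load_first_config ~/.myscript.conf /etc/myscript.conf ||
#     echo >&2 "No configuration found, using built-in defaults."
```

Because the candidates are absolute paths, the result does not depend on where the script itself lives or how it was invoked.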

Acting on a bundle

More common yet, scripts are part of a bundle and perform certain actions within or upon it. Ideally, it is desired that the bundle works independently of where the user has unpacked it; whether that's somewhere in their home dir, in /var or in /usr/local.

When our script needs to act upon other files it's bundled with, independently of its absolute location, we have two options: either we rely on PWD or we rely on BASH_SOURCE. Both approaches have certain issues; here's what you need to know.

The BASH_SOURCE internal bash variable is actually an array of pathnames. However, if you expand it as a simple string, e.g. "$BASH_SOURCE", you get the first element, which is the pathname of the currently executing function or script. The following caveats apply:

- BASH_SOURCE expands empty when bash does not know where the executing code comes from. Usually, this means the code is coming from standard input (e.g. ssh host 'somecode', or an interactive session).
- BASH_SOURCE does not follow symlinks (when you run z from /x/y, you get /x/y/z, even if that is a symlink to /p/q/r). Often, this is the desired effect. Sometimes, though, it's not. Imagine your bundle links its start-up script into /usr/local/bin: now that script's BASH_SOURCE will lead you into /usr/local and not into the bundle.
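
One way to turn BASH_SOURCE into a directory you can work from is sketched below. It assumes bash, and it inherits both caveats: it fails when the pathname is empty, and it does not resolve symlinks. The function name source_dir is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch: print the absolute directory of a pathname such as "${BASH_SOURCE[0]}".
# source_dir is a hypothetical helper; it does NOT resolve symlinks.
source_dir() {
    local src=$1
    # Empty means bash did not know where the code came from (e.g. stdin).
    [[ -n $src ]] || return 1
    # cd in a subshell so the caller's working directory is untouched;
    # CDPATH is cleared so cd cannot print anything or jump elsewhere.
    ( CDPATH= cd -- "$(dirname -- "$src")" && pwd )
}

# Typical call at the top of a bundled script:
# bundleDir=$(source_dir "${BASH_SOURCE[0]}") ||
#     { echo >&2 "Cannot locate the bundle."; exit 1; }
```

Passing the pathname as an argument keeps the helper easy to try out by hand with any file, not only the running script.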

Another option is to rely on PWD, the current working directory. In this case, you can assume the user has first cd'ed into your bundle and make all your pathnames relative. To reduce fragility, you could even test whether, for example, the relative path to the script name is correct, to make sure the user has indeed cd'ed into the bundle:

  [[ -e bin/myscript ]] || { echo >&2 "Please cd into the bundle before running this script."; exit 1; }

If you ever do need an absolute path, you can always get one by prefixing the relative path with $PWD: echo "Saved to: $PWD/result.csv"

The only difficulty here is that you're now forcing your user to first change into your bundle's directory before your script can function. Regardless, this may well be your best option!
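
Put together, the PWD approach might look like the sketch below. The helper name in_bundle_root is hypothetical; the marker path bin/myscript is the one used in the check above, and a real bundle would pick whatever paths identify it.

```bash
#!/usr/bin/env bash
# Sketch: succeed only if every given marker path exists relative to $PWD.
# in_bundle_root is a hypothetical helper name.
in_bundle_root() {
    local marker
    for marker in "$@"; do
        [[ -e $marker ]] || return 1
    done
}

# Typical use, with paths kept relative and $PWD added only for display:
# in_bundle_root bin/myscript ||
#     { echo >&2 "Please cd into the bundle before running this script."; exit 1; }
# echo "Saved to: $PWD/result.csv"
```

Checking more than one marker makes it less likely that an unrelated directory happens to pass the test.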

If neither the BASH_SOURCE nor the PWD option sounds appealing, you might want to consider going the route of configuration files instead (see the previous section). In this case, you require that your user configure the location of your bundle in a configuration file and have them put that configuration file in a location you can easily find. For example:

  [[ -e ~/.myscript.conf ]] || { echo >&2 "First configure the product in ~/.myscript.conf"; exit 1; }
  source ~/.myscript.conf      # ~/.myscript.conf defines something like bundleDir=/x/y
  cd "$bundleDir"              # Now you can use the PWD method: use relative paths.

Why is it so hard to find my script's location?

This is a complex question because there's no single right answer to it. Even worse: it's not possible to find the location reliably in 100% of all cases. Common ways of finding a script's location depend on the name of the script, as seen in the predefined variable $0 (don't do this!). But providing the script name in $0 is only a (very common) convention, not a requirement.

The suspect answer is "in some shells, $0 is always an absolute path, even if you invoke the script using a relative path, or no path at all". But this isn't reliable across shells; some of them (including BASH) return the actual command typed in by the user instead of the fully qualified path. And this is just the tip of the iceberg!

Your script may not actually be on a locally accessible disk at all. Consider this:

  ssh remotehost bash < ./myscript

The shell running on remotehost is getting its commands from a pipe. There's no script anywhere on any disk that bash can see.

Moreover, even if your script is stored on a local disk and executed, it could move. Someone could mv the script to another location in between the time you type the command and the time your script checks $0. Or someone could have unlinked the script during that same time window, so that it doesn't actually have a link within a file system any more.

Even in the cases where the script is in a fixed location on a local disk, the $0 approach still has some major drawbacks. The most important is that the script name (as seen in $0) may not be relative to the current working directory, but relative to a directory from the program search path $PATH (this is often seen with KornShell). Or (and this is the most likely problem by far...) there might be multiple links to the script from multiple locations, one of them being a simple symlink from a common PATH directory like /usr/local/bin, which is how it's being invoked. Your script might be in /opt/foobar/bin/script but the naive approach of reading $0 won't tell you that -- it may say /usr/local/bin/script instead.
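
The symlink case is easy to reproduce. The sketch below builds a throwaway layout under a directory you supply (the opt/foobar/bin and usr/local/bin names mirror the example above, but live under that directory rather than on the real system), then runs the script through the link. The helper name demo_symlink_zero is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: show that $0 reports the symlink used for invocation, not its target.
# demo_symlink_zero is a hypothetical helper; it prints what the script saw as $0.
demo_symlink_zero() {
    local root=$1
    mkdir -p "$root/opt/foobar/bin" "$root/usr/local/bin" || return 1
    # The "real" script does nothing but report its own $0.
    printf '%s\n' 'printf "%s\n" "$0"' > "$root/opt/foobar/bin/script"
    ln -s "$root/opt/foobar/bin/script" "$root/usr/local/bin/script" || return 1
    # Invoke it through the link, much as a PATH lookup would.
    bash "$root/usr/local/bin/script"
}

# Typical use:
# root=$(mktemp -d) && demo_symlink_zero "$root"   # prints $root/usr/local/bin/script
# rm -rf "$root"
```

The printed path ends in usr/local/bin/script, so anything the script looks for next to itself would be searched for in the wrong directory.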

(For a more general discussion of the Unix file system and how symbolic links affect your ability to know where you are at any given moment, see this Plan 9 paper.)

Non-bash solutions

  case $0 in /*) echo "$0";; *) echo "`pwd`/$0";; esac

Or a shell-independent variant (needs a readlink(1) supporting -f, though, so it's OS-dependent):

  readlink -f "$0"
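
Since readlink -f is not available everywhere, a script that relies on it may want to fail cleanly where it is missing. A minimal sketch, assuming bash and a hypothetical helper name resolve_path:

```bash
#!/usr/bin/env bash
# Sketch: resolve a pathname with readlink -f, failing cleanly if that is unsupported.
# resolve_path is a hypothetical helper name.
resolve_path() {
    local resolved
    # readlink -f canonicalizes the path, following every symlink along the way.
    resolved=$(readlink -f -- "$1" 2>/dev/null) && [[ -n $resolved ]] || return 1
    printf '%s\n' "$resolved"
}

# Typical use:
# self=$(resolve_path "$0") || { echo >&2 "Cannot resolve script path."; exit 1; }
```

This still depends on $0 being a usable pathname in the first place, so every caveat from the previous section applies.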

CategoryShell

-  ⇤ ← Revision 4 as of 2007-10-14 11:12:34 → 
  Size: 6927
  Editor: p549571AF
  Comment: Cosmetic fixes to script.
+   ← Revision 23 as of 2013-07-03 16:39:54 → ⇥
  Size: 7761
  Editor: Lhunath
  Comment: This article really needed to be simplified and oriented toward solutions rather than problems.  Also, stop recommending $0! Bash's BASH_SOURCE is superior in every way, even if it's not perfect.
-Deletions are marked like this.
+Additions are marked like this.
 Line 1:
-[[Anchor(faq28)]]
+<<Anchor(faq28)>>
 Line 3:
-This topic comes up frequently.  This answer covers not only the expression used above ("configuration files"), but also several variant situations.  If you've been directed here, please read this entire answer before dismissing it.
-Line 5:
+Line 4:
-This is a complex question because there's no single right answer to it. Even worse: it's not possible to find the location reliably in 100% of all cases. All ways of finding a script's location depend on the name of the script, as seen in the predefined variable {{{$0}}}. But providing the script name in {{{$0}}} is only a (very common) convention, not a requirement.
+There are two prime reasons why this issue comes up: you either want to externalize data or configuration of your script and need a way to find these external resources, or your script is intended to act upon a bundle of some sort (eg. a build script), and needs to find the resources to act upon.
-Line 7:
+Line 6:
-The suspect answer is "in some shells, $0 is always an absolute path, even if you invoke the script using a relative path, or no path at all". But this isn't reliable across shells; some of them (including ["BASH"]) return the actual command typed in by the user instead of the fully qualified path.  And this is just the tip of the iceberg!
+It is important to realize that in the general case, this problem has no solution.  Any approach you might have heard of, and any approach detailed below, has flaws and only works in specific cases.  So pay attention, but first and foremost, try to avoid the problem entirely by not depending on the location of your script!

Before we dive into solutions, let's clear up some misunderstandings.  It is important to understand that:

   - Your script does '''not''' actually have a location!  Wherever the bytes end up coming from, there is no "one canonical path" for it.  Never.
   - {{{$0}}} is NOT the answer to your problem.  If you think it is, you can either stop reading and write more bugs, or you can accept this and read on.


=== Accessing data/config files ===

Very often, people want to make their scripts configurable.  And the separation principle teaches us that it's a good idea to keep configuration and code separate.  The problem then ends up being: how does my script know where to find the user's configuration file for it?

Too often, people believe the configuration of a script should reside in the same directory as where they put their script.  This is the root of the problem.

Interestingly, a UNIX paradigm exists to solve this problem for you:  Configuration artifacts of your scripts should exist in either the user's {{HOME}} directory or {{/etc}}.  That gives your script an absolute path to look for the file, solving your problem instantly: you no longer depend on the "location" of your script:

    if [[ -e ~/.myscript.conf ]]; then source ~/.myscript.conf
    elif [[ -e /etc/myscript.conf ]]; then source /etc/myscript.conf
    fi


=== Acting on a bundle ===

More common yet, scripts are part of a bundle and perform certain actions within or upon it.  Ideally, it is desired that the bundle works independently of where the user has unpacked it; whether that's somewhere in their home dir, in /var or in /usr/local.

When our script needs to act upon other files it's bundled with, independently of its absolute location, we have two options:  Either we rely on {{PWD}} or we rely on {{BASH_SOURCE}}.  Both approaches have certain issues, here's what you need to know.

The {{BASH_SOURCE}} internal bash variable is actually an array of pathnames.  However, if you expand it as a simple string, e.g. {{"$BASH_SOURCE"}}, you get the first element, which is the pathname of the currently executing function or script.  The following caveats apply:

   - {{BASH_SOURCE}} expands ''empty'' when bash does not know where the executing code comes from.  Usually, this means the code is coming from ''standard input'' (eg. ssh host 'somecode', or from an interactive session).
   - {{BASH_SOURCE}} does ''not follow'' symlinks (when you run {{z}} from {{/x/y}}, you get {{/x/y/z}}, even if that is a symlink to {{/p/q/r}}).  Often, this is the desired effect.  Sometimes, though, it's not.  Imagine your bundle links its start-up script into {{/usr/local/bin}}, now that script's {{BASH_SOURCE}} will lead you into {{/usr/local}} and not into the bundle.

Another option is to rely on {{PWD}}, the current working directory.  In this case, you can assume the user has first {{cd}}'ed into your bundle and make all your pathnames relative.  To reduce fragility, you could even test whether, for example, the relative path to the script name is correct, to make sure the user has indeed {{cd}}'ed into the bundle:

    [[ -e bin/myscript ]] || { echo >&2 "Please cd into the bundle before running this script."; exit 1; }

If you ever do need an absolute path, you can always get one by prefixing the relative path with {{PWD}}: {{{echo "Saved to: $PWD/result.csv"}}}

The only difficulty here is that you're now forcing your user to first change into your bundle's directory before your script can function.  Regardless, this may well be your best option!

If neither the {{BASH_SOURCE}} nor the {{PWD}} option sounds appealing, you might want to consider going the route of configuration files instead (see the previous section).  In this case, you require that your user configure the location of your bundle in a configuration file and have them put that configuration file in a location you can easily find.  For example:

    [[ -e ~/.myscript.conf ]] || { echo >&2 "First configure the product in ~/.myscript.conf"; exit 1; }
    source ~/.myscript.conf      # ~/.myscript.conf defines something like bundleDir=/x/y
    cd "$bundleDir"              # Now you can use the PWD method: use relative paths.


=== Why is it so hard to find my script's location? ===

This is a complex question because there's no single right answer to it. Even worse: it's not possible to find the location reliably in 100% of all cases. Common ways of finding a script's location depend on the name of the script, as seen in the predefined variable {{{$0}}} ('''don't do this!'''). But providing the script name in {{{$0}}} is only a (very common) convention, not a requirement.

The suspect answer is "in some shells, $0 is always an absolute path, even if you invoke the script using a relative path, or no path at all". But this isn't reliable across shells; some of them (including [[BASH]]) return the actual command typed in by the user instead of the fully qualified path.  And this is just the tip of the iceberg!
-Line 14:
+Line 64:
-Line 21:
+Line 70:
-(For a more general discussion of the Unix file system and how symbolic links affect your ability to know where you are at any given moment, see [http://www.cs.bell-labs.com/sys/doc/lexnames.html this Plan 9 paper].)
+(For a more general discussion of the Unix file system and how symbolic links affect your ability to know where you are at any given moment, see [[http://www.cs.bell-labs.com/sys/doc/lexnames.html|this Plan 9 paper]].)
-Line 23:
+Line 72:
-Having said all that, if you ''still'' want to make a whole slew of naive assumptions, and all you want is the fully qualified version of $0, you can use something like this (["POSIX"], non-Bourne):
+=== Non-bash solutions ===
-Line 26:
+Line 75:
-  [[ $0 = /* ]] && echo $0 || echo $PWD/$0
}}}

Or the BourneShell version:

{{{
  case $0 in /*) echo $0;; *) echo `pwd`/$0;; esac
+  case $0 in /*) echo "$0";; *) echo "`pwd`/$0";; esac
-Line 41:
+Line 84:
-If we want to account for the cases where the script's relative pathname (in {{{$0}}}) may be relative to a {{{$PATH}}} component instead of the current working directory (as mentioned above), we can still try to search the script like the shell would have done: in all directories from {{{$PATH}}}.

The following script shows how this could be done:

{{{
#!/bin/bash

myname=$0
if [ -s "$myname" ] && [ -x "$myname" ]; then
    # $myname is already a valid file name

    mypath=$myname
else
    case "$myname" in
    /*) exit 1;;             # absolute path - do not search PATH
    *)
        # Search all directories from the PATH variable. Take
        # care to interpret leading and trailing ":" as meaning
        # the current directory; the same is true for "::" within
        # the PATH.
    
        # Replace leading : with .: in PATH, store in p
        p=${PATH/#:/.:}
        # Replace trailing : with :. (a single / is needed for the % anchor)
        p=${p/%:/:.}
        # Replace :: with :.:
        p=${p//::/:.:}
        # Temporary input field separator
        OFS=$IFS; IFS=$'\n'
        # Split the path in newlines and loop through each of them
        for dir in ${p//:/$'\n'}; do
                [ -f "$dir/$myname" ] || continue # no file
                [ -x "$dir/$myname" ] || continue # not executable
                mypath=$dir/$myname
                break           # only return first matching file
        done
        IFS=$OFS
        ;;
    esac
fi

if [ ! -f "$mypath" ]; then
    echo >&2 "cannot find full path name: $myname"
    exit 1
fi

echo >&2 "path of this script: $mypath"
}}}

Note that {{{$mypath}}} is not necessarily an absolute path name. It still can contain relative parts like {{{../bin/myscript}}}.

Are you starting to see how ridiculously complex this problem is becoming?  And this is ''still'' just the simplistic case where we've made a lot of assumptions about the script not moving and not being piped in!

Generally, storing data files in the same directory as their programs is a bad practise. The Unix file system layout assumes that files in one place (e.g. {{{/bin}}}) are executable programs, while files in another place (e.g. {{{/etc}}}) are data files.  (Let's ignore legacy Unix systems with programs in {{{/etc}}} for the moment, shall we....)

Here are some common sense alternatives you should consider, instead of attempting to perform the impossible:
 * It really makes the most sense to keep your script's configuration in a single, static location such as {{{/etc/foobar.conf}}}.
 * If you need to define multiple configuration files, then you can have a directory (say, {{{/var/lib/foobar/}}} or {{{/usr/local/lib/foobar/}}}), and read that directory's location from a fixed place such as {{{/etc/foobar.conf}}}.
 * If you don't even want that much to be hard-coded, you could pass the location of {{{foobar.conf}}} (or of your configuration directory itself) as a parameter to the script.
 * If you need the script to assume certain defaults in the absence of {{{/etc/foobar.conf}}}, you can put defaults in the script itself, or fall back to something like {{{$HOME/.foobar.conf}}} if {{{/etc/foobar.conf}}} is missing.
 * When you install the script on a target system, you could put the script's location into a variable in the script itself.  The information is available at that point, and as long as the script doesn't move, it will always remain correct for each installed system.
 * In most cases, it makes more sense to abort gracefully if your configuration data can't be found by obvious means, rather than going through arcane processes and possibly coming up with wrong answers.
+----
CategoryShell