Diff for "BashFAQ/028"

Differences between revisions 28 and 46 (spanning 18 versions)

How do I determine the location of my script? I want to read some config files from the same place.

There are two prime reasons why this issue comes up: either you want to externalize data or configuration of your script and need a way to find these external resources, or your script is intended to act upon a bundle of some sort (e.g. a build script) and needs to find the resources to act upon.

It is important to realize that in the general case, this problem has no solution. Any approach you might have heard of, and any approach that will be detailed below, has flaws and will only work in specific cases. First and foremost, try to avoid the problem entirely by not depending on the location of your script!

Before we dive into solutions, let's clear up some misunderstandings. It is important to understand that:

Your script does not actually have a location! Wherever the bytes end up coming from, there is no "one canonical path" for it. Never.
$0 is NOT the answer to your problem. If you think it is, you can either stop reading and write more bugs, or you can accept this and read on.

I need to access my data/config files

Very often, people want to make their scripts configurable. The separation principle teaches us that it's a good idea to keep configuration and code separate. The problem then ends up being: how does my script know where to find the user's configuration file for it?

Too often, people believe the configuration of a script should reside in the same directory where they put their script. This is the root of the problem.

A UNIX paradigm exists to solve this problem for you: configuration artifacts of your scripts should exist in either the user's HOME directory or /etc. That gives your script an absolute path to look for the file, solving your problem instantly: you no longer depend on the "location" of your script:

    if [[ -e ~/.myscript.conf ]]; then
        source ~/.myscript.conf
    elif [[ -e /etc/myscript.conf ]]; then
        source /etc/myscript.conf
    fi
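
If more than two locations are involved, the lookup can be wrapped in a small function. The following is only a sketch: the function name first_existing and the scratch directory standing in for the home directory and /etc are made up for illustration.

```bash
# Hypothetical helper: print the first candidate that exists, fail if none does.
first_existing() {
    local candidate
    for candidate in "$@"; do
        if [ -e "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Demo in a scratch directory that stands in for $HOME and /etc.
demo=$(mktemp -d) || exit
mkdir -p "$demo/home" "$demo/etc"
echo 'greeting=system' > "$demo/etc/myscript.conf"

# Only the system-wide file exists, so it is chosen.
conf_before=$(first_existing "$demo/home/.myscript.conf" "$demo/etc/myscript.conf")

# Once a per-user file exists, it takes precedence because it is listed first.
echo 'greeting=user' > "$demo/home/.myscript.conf"
conf_after=$(first_existing "$demo/home/.myscript.conf" "$demo/etc/myscript.conf")

printf 'before: %s\nafter:  %s\n' "$conf_before" "$conf_after"
rm -rf "$demo"
```

In a real script the call would look something like conf=$(first_existing ~/.myscript.conf /etc/myscript.conf) && source "$conf".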

The same holds true for other types of data files. Logs should be written to /var/log or the user's home directory. Support files should be installed to an absolute path in the file system or be made available alongside the configuration in /etc or the user's home directory.

I need to access files bundled with my script

Sometimes scripts are part of a "bundle" and perform certain actions within or upon it. This is often true for applications unpacked or contained within a bundle directory. The user may unpack or install the bundle anywhere; ideally, the bundle's scripts should work whether that's somewhere in a home dir, or /var/tmp, or /usr/local. The files are transient, and have no fixed or predictable location.

When a script needs to act upon other files it's bundled with, independently of its absolute location, we have two options: either we rely on PWD or we rely on BASH_SOURCE. Both approaches have certain issues; here's what you need to know.

Using BASH_SOURCE

The BASH_SOURCE internal bash variable is actually an array of pathnames. If you expand it as a simple string, e.g. "$BASH_SOURCE", you'll get the first element, which is the pathname of the currently executing function or script. Using the BASH_SOURCE method, you access files within your bundle like this:

    # cd into the bundle and use relative paths
    if [[ $BASH_SOURCE = */* ]]; then
        cd -- "${BASH_SOURCE%/*}/" || exit
    fi
    read somevar < etc/somefile

    # Use the dirname directly, without changing directories
    if [[ $BASH_SOURCE = */* ]]; then
        bundledir=${BASH_SOURCE%/*}/
    else
        bundledir=./
    fi
    read somevar < "${bundledir}etc/somefile"
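
Because BASH_SOURCE is an array, it also works in files that are sourced rather than executed, which is one place where it differs from $0. The sketch below builds a throwaway bundle to show this; the file names main and lib/helper.bash are invented for the demonstration.

```bash
demo=$(mktemp -d) || exit
mkdir -p "$demo/lib"

# A sourced file: element 0 names this file, element 1 names the file that sourced it.
cat > "$demo/lib/helper.bash" <<'EOF'
printf 'helper sees: %s (sourced by %s)\n' "${BASH_SOURCE[0]##*/}" "${BASH_SOURCE[1]##*/}"
EOF

# The main script locates the helper relative to its own BASH_SOURCE.
cat > "$demo/main" <<'EOF'
source "${BASH_SOURCE%/*}/lib/helper.bash"
printf 'main sees: %s\n' "${BASH_SOURCE[0]##*/}"
EOF

out=$(bash "$demo/main")
printf '%s\n' "$out"
rm -rf "$demo"
```

Inside helper.bash, $0 would still name the main script, so only BASH_SOURCE tells the helper where its own file came from.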

Please note that when using BASH_SOURCE, the following caveats apply:

$BASH_SOURCE expands empty when bash does not know where the executing code comes from. Usually, this means the code is coming from standard input (e.g. ssh host 'somecode', or from an interactive session).
$BASH_SOURCE does not follow symlinks (when you run z from /x/y, you get /x/y/z, even if that is a symlink to /p/q/r). Often, this is the desired effect. Sometimes, though, it's not. Imagine your package links its start-up script into /usr/local/bin. Now that script's BASH_SOURCE will lead you into /usr/local and not into the package.
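
Both caveats are easy to reproduce. The following sketch uses a scratch directory whose pkg/bin and local/bin subdirectories stand in for the package and for /usr/local/bin; all of these names are assumptions made for the demonstration.

```bash
# Caveat 1: code arriving on standard input has no source pathname.
from_stdin=$(printf '%s\n' 'printf "[%s]\n" "$BASH_SOURCE"' | bash)
printf 'stdin:   %s\n' "$from_stdin"

# Caveat 2: a symlinked start-up script sees the symlink's directory.
demo=$(mktemp -d) || exit
mkdir -p "$demo/pkg/bin" "$demo/local/bin"

cat > "$demo/pkg/bin/start" <<'EOF'
printf '%s\n' "${BASH_SOURCE%/*}"
EOF
ln -s "$demo/pkg/bin/start" "$demo/local/bin/start"

direct=$(bash "$demo/pkg/bin/start")      # run via the real path
linked=$(bash "$demo/local/bin/start")    # run via the symlink
printf 'direct:  %s\nsymlink: %s\n' "$direct" "$linked"
rm -rf "$demo"
```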

If you're not writing a bash script, the BASH_SOURCE variable is unavailable to you. There is a common convention, however, for passing the location of the script as the process name when it is started. Most shells do this, but not all shells do so reliably, and not all of them attempt to resolve a relative path to an absolute path. Relying on this behaviour is dangerous and fragile, but can be done by looking at $0 (see below). Again, consider all your options before doing this: you are likely creating more problems than you are solving.

Using PWD

Another option is to rely on PWD, the current working directory. In this case, you can assume the user has first cd'ed into your bundle and make all your pathnames relative. Using the PWD method, you access files within your bundle like this:

    read somevar < etc/somefile                 # Using pathname relative to PWD
    read somevar < "${PWD%/}/etc/somefile"      # Expand PWD if you want an absolute pathname

    bundledir=$PWD                              # Store PWD if you expect to cd in your script.
    cd /somewhere/else || exit
    read somevar < "${bundledir%/}/etc/somefile"

To reduce fragility, you could even test whether, for example, the relative path to the script name is correct, to make sure the user has indeed cd'ed into the bundle:

    if [[ ! -e bin/myscript ]]; then
        echo >&2 "Please cd into the bundle before running this script."
        exit 1
    fi

You can also try some heuristics, just in case the user is sitting one directory above the bundle:

    if [[ ! -e bin/myscript ]]; then
        if [[ -d mybundle-1.2.5 ]]; then
            cd mybundle-1.2.5 || {
                echo >&2 "Bundle directory exists but I can't cd there."
                exit 1
            }
        else
            echo >&2 "Please cd into the bundle before running this script."
            exit 1
        fi
    fi

If you ever do need an absolute path, you can always get one by prefixing the relative path with $PWD: echo "Saved to: $PWD/result.csv"
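
If some of your pathnames may already be absolute, a small helper can apply the prefix only where needed. This is a sketch, and absolute_path is a made-up name. It does nothing more than prefix $PWD: it does not resolve symlinks or ".." components.

```bash
# Hypothetical helper: make a pathname absolute by prefixing PWD when needed.
absolute_path() {
    case $1 in
        /*) printf '%s\n' "$1" ;;              # already absolute, keep as is
        *)  printf '%s\n' "${PWD%/}/$1" ;;     # ${PWD%/} avoids "//" when PWD is "/"
    esac
}

demo=$(mktemp -d) || exit
cd "$demo" || exit
echo "Saved to: $(absolute_path result.csv)"
echo "Hosts file: $(absolute_path /etc/hosts)"
```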

The only difficulty here is that you're forcing your user to change into your bundle's directory before your script can function. Regardless, this may well be your best option!

Using a configuration/wrapper

If neither the BASH_SOURCE nor the PWD option sounds interesting, you might want to consider going the route of configuration files instead (see the previous section). In this case, you require that your users set the location of your bundle in a configuration file, and have them put that configuration file in a location you can easily find. For example:

    [[ -e ~/.myscript.conf ]] || {
        echo >&2 "First configure the product in ~/.myscript.conf"
        exit 1
    }

    # ~/.myscript.conf defines something like bundleDir=/x/y
    source ~/.myscript.conf

    [[ $bundleDir ]] || {
        echo >&2 "Please define bundleDir='/some/path' in ~/.myscript.conf"
        exit 1
    }

    cd "$bundleDir" || {
        echo >&2 "Could not cd to <$bundleDir>"
        exit 1
    }

    # Now you can use the PWD method: use relative paths.

A variant of this option is to use a wrapper that configures your bundle's location. Instead of calling your bundled script directly, the user calls a wrapper that you install in a directory on the standard system PATH. The wrapper changes directory into the bundle and calls the real script from there, so the real script can safely use the PWD method from above:

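    #!/usr/bin/env bash
    # Stop if the bundle directory is gone, rather than running from the wrong place.
    cd /path/to/where/bundle/was/installed || exit
    # Pass the caller's arguments through to the real script.
    exec bin/realscript "$@"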
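
The wrapper has to be written by whatever installs the bundle, since only the installer knows where the bundle ended up. One possible shape for that step is sketched below; write_wrapper, the scratch directory and the file contents are all hypothetical, and the wrapper additionally passes its arguments through.

```bash
# Hypothetical installer step: write a wrapper that hard-codes the bundle directory.
# $1 = absolute bundle directory, $2 = wrapper file to create
write_wrapper() {
    local bundle_dir=$1 wrapper=$2
    {
        printf '#!/usr/bin/env bash\n'
        printf 'cd -- %q || exit\n' "$bundle_dir"    # %q quotes spaces and other special characters
        printf 'exec bin/realscript "$@"\n'
    } > "$wrapper"
    chmod +x "$wrapper"
}

# Throwaway bundle whose script relies on the PWD method.
demo=$(mktemp -d) || exit
bundle="$demo/my bundle"
mkdir -p "$bundle/bin" "$bundle/etc"
echo 'hello from the bundle' > "$bundle/etc/somefile"
cat > "$bundle/bin/realscript" <<'EOF'
#!/usr/bin/env bash
read -r somevar < etc/somefile
printf '%s\n' "$somevar"
EOF
chmod +x "$bundle/bin/realscript"

write_wrapper "$bundle" "$demo/wrapper"

# The caller's working directory no longer matters.
out=$(cd / && bash "$demo/wrapper")
printf '%s\n' "$out"
rm -rf "$demo"
```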

Why $0 is NOT an option

Common ways of finding a script's location depend on the name of the script, as seen in the predefined variable $0. Unfortunately, providing the script name via $0 is only a (common) convention, not a requirement. In fact, $0 is not at all the location of your script, it's the name of your process as determined by your parent. It can be anything.
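
You can watch the parent choose $0 for itself. In this short demonstration the names any-name-at-all and pretend-name are arbitrary strings picked for illustration:

```bash
# With -c, the first argument after the command string becomes $0.
name_c=$(bash -c 'printf "%s\n" "$0"' any-name-at-all)

# exec -a lets the parent pick argv[0], which bash then reports as $0.
name_a=$(bash -c 'exec -a pretend-name bash -c "echo \$0"')

# Reading commands from standard input: $0 is merely the shell's own name.
name_stdin=$(printf '%s\n' 'printf "%s\n" "$0"' | bash)

printf '%s\n' "$name_c" "$name_a" "$name_stdin"
```

None of the three values is a pathname that leads to any script.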

A suspect answer you may hear is that "in some shells, $0 is always an absolute path, even if you invoke the script using a relative path, or no path at all". But this isn't reliable across shells; some of them (including bash) report the command as the user typed it instead of a fully qualified path. And this is just the tip of the iceberg!

Your script may not actually be on a locally accessible disk at all. Consider this:

ssh remotehost bash < ./myscript

The shell running on remotehost is getting its commands from a pipe. There's no script anywhere on any disk that bash can see.

Moreover, even if your script is stored on a local disk and executed, it could move. Someone could mv the script to another location in between the time you type the command and the time your script checks $0. Or someone could have unlinked the script during that same time window, so that it doesn't actually have a link within a file system any more.

(That may sound fanciful, but it's actually very common. Consider a script installed in /opt/foobar/bin, which is running at the time someone upgrades foobar to a new version. They may delete the entire /opt/foobar/ hierarchy, or they may move the /opt/foobar/bin/foobar script to a temporary name before putting a new version in place. For these reasons, even approaches like "use lsof to find the file which the shell is using as standard input" will still fail.)
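
The rename case can be simulated in a few lines. In this sketch the script renames its own file, standing in for an upgrade that happens while it runs; the scratch directory and the name foobar are assumptions for the demonstration.

```bash
demo=$(mktemp -d) || exit

cat > "$demo/foobar" <<'EOF'
mv -- "$0" "$0.old"        # stand-in for an upgrade moving the running script aside
if [[ -e $0 ]]; then
    echo "still there"
else
    echo "gone"            # $0 no longer names any file
fi
EOF

out=$(bash "$demo/foobar")
printf '%s\n' "$out"
rm -rf "$demo"
```

The shell keeps running because it already holds the file open, but the pathname it was started with now leads nowhere.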

Even in the cases where the script is in a fixed location on a local disk, the $0 approach still has some major drawbacks. The most important is that the script name (as seen in $0) may not be relative to the current working directory, but relative to a directory from the program search path $PATH (this is often seen with KornShell). Or (and this is the most likely problem by far...) there might be multiple links to the script from multiple locations, one of them being a simple symlink from a common PATH directory like /usr/local/bin, which is how it's being invoked. Your script might be in /opt/foobar/bin/script but the naive approach of reading $0 won't tell you that -- it may say /usr/local/bin/script instead.

Some people will try to work around the symlink issue with readlink -f "$0". Again, this may work in some cases, but it's not bulletproof. Nothing that reads $0 will ever be bulletproof, because $0 itself is unreliable. Furthermore, readlink is nonstandard, and won't be available on all platforms.

For a more general discussion of the Unix file system and how symbolic links affect your ability to know where you are at any given moment, see this Plan 9 paper (http://doc.cat-v.org/plan_9/4th_edition/papers/lexnames).


-  ⇤ ← Revision 28 as of 2013-07-04 14:00:26 → 
  Size: 9465
  Editor: Lhunath
  Comment:
+   ← Revision 46 as of 2022-02-16 22:50:37 → ⇥
  Size: 11532
  Editor: larryv
  Comment: add anchors for easier linking
-Deletions are marked like this.
+Additions are marked like this.
 Line 10:
- * {{{$0}}} is NOT the answer to your problem.  If you think it is, you can either stop reading and write more bugs, or you can accept this and read on.
+ * `$0` is NOT the answer to your problem.  If you think it is, you can either stop reading and write more bugs, or you can accept this and read on.
 Line 13:
-=== Accessing data/config files ===
+<<Anchor(config)>>
=== I need to access my data/config files ===
-Line 19:
+Line 20:
-A UNIX paradigm exists to solve this problem for you: configuration artifacts of your scripts should exist in either the user's {{{HOME}}} directory or {{{/etc}}}.  That gives your script an absolute path to look for the file, solving your problem instantly: you no longer depend on the "location" of your script:
+''A UNIX paradigm exists'' to solve this problem for you: '''configuration artifacts of your scripts should exist in either the user's `HOME` directory or `/etc`'''.  That gives your script an absolute path to look for the file, solving your problem instantly: you no longer depend on the "location" of your script:
-Line 21:
+Line 22:
-{{{
    if [[ -e ~/.myscript.conf ]]; then source ~/.myscript.conf
    elif [[ -e /etc/myscript.conf ]]; then source /etc/myscript.conf
    fi
+{{{#!highlight bash
if [[ -e ~/.myscript.conf ]]; then
    source ~/.myscript.conf
elif [[ -e /etc/myscript.conf ]]; then
    source /etc/myscript.conf
fi
-Line 27:
+Line 30:
+The same holds true for other types of data files.  Logs should be written to `/var/log` or the user's home directory.  Support files should be installed to an absolute path in the file system or be made available alongside the configuration in `/etc` or the user's home directory.
-Line 28:
+Line 32:
-=== Acting on a bundle ===
-Line 30:
+Line 33:
-Sometimes scripts are part of a "bundle" and perform certain actions within or upon it.  The user may unpack the bundle anywhere; ideally, the bundle's scripts should work whether that's somewhere in a home dir, or {{{/var/tmp}}}, or {{{/usr/local}}}.  The files are transient, and have no fixed or predictable location.
+<<Anchor(bundle)>>
=== I need to access files bundled with my script ===
-Line 32:
+Line 36:
-When a script needs to act upon other files it's bundled with, independently of its absolute location, we have two options: either we rely on {{{PWD}}} or we rely on {{{BASH_SOURCE}}}.  Both approaches have certain issues; here's what you need to know.
+Sometimes scripts are part of a "bundle" and perform certain actions within or upon it.  This is often true for applications unpacked or contained within a bundle directory.  The user may unpack or install the bundle anywhere; ideally, the bundle's scripts should work whether that's somewhere in a home dir, or `/var/tmp`, or `/usr/local`.  The files are transient, and have no fixed or predictable location.
-Line 34:
+Line 38:
-==== BASH_SOURCE ====
+When a script needs to act upon other files it's bundled with, independently of its absolute location, we have two options: '''either we rely on `PWD` or we rely on `BASH_SOURCE`'''.  Both approaches have certain issues; here's what you need to know.
-Line 36:
+Line 40:
-The {{{BASH_SOURCE}}} internal bash variable is actually an array of pathnames.  If you expand it as a simple string, e.g. {{{"$BASH_SOURCE"}}}, you'll get the first element, which is the pathname of the currently executing function or script.  The following caveats apply:
+<<Anchor(bash_source)>>
==== Using BASH_SOURCE ====
-Line 38:
+Line 43:
- * {{{$BASH_SOURCE}}} expands ''empty'' when bash does not know where the executing code comes from.  Usually, this means the code is coming from ''standard input'' (e.g. ssh host 'somecode', or from an interactive session).
 * {{{$BASH_SOURCE}}} does ''not follow'' symlinks (when you run {{{z}}} from {{{/x/y}}}, you get {{{/x/y/z}}}, even if that is a symlink to {{{/p/q/r}}}).  Often, this is the desired effect.  Sometimes, though, it's not.  Imagine your package links its start-up script into {{{/usr/local/bin}}}.  Now that script's {{{BASH_SOURCE}}} will lead you into {{{/usr/local}}} and not into the package.
+The `BASH_SOURCE` internal bash variable is actually an array of pathnames.  If you expand it as a simple string, e.g. '''`"$BASH_SOURCE"`''', you'll get the first element, which '''is the pathname of the currently executing function or script'''.  Using the `BASH_SOURCE` method, you access files within your bundle like this:
-Line 41:
+Line 45:
-If you're not writing a bash script, the {{{BASH_SOURCE}}} variable is unavailable to you.  There is a common convention, however, for passing the location of the script as the process name when it is started.  Most shells do this, but not all shells do so reliably, and not all of them attempt to resolve a relative path to an absolute path.  Relying on this behaviour is dangerous and fragile, but can be done by looking at {{{$0}}}.  Again, consider all your options before doing this: you are likely creating more problems than you are solving.
+{{{#!highlight bash
# cd into the bundle and use relative paths
if [[ $BASH_SOURCE = */* ]]; then
    cd -- "${BASH_SOURCE%/*}/" || exit
fi
read somevar < etc/somefile
}}}
{{{#!highlight bash
# Use the dirname directly, without changing directories
if [[ $BASH_SOURCE = */* ]]; then
    bundledir=${BASH_SOURCE%/*}/
else
    bundledir=./
fi
read somevar < "${bundledir}etc/somefile"
}}}
-Line 43:
+Line 62:
-==== PWD ====
+Please note that when using `BASH_SOURCE`, the ''following caveats'' apply:
-Line 45:
+Line 64:
-Another option is to rely on {{{PWD}}}, the current working directly.  In this case, you can assume the user has first {{{cd}}}'ed into your bundle and make all your pathnames relative.  To reduce fragility, you could even test whether, for example, the relative path to the script name is correct, to make sure the user has indeed {{{cd}}}'ed into the bundle:
+ * `$BASH_SOURCE` expands ''empty'' when bash does not know where the executing code comes from.  Usually, this means the code is coming from ''standard input'' (e.g. ssh host 'somecode', or from an interactive session).
 * `$BASH_SOURCE` does ''not follow'' symlinks (when you run `z` from `/x/y`, you get `/x/y/z`, even if that is a symlink to `/p/q/r`).  Often, this is the desired effect.  Sometimes, though, it's not.  Imagine your package links its start-up script into `/usr/local/bin`.  Now that script's `BASH_SOURCE` will lead you into `/usr/local` and not into the package.
-Line 47:
+Line 67:
-{{{
    [[ -e bin/myscript ]] || { echo >&2 "Please cd into the bundle before running this script."; exit 1; }
+If you're not writing a bash script, the `BASH_SOURCE` variable is unavailable to you.  There is a common convention, however, for passing the location of the script as the process name when it is started.  Most shells do this, but not all shells do so reliably, and not all of them attempt to resolve a relative path to an absolute path.  Relying on this behaviour is dangerous and fragile, but can be done by looking at `$0` ([[#Why $0 is NOT an option|see below]]).  Again, consider all your options before doing this: you are likely creating more problems than you are solving.

<<Anchor(pwd)>>
==== Using PWD ====

Another option is to rely on `PWD`, the current working directory.  In this case, you can '''assume the user has first `cd`'ed into your bundle and make all your pathnames relative'''.  Using the PWD method, you access files within your bundle like this:

{{{#!highlight bash
read somevar < etc/somefile                 # Using pathname relative to PWD
read somevar < "${PWD%/}/etc/somefile"      # Expand PWD if you want an absolute pathname

bundledir=$PWD                              # Store PWD if you expect to cd in your script.
cd /somewhere/else
read somefile < "${bundledir%/}/etc/somefile"
}}}

To reduce fragility, you could even test whether, for example, the relative path to the script name is correct, to make sure the user has indeed `cd`'ed into the bundle:

{{{#!highlight bash
if [[ ! -e bin/myscript ]]; then
    echo >&2 "Please cd into the bundle before running this script."
    exit 1
fi
-Line 53:
+Line 94:
-{{{
    if [[ ! -e bin/myscript ]]; then
        if [[ -d mybundle-1.2.5 ]]; then
            cd mybundle-1.2.5 || { echo >&2 "Bundle directory exists but I can't cd there."; exit 1; }
        else
            echo >&2 "Please cd into the bundle before running this script."; exit 1;
        fi
+{{{#!highlight bash
if [[ ! -e bin/myscript ]]; then
    if [[ -d mybundle-1.2.5 ]]; then
        cd mybundle-1.2.5 || {
            echo >&2 "Bundle directory exists but I can't cd there."
            exit 1
        }
    else
        echo >&2 "Please cd into the bundle before running this script."
        exit 1
-Line 61:
+Line 105:
+fi
-Line 63:
+Line 108:
-If you ever do need an absolute path, you can always get one by prefixing the relative path with {{{$PWD}}}: {{{echo "Saved to: $PWD/result.csv"}}}
+If you ever do need an absolute path, you can always get one by prefixing the relative path with `$PWD`: `echo "Saved to: $PWD/result.csv"`
-Line 67:
+Line 112:
-==== Config ====
+==== Using a configuration/wrapper ====
-Line 69:
+Line 114:
-If neither the {{{BASH_SOURCE}}} or the {{{PWD}}} option sound interesting, you might want to consider going the route of configuration files instead (see the previous section).  In this case, you require that your user set the location of your bundle in a configuration file, and have him put that configuration file in a location you can easily find.  For example:
+If neither the `BASH_SOURCE` or the `PWD` option sound interesting, you might want to consider going the route of configuration files instead (see the previous section).  In this case, you require that your user set the location of your bundle in a configuration file, and have him put that configuration file in a location you can easily find.  For example:
-Line 71:
+Line 116:
-{{{
    [[ -e ~/.myscript.conf ]] || { echo >&2 "First configure the product in ~/.myscript.conf"; exit 1; }
    source ~/.myscript.conf      # ~/.myscript.conf defines something like bundleDir=/x/y
    [[ ! $bundleDir ]]        || { echo >&2 "Please define bundleDir='/some/path' in ~/.myscript.conf"; exit 1; }
    cd "$bundleDir"           || { echo >&2 "Could not cd to <$bundleDir>"; exit 1; }
+{{{#!highlight bash
[[ -e ~/.myscript.conf ]] || {
    echo >&2 "First configure the product in ~/.myscript.conf"
    exit 1
}
-Line 77:
+Line 122:
-    # Now you can use the PWD method: use relative paths.
+# ~/.myscript.conf defines something like bundleDir=/x/y
source ~/.myscript.conf

[[ $bundleDir ]] || {
    echo >&2 "Please define bundleDir='/some/path' in ~/.myscript.conf"
    exit 1
}

cd "$bundleDir" || {
    echo >&2 "Could not cd to <$bundleDir>"
    exit 1
}

# Now you can use the PWD method: use relative paths.
-Line 80:
+Line 138:
-=== Why you can't just use $0 ===
+<<Anchor(wrapper)>>
A variant of this option is to use a wrapper that configures your bundle's location.  Instead of calling your bundled script, you install a wrapper for your script in the standard system `PATH`, which changes directory into the bundle and calls the real script from there, which can then safely use the `PWD` method from above:
-Line 82:
+Line 141:
-It's '''not possible''' to find the location of a script reliably in all cases. Common ways of finding a script's location depend on the name of the script, as seen in the predefined variable {{{$0}}}. But providing the script name in {{{$0}}} is only a (very common) convention, not a requirement.
+{{{#!highlight bash
#!/usr/bin/env bash
cd /path/to/where/bundle/was/installed
exec "bin/realscript"
}}}

<<Anchor(arg0)>>
=== Why $0 is NOT an option ===

Common ways of finding a script's location depend on the name of the script, as seen in the predefined variable `$0`. Unfortunately, providing the script name via `$0` is only a (common) convention, not a requirement.  In fact, `$0` is not at all the location of your script, it's the '''name''' of your process as determined by your parent.  It can be ''anything''.
-Line 86:
+Line 154:
-Your script may not actually be on a locally accessible disk ''at all''.  Consider this:
+Consider that your script may not actually be on a locally accessible disk ''at all''.  Consider this:
-Line 89:
+Line 157:
-  ssh remotehost bash < ./myscript
+ssh remotehost bash < ./myscript
-Line 91:
+Line 159:
-The shell running on remotehost is getting its commands from a pipe.  There's no script ''anywhere'' on any disk that {{{bash}}} can see.
+The shell running on remotehost is getting its commands from a pipe.  There's no script ''anywhere'' on any disk that `bash` can see.
-Line 93:
+Line 161:
-Moreover, even if your script is stored on a local disk and executed, it could ''move''.  Someone could {{{mv}}} the script to another location in between the time you type the command and the time your script checks {{{$0}}}.  Or someone could have unlinked the script during that same time window, so that it doesn't actually have a link within a file system any more.
+Moreover, even if your script is stored on a local disk and executed, it could ''move''.  Someone could `mv` the script to another location in between the time you type the command and the time your script checks `$0`.  Or someone could have unlinked the script during that same time window, so that it doesn't actually have a link within a file system any more.
-Line 97:
+Line 165:
-Even in the cases where the script is in a fixed location on a local disk, the {{{$0}}} approach still has some major drawbacks. The most important is that the script name (as seen in {{{$0}}}) may not be relative to the current working directory, but relative to a directory from the program search path {{{$PATH}}} (this is often seen with KornShell).  Or (and this is most likely problem by far...) there might be multiple links to the script from multiple locations, one of them being a simple symlink from a common {{{PATH}}} directory like {{{/usr/local/bin}}}, which is how it's being invoked.  Your script might be in {{{/opt/foobar/bin/script}}} but the naive approach of reading {{{$0}}} won't tell you that -- it may say {{{/usr/local/bin/script}}} instead.
+Even in the cases where the script is in a fixed location on a local disk, the `$0` approach still has some major drawbacks. The most important is that the script name (as seen in `$0`) may not be relative to the current working directory, but relative to a directory from the program search path `$PATH` (this is often seen with KornShell).  Or (and this is most likely problem by far...) there might be multiple links to the script from multiple locations, one of them being a simple symlink from a common `PATH` directory like `/usr/local/bin`, which is how it's being invoked.  Your script might be in `/opt/foobar/bin/script` but the naive approach of reading `$0` won't tell you that -- it may say `/usr/local/bin/script` instead.
-Line 101:
+Line 169:
-For a more general discussion of the Unix file system and how symbolic links affect your ability to know where you are at any given moment, see [[http://www.cs.bell-labs.com/sys/doc/lexnames.html|this Plan 9 paper]].
+For a more general discussion of the Unix file system and how symbolic links affect your ability to know where you are at any given moment, see [[http://doc.cat-v.org/plan_9/4th_edition/papers/lexnames|this Plan 9 paper]].