Environment Variables

Environment variables are a cause of much confusion. People generally think of "the environment" as a global, system-wide pool of settings that processes dip into. This is incorrect.

What is the "environment"?

The environment is an area of memory that belongs to each individual process. When a process creates a new process (forks), the new process's environment is generally copied from the old process's environment. As a result, the new process has an environment that is identical to, but a separate copy of, the original process's environment.
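As a minimal sketch of that copy being handed over, the snippet below exports a variable and then asks a freshly started Bash process what it finds in its own environment. The name GREETING is made up for this illustration.

```shell
#!/usr/bin/env bash
# GREETING is a hypothetical variable name, used only for this illustration.
export GREETING=Hello

# Start a new Bash process and capture what it finds in its own environment.
child_view=$(bash -c 'printf "%s" "$GREETING"')

echo "[parent] GREETING: $GREETING"
echo "[child]  GREETING: $child_view"
```

Both lines should show the same value: the child did not look anything up in the parent, it simply received its own copy when it was created.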

What this means for us Bash users is that whenever we start a new process from our Bash scripts, that process will inherit our script's environment. It's important to know that we're talking about a copy here, and that this copying only ever happens when the new process is created.
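Since the copy is made at creation time, that is also the one moment at which a child can be handed something different from what the script itself has. A small sketch, assuming a made-up variable called TMP_SETTING: Bash lets you prefix a command with an assignment, which places the variable in that one child's environment without touching the script's own.

```shell
#!/usr/bin/env bash
# TMP_SETTING is a hypothetical variable name, used only for this illustration.
unset -v TMP_SETTING

# The assignment prefix applies to the environment of this one child only.
child_view=$(TMP_SETTING=on bash -c 'printf "%s" "${TMP_SETTING-unset}"')

# Back in the script, the variable was never created.
parent_view=${TMP_SETTING-unset}

echo "[child]  TMP_SETTING: $child_view"
echo "[parent] TMP_SETTING: $parent_view"
```

The `${TMP_SETTING-unset}` expansion substitutes the word "unset" when the variable does not exist, which makes the difference between the two processes visible.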

The same mechanism applies not only to processes started from your Bash scripts, but to every process started on your system: each has inherited the environment of its parent process, has added to, modified, or removed items from its own copy, and has passed that modified environment on to whatever children it spawned.
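That chain can be sketched with three generations. The variable names STAGE and SCRATCH are hypothetical, and the sketch assumes the external `printenv` utility is installed (it usually is, but it is not guaranteed everywhere). The child Bash changes one inherited variable and removes another; the grandchild, `printenv`, then reports what it inherited.

```shell
#!/usr/bin/env bash
# STAGE and SCRATCH are hypothetical names, used only for this illustration.
export STAGE=original
export SCRATCH=temporary

grandchild_view=$(bash -c '
    STAGE="changed by child"    # Inherited variables stay linked to the env
    unset -v SCRATCH            # Remove SCRATCH from the child and its env
    printenv STAGE              # Grandchild reports its inherited STAGE
    printenv SCRATCH || echo "SCRATCH is gone"
')

echo "$grandchild_view"
echo "[parent] STAGE: $STAGE"
echo "[parent] SCRATCH: $SCRATCH"
```

The grandchild sees the child's modified copy, while the script at the top of the chain still has its original values.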

That makes no sense. Illustrate!

The following Bash code may help you visualize things:

# First, let's show what the Bash parameters MY_ENV_VAR and myParameter contain
# in the beginning.
# Assuming the parent of this script didn't have said variables in its environment,
# the script's Bash process will not have them in its environment, and these
# parameters will not yet exist.  Hence, they will expand empty.
# In our output, we'll use the tag "parent" to refer to this script's process,
# and the tag "child" to refer to output generated by a child of this script.
echo "[parent] MY_ENV_VAR: $MY_ENV_VAR"             # Prints: [parent] MY_ENV_VAR: 
echo "[parent] myParameter: $myParameter"           # Prints: [parent] myParameter: 

# Now, let's create some variables.  These commands will create a Bash parameter called
# MY_ENV_VAR and one called myParameter.  We then link MY_ENV_VAR to the environment,
# causing an env var of the same name to be created with matching content.  We will
# not export myParameter, so that one exists purely as a Bash parameter.
MY_ENV_VAR=Hello                                    # Create a parameter MY_ENV_VAR
myParameter=Hello                                   # Create a parameter myParameter
export MY_ENV_VAR                                   # Link MY_ENV_VAR to the environment
echo "[parent] MY_ENV_VAR: $MY_ENV_VAR"             # Prints: [parent] MY_ENV_VAR: Hello
echo "[parent] myParameter: $myParameter"           # Prints: [parent] myParameter: Hello

# Here's how we would update our variables.  We simply assign to the Bash
# parameters.  The parameter that's linked to the environment will cause Bash to
# update its env var of the same name as well.
MY_ENV_VAR=Bye                                      # Update MY_ENV_VAR (and its env var)
myParameter=Bye                                     # Update myParameter (Bash parameter only)
echo "[parent] MY_ENV_VAR: $MY_ENV_VAR"             # Prints: [parent] MY_ENV_VAR: Bye
echo "[parent] myParameter: $myParameter"           # Prints: [parent] myParameter: Bye

# Note that these 'echo' statements are expanding our Bash parameters, not
# environment variables.  That's why we see both expansions resulting in 'Bye' even
# though only MY_ENV_VAR is linked to an environment variable.

# Let's demonstrate the effects of the environment now.
# As I said, when we create a new process, we copy our script's environment into the new
# process' environment.  Our script's environment holds only MY_ENV_VAR, not myParameter
# (since we only exported the former).
# Here, we create a new Bash process that runs some bash code which will expand our
# parameters again.
# The new Bash process has noticed its environment contains an environment variable
# named MY_ENV_VAR, so it has created a Bash parameter with the same name.  Since there
# is no myParameter on the environment, it hasn't created this parameter.  It therefore
# expands empty.
bash -c 'echo "[child] MY_ENV_VAR: $MY_ENV_VAR"'    # Prints: [child] MY_ENV_VAR: Bye
bash -c 'echo "[child] myParameter: $myParameter"'  # Prints: [child] myParameter: 

# Our child has only MY_ENV_VAR but back in our script's process, we still have both
# parameters.

# Let's demonstrate that a child Bash process' version of MY_ENV_VAR is not the same
# as our script's version of it.
# Here, we modify MY_ENV_VAR in the child, which will update the child's environment
# variable of the same name.  But when we look at MY_ENV_VAR in our own script's
# process afterwards, we still find the old value: the child can update its own
# environment, but that does not affect the parent's copy.
echo "[parent] MY_ENV_VAR: $MY_ENV_VAR"             # Prints: [parent] MY_ENV_VAR: Bye
bash -c '
    echo "[child] MY_ENV_VAR: $MY_ENV_VAR"          # Prints: [child] MY_ENV_VAR: Bye
    MY_ENV_VAR="Hello again"                        # Update the child's MY_ENV_VAR
    echo "[child] MY_ENV_VAR: $MY_ENV_VAR"          # Prints: [child] MY_ENV_VAR: Hello again
'
echo "[parent] MY_ENV_VAR: $MY_ENV_VAR"             # Prints: [parent] MY_ENV_VAR: Bye

# Let's try a trick now to demonstrate the reverse: Updating MY_ENV_VAR in the parent
# also doesn't update the child's MY_ENV_VAR.
# Here, we'll start a child process that shows its own version of MY_ENV_VAR, waits a
# little while, and then shows it again.  While the child is waiting, we'll update
# MY_ENV_VAR in our main script's copy.
# This change in the main script will not affect our child's second output of MY_ENV_VAR,
# since the child's version of it is a copy at the time of the child's creation and
# any changes to it after it was copied are not carried over.
echo "[parent] MY_ENV_VAR: $MY_ENV_VAR"             # Prints: [parent] MY_ENV_VAR: Bye
bash -c '
    echo "[child] MY_ENV_VAR: $MY_ENV_VAR"          # Prints: [child] MY_ENV_VAR: Bye
    sleep 2                                         # Wait 2 seconds.
    echo "[child] MY_ENV_VAR: $MY_ENV_VAR"          # Prints: [child] MY_ENV_VAR: Bye
' &
sleep 1                                             # Wait only 1 second.
MY_ENV_VAR="Hello again"                            # Update the parent's MY_ENV_VAR
echo "[parent] MY_ENV_VAR: $MY_ENV_VAR"             # Prints: [parent] MY_ENV_VAR: Hello again
wait                                                # Let the background child finish first

# Now, you'll actually see the child's second 'echo' statement happening AFTER the
# parent's second 'echo'.  This is the chronological output:
# [parent] MY_ENV_VAR: Bye
# [child] MY_ENV_VAR: Bye
# [parent] MY_ENV_VAR: Hello again
# [child] MY_ENV_VAR: Bye
# As you can see, having updated the parent's MY_ENV_VAR while the child was waiting
# has not changed the child's version of MY_ENV_VAR.
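
If you ever lose track of which of your Bash parameters are linked to the environment, you can ask Bash itself. As a short sketch reusing the two names from the script above, `declare -p` prints each parameter together with its attributes, and an exported parameter carries the `-x` attribute:

```shell
#!/usr/bin/env bash
MY_ENV_VAR=Hello
myParameter=Hello
export MY_ENV_VAR                        # Only this one is linked to the environment

exported_decl=$(declare -p MY_ENV_VAR)   # Starts with: declare -x
plain_decl=$(declare -p myParameter)     # Starts with: declare --

echo "$exported_decl"
echo "$plain_decl"
```

The parameter shown with `-x` is the one that children will inherit; the one shown with `--` exists only inside this Bash process.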