Variables and Environment Variables In Bash
Most programming languages make heavy use of ordinary variables. They also (generally) support using environment variables.
Shells like bash - also heavily used for writing shell scripts -
have a slightly different audience in mind than normal programming
languages. They make heavy use of environment variables. So much so
that it can sometimes become blurry whether something is a normal
variable or an environment variable.
In this post, I will dig into variables and environment variables in
shell, particularly bash, to help you understand how to distinguish
the two and when to use them.
I am going to assume you have a little bit of experience with bash
and understand what a basic shell command like cat file.txt does.
Let’s get started.
Creating and using variables
Like pretty much any other programming language, shells support
variables. A variable is basically a value that has a name. It is
local to the running program. Variables can be local to bash
functions or global to the entire bash script.
Defining a variable is easy in bash:
variable=value
Leaving out the value defines a variable with an empty value:
variable=
Spacing matters when defining variables and setting their values.
Every line of a bash script is also a command for the shell, so adding
a space will result in an error or do something that you might not
expect at first:
# this runs the program `var` with the argument `=value`
var =value
# this runs the program `var` with 2 arguments: `=` and `value`
var = value
# this will make more sense later, but it runs the program `value`!
var= value
To access a variable, use a dollar sign ($). echo is great for
seeing the result of the variable:
variable=value
echo $variable
You can mix variables and your own text freely:
name=John
echo Hello $name
You can use ${ and } to specify where a variable name starts and
ends. For example, let’s say you want to show the file readme_2020.txt:
filename=readme
cat ${filename}_2020.txt
The underscore (_) is a valid part of a variable name. If you were
to write this as cat $filename_2020.txt, bash would understand
this as a single variable named filename_2020. It would treat it as
if you had written cat ${filename_2020}.txt.
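Here is a small sketch of that difference. It uses echo instead of cat so it runs without the file existing, and it defines a second, made-up variable named filename_2020 purely to make the lookup visible:

```bash
filename=readme
filename_2020=other   # hypothetical variable, only here to show which name bash reads

# Without braces, bash reads the variable filename_2020
without_braces="$filename_2020.txt"
# With braces, bash reads filename and then appends _2020.txt
with_braces="${filename}_2020.txt"

echo "without braces: $without_braces"   # prints: without braces: other.txt
echo "with braces:    $with_braces"      # prints: with braces:    readme_2020.txt
```

If filename_2020 were not defined at all, the first form would expand to just .txt.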
Using the ${name} notation also lets us perform a number of actions,
such as pattern-based search and replace (the patterns are shell
globs, not regular expressions). The Bash Hackers Wiki has a great
reference on the different types of these actions.
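To give a flavor of what is possible, here is a short sketch of a few common expansions. The variable names (file, editor_choice) are made up for illustration, and the patterns are shell glob patterns:

```bash
file=docs/readme_2020.txt

# Replace the first match of a pattern
replaced="${file/2020/2021}"        # docs/readme_2021.txt
# Remove the shortest matching suffix
no_ext="${file%.txt}"               # docs/readme_2020
# Remove the longest matching prefix
base="${file##*/}"                  # readme_2020.txt
# Fall back to a default when the variable is unset or empty
unset editor_choice
editor="${editor_choice:-vi}"       # vi
# Length of the value in characters
length="${#file}"                   # 20

echo "$replaced $no_ext $base $editor $length"
```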
The variables you have seen till now have been simple pieces of text. Both the name and the value were simple text strings. Each variable had a single value.
bash supports additional types of variables, such as arrays and
integers.
bash lets us define arrays using the array notation.
array=(value1 value2)
However, for some other types of variables, such as integers and
read-only (also known as constant) variables, we have to declare the
type first.
Using declare to customize variables
We can also customize the type of a variable using the declare
builtin. The syntax is declare [OPTION] variable_name[=value], where
both the option and the value are optional.
# declare a normal variable
declare variable
# declare an array named days_of_week
declare -a days_of_week
# declare that the variable ten has the read-only value 10
declare -r ten=10
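To see what these attributes actually change, here is a rough sketch. The names total, plain and limit are made up for the example, and -i is the integer option mentioned earlier:

```bash
# -i gives the variable the integer attribute: assignments are evaluated as arithmetic
declare -i total=5
total=total+1          # arithmetic, so total is now 6

# Without -i the same kind of assignment is plain text
plain=5
plain=plain+1          # the literal string plain+1

# -r makes the variable read-only; a later assignment fails.
# The attempt runs in a subshell so the failure cannot stop this script.
declare -r limit=10
( limit=20 ) 2>/dev/null || echo "limit is read-only and still $limit"

echo "$total $plain"   # prints: 6 plain+1
```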
However, declare is often optional. You can just assign the value;
declare is only needed if the value would otherwise be understood
as something different.
For example, you can just create an array using the array notation without having to declare it.
days_of_week=(Monday Tuesday Wednesday Thursday Friday Saturday Sunday)
echo The first day of the week is ${days_of_week[0]}
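Arrays support a few more operations than indexing. The sketch below shows some that I find handy; the names total, last and weekend are made up for the example:

```bash
days_of_week=(Monday Tuesday Wednesday Thursday Friday Saturday Sunday)

# Number of elements
total=${#days_of_week[@]}                 # 7
# The index is an arithmetic expression, so total-1 is the last element
last=${days_of_week[total-1]}             # Sunday
# A slice: 2 elements starting at index 5
weekend=("${days_of_week[@]:5:2}")        # Saturday Sunday

# Quoting "${array[@]}" keeps every element as a single word
for day in "${weekend[@]}"; do
  echo "Weekend day: $day"
done
```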
Using special and built-in variables
bash also has several built-in and special variables. They allow us to
do a few cool things like access the arguments of a running shell
script, the process id of the current script, and access the exit code
of the last command we used. Here are some examples:
echo The name of this script is $0
echo The process id of this script is $$
The Bash Hackers Wiki has a great guide covering all the special and built-in variables.
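Arguments and exit codes work the same way inside a function, which makes them easy to try out. This is a sketch with a made-up function named show_args; inside a function, $#, $1 and $@ refer to the function's own arguments:

```bash
# Hypothetical helper that reports on the arguments it receives
show_args() {
  echo "number of arguments: $#"
  echo "first argument: $1"
  for arg in "$@"; do
    echo "argument: $arg"
  done
}

show_args one "two words" three

# $? holds the exit code of the last command, so read it right away
true
after_true=$?                  # 0
false || after_false=$?        # 1

echo "true exited with $after_true, false exited with $after_false"
```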
Understanding, creating and using environment variables
Environment variables are a way to specify the environment in which a program is running. They are useful for providing a way to customize how a process behaves. For example, The 12 Factor App guidelines require that all configuration for a program be specified via environment variables.
Environment variables are simple key-value pairs containing some data. It is not possible to have the more complex data types that are available for normal variables.
Unlike variables, which are a part of the currently running shell script (or program), the environment is part of the process tree. Each process has its own copy of the environment.
There are some basic properties that define how the environment (and environment variables) behaves. These properties apply to all processes, including programs, shell scripts and interactive shell sessions:
- Each process starts with an environment that is a copy of its parent process’s environment, by default.
- A parent process can specify and customize the environment of a process it is spawning.
- Each process can modify its own environment.
That’s about it. There are several implications of these rules:
- No process can modify the environment of any other process after that other process has been created.
- The only way a process can modify the environment of another process is before creating that other process itself.
- A program created by the shell can not modify the environment of the shell.
To make an environment variable in bash you have to use the export
builtin:
export ENV_VAR
That marks the name ENV_VAR for export. It becomes visible in the environment of child processes once it has been given a value.
If you have paid attention to the properties above, you might have
realized that export can’t be a separate program that the shell
spawns and runs. If it were, export couldn’t modify the environment
variables of the process that created it.
export is actually a built-in in shells like bash. bash itself
understands the meaning of export and modifies its own environment
variables.
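You can watch the copy-on-spawn behavior with a child shell. This is a minimal sketch; GREETING is a made-up name, and it assumes bash is available on your PATH:

```bash
export GREETING=hello

# The child shell gets a copy of the environment and changes only that copy
child_view=$(bash -c 'GREETING=goodbye; echo "$GREETING"')

echo "child saw:  $child_view"     # prints: child saw:  goodbye
echo "parent has: $GREETING"       # prints: parent has: hello
```

The child happily changes its own GREETING, but the parent's value is untouched once the child exits.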
It’s common convention to use lowercase for normal variables and uppercase for environment variables. But it’s not required. You can use uppercase for variables and lowercase for environment variables.
You can give an environment variable a value after defining it:
export ENV_VAR
ENV_VAR=value
You can also create and assign a value at the same time:
export ENV_VAR=value
The only difference in how normal variables and environment variables
are defined is the presence of the export builtin.
Environment variables are used exactly like normal variables: via
$NAME or ${NAME}. There’s no difference between them. The only way
to find out if a variable is a normal variable or an environment
variable is by checking how it was defined (preferred) or through a
specialized command like env (not a great idea).
Modifying a variable (if it’s not read-only) or environment variable doesn’t change its type.
Re-exporting an environment variable has no impact.
# PATH and HOME are common and well-known environment variables
export PATH=$PATH:$HOME/bin
# same as above
PATH=$PATH:$HOME/bin
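If you would rather not experiment on PATH, the same behavior can be checked with a throwaway variable. In this sketch STAGE is a made-up name, and a child bash is used to show what ends up in the environment:

```bash
export STAGE=one

# A plain assignment; STAGE keeps its exported status
STAGE=two

# Single quotes make the child, not the parent, expand $STAGE
child_stage=$(bash -c 'echo "$STAGE"')
echo "child sees STAGE=$child_stage"    # prints: child sees STAGE=two
```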
This is why you will often see shell configuration asking to add
things to your ~/.profile file and use snippets like this:
# PATH is already a well known and already-initialized environment variable
# No need to export it
PATH=$PATH:$HOME/path/where/you/installed/program
# This is a new environment variable
# It needs to be exported
export DOTNET_ROOT=$HOME/path/where/you/installed/program
If the DOTNET_ROOT environment variable was not exported, it would
be initialized as a normal variable within the shell and not be made
available to other programs that are looking for this environment
variable.
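The difference is easy to see by asking a child process what it received. This is a sketch with made-up names (normal_var and EXPORTED_VAR), again assuming bash is on your PATH:

```bash
normal_var=local-only
export EXPORTED_VAR=shared

# Single quotes stop the parent from expanding the variables; the child expands them
child_normal=$(bash -c 'echo "${normal_var:-<not set>}"')
child_exported=$(bash -c 'echo "${EXPORTED_VAR:-<not set>}"')

echo "child sees normal_var as:   $child_normal"     # <not set>
echo "child sees EXPORTED_VAR as: $child_exported"   # shared
```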
You can also set environment variables for a single command by passing the environment variable names and values just before the command name.
ENV_VAR=VALUE program
This sets the environment variable ENV_VAR to the value VALUE only
for the program.
The syntax is like this:
var1=value1 [var2=value2 var3=value3 ...] command
A common style used in compiling code is something like this:
CC=gcc make
This runs make with the environment variable CC set to the value
gcc. In turn, make understands this to mean that make should use
gcc as the C compiler.
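To convince yourself that the variable really exists only for that one command, you can try something like the sketch below. MODE is a made-up name, and a child bash stands in for the program:

```bash
unset MODE

# MODE is placed only in the environment of this one bash process
during=$(MODE=debug bash -c 'echo "$MODE"')

# Afterwards the current shell still has no MODE
after="${MODE:-<not set>}"

echo "during: $during, after: $after"   # prints: during: debug, after: <not set>
```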
Let’s go back to the example we saw earlier in the section about defining variables (not environment variables) with extra spaces.
var= value
You can now see why this runs the program value with the
environment variable var set to the empty value.
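Since there is probably no program called value on your system, this sketch swaps in env, which prints the environment it was given. The names seen and in_shell are only there to capture the results:

```bash
unset var

# Same shape as `var= value`, with env as the program being run
seen=$(var= env | grep '^var=')
echo "env received: $seen"                       # prints: env received: var=

# The shell itself never got a variable named var
in_shell="${var+yes}"
echo "set in this shell? ${in_shell:-no}"        # prints: set in this shell? no
```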
Summary
You should now have a really good idea about what bash variables and environment variables are, how they differ, and when you might want to use them.
This post was inspired by this Stack Overflow question.