Shell scripting with Bash

Naming your shell script

Don’t choose an already taken name like test or ls for your shell script. You can run type foo to check whether a command named foo already exists. It makes sense to add the suffix .sh to your shell script file name, but it only serves as a visual hint for users. By convention you should use underscores to separate words in the filename.

Make your shell script executable just for you with chmod u+x my_script.sh or for all with chmod a+x my_script.sh.

To locally invoke your shell script enter ./my_script.sh. To make your shell scripts available system wide (for you) you can create a bin folder, cd into it and then add that folder (temporarily) to your PATH variable with PATH=$PATH:$(pwd). You can add that folder permanently by changing your user profile file.

Defining the shell in your script

Start your file with #!/bin/bash to let the OS know which shell you intend to use when your script runs. It is good practice to add comments such as the script name, what the script does and the author:

#!/bin/bash

# run_something
# This script does ...
# Author: ...

Writing to console

echo

You can echo "some things" or echo 'some things' or echo some things without quotes. Add -n to suppress the trailing newline. Add -e to allow escape sequences like tab \t or backspace \b. But the -n and -e options are not handled the same way by every shell, so scripts that rely on them are not portable.

printf

You are better off using printf, which, unlike echo, does not append a newline by default: printf "Hi my name is %s\n" "$USER".

Works with multiple arguments: printf "p%st\n" a e i o u prints pat, pet, pit, pot, put – each on a new line.

The following will print a directory listing in columns of width 20:

printf "|%20s |%20s |%20s |\n" $(ls)

To store the value to a variable instead of displaying it use printf -v my_var "Hey %s" $USER.
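As a small sketch of the formatting options mentioned above, printf can pad columns and, with -v, store the result instead of printing it. The variable names and values here are made up for illustration.

```shell
#!/bin/bash
# Hypothetical values, chosen only for illustration.
name="alice"
count=3

# %-10s left-aligns text in a 10-character column,
# %5d right-aligns a number in a 5-character column.
printf -v row "|%-10s|%5d|" "$name" "$count"

# Print the stored string; the format string stays fixed and the data is an argument.
printf '%s\n' "$row"
```

Passing the data as an argument instead of embedding it in the format string keeps stray % characters in the data from being interpreted.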

Heredoc syntax

cat <<END
  everything that I
  write here will be
  written to console exactly
  the same way as you can see
END

Using commands in your script

Use command substitution such as $(date) within your script.

In the following example we store a command in a variable and try to run it. Some of the attempts fail:

$ abc='ls -l "/tmp/test/my dir"'

$ $abc
ls: cannot access '"/tmp/test/my': No such file or directory
ls: cannot access 'dir"': No such file or directory

$ "$abc"
bash: ls -l "/tmp/test/my dir": No such file or directory

$ bash -c $abc
'my dir'

$ bash -c "$abc"
total 0

$ eval $abc
total 0

$ eval "$abc"
total 0
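A common way to avoid these quoting problems is to store the command in an array instead of a string, so that every argument stays a separate word. The sketch below uses a scratch directory created with mktemp; the directory and file names are illustrative only.

```shell
#!/bin/bash
# Create a scratch directory whose name contains a space.
tmp=$(mktemp -d)
mkdir "$tmp/my dir"
touch "$tmp/my dir/file.txt"

# Each array element is exactly one argument, spaces included.
cmd=(ls "$tmp/my dir")

# "${cmd[@]}" expands to the command and its arguments, one word per element.
listing=$("${cmd[@]}")
echo "$listing"

# Clean up the scratch directory.
rm -r "$tmp"
```

No eval and no second shell is needed, because the words are never re-parsed.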

Variables

You can store strings to variables with my_var="whatever" (important: no space around =) and use the variable like echo $my_var or echo ${my_var}file.txt.

Again, there must not be any spaces before and after =. It is a good practice to use double quotes around your variable values. Use lowercase names for your own variables, because uppercase names are conventionally used for environment and shell variables such as $HOME, $PATH, $EDITOR or $USER, and you could overwrite one by accident.

Dynamic variables

In the following example we define two fixed variables target_staging and target_prod and one dynamic variable final_target that is composed based on the user-provided command argument $1. For example, my_script.sh staging will result in Final target: /var/www/staging.

target_staging="/var/www/staging"
target_prod="/var/www/prod"
final_target=target_$1

echo "Final target: ${!final_target}"
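Because the variable name is built from user input, it can help to check that the lookup actually found something. The following is a minimal sketch; resolve_target is a hypothetical helper name and not part of the script above.

```shell
#!/bin/bash
target_staging="/var/www/staging"
target_prod="/var/www/prod"

# resolve_target is a hypothetical helper used only for this illustration.
resolve_target() {
  local var_name="target_$1"
  # ${!var_name} looks up the variable whose name is stored in var_name.
  # The :- default yields an empty string if no such variable is set.
  local value="${!var_name:-}"
  [[ -n $value ]] || return 1
  printf '%s\n' "$value"
}

resolve_target staging
resolve_target nightly || echo "unknown target: nightly" >&2
```

The sketch assumes the argument only contains characters that are valid in a variable name.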

Variables can have attributes, which are set with declare -<letter> and removed again with declare +<letter>.

Integer variables

You can declare -i my_var to only contain integers, but it won’t give you an error if you try to assign a plain word: bash treats the word as a variable name, and the value becomes 0 if no such variable is set. Setting the i attribute allows you to do calculations in a “friendly” way.

declare -i result;
result="3+5"
echo $result

Results in 8, but without declare it is "3+5". An alternative that does not need declare is to use let:

let result="3+5"
echo $result

or better, instead of let you should use the recommended (( )) syntax, which also allows spaces:

(( result=$(ls -lsa|wc -l) ))
echo $result

If you add a $ in front of this syntax, then you have a substitution, meaning the calculation gets evaluated and can be assigned to a variable:

result=$((5 + 3))
((++result))
echo $result #9

To get a random number use $RANDOM.

Constants

You declare a constant (a readonly variable) with declare -r my_constant="something". Trying to change the value will result in an error.
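To see the error without aborting anything, the assignment can be attempted in a subshell. This is only a sketch; max_retries is an illustrative name.

```shell
#!/bin/bash
declare -r max_retries=3

# Try the assignment in a subshell so that the failure stays contained.
# The error message goes to stderr, which is discarded here.
if ( max_retries=5 ) 2>/dev/null; then
  status="changed"
else
  status="readonly"
fi

echo "$status: max_retries is still $max_retries"
```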

Exporting variables

By default variables are local to the script or to the terminal session respectively:

#!/bin/bash
# outer.sh
my_var="out"
echo "Outer before: $my_var"  # will be 'out'
./inner.sh
echo "Outer after: $my_var"   # will be 'out'
echo "Outer: $my_other"       # will be empty

#!/bin/bash
# inner.sh
echo "Inner before: $my_var"   # will be empty
my_var="in"
echo "Inner after: $my_var"    # will be 'in'
my_other="other in"

Results in:

Outer before: out
Inner before: 
Inner after: in
Outer after: out
Outer: 

What you can do to make outer variables usable in inner scripts is to change outer.sh to use either

  • source ./inner.sh or
  • declare -x my_var="Hey" or
  • export my_var="Hey"

#!/bin/bash
# outer.sh
export my_var="out"           # we export the variable to make it available in inner.sh
echo "Outer before: $my_var"  # will be 'out'
./inner.sh
echo "Outer after: $my_var"   # will be 'out'
echo "Outer: $my_other"       # will be empty

#!/bin/bash
# inner.sh
echo "Inner before: $my_var"   # will be 'out'
my_var="in"
echo "Inner after: $my_var"    # will be 'in'
my_other="other in"

Will result in:

Outer before: out
Inner before: out
Inner after: in
Outer after: out
Outer: 

Think of export my_var as “implanting” my_var into inner.sh.

As you see, exporting that variable has no effect on $my_other, it is still empty. To make $my_other (which is set in inner.sh) available to outer.sh, we source inner.sh:

#!/bin/bash
# outer.sh
my_var="out"
echo "Outer before: $my_var"  # will be 'out'
source ./inner.sh             # source it
echo "Outer after: $my_var"   # will be 'in'  <-- this will be overridden with inner value
echo "Outer: $my_other"       # will be 'other in'

#!/bin/bash
# inner.sh
echo "Inner before: $my_var"   # will be 'out'
my_var="in"
echo "Inner after: $my_var"    # will be 'in'
my_other="other in"

Will result in:

Outer before: out
Inner before: out
Inner after: in
Outer after: in
Outer: other in

Think of source inner.sh as including/yielding it into outer.sh.

Bash Caveats

Tilde expansion

#!/bin/bash
export PROJECT_DIR="~/code/app"    # don't do this: tilde does not expand in quotes, use "$HOME/code/app" instead

Expansion order

#!/bin/bash
user_variable='$1'
echo "export USER_VAR=$user_variable" > ~$user_variable/myfile.txt

will result in an error:

./inner.sh: line 3: ~$1/myfile.txt: No such file or directory

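Here is why: tilde (~) expansion happens before variables are evaluated. At that point bash looks for a user literally named $user_variable, finds none, and leaves the tilde unexpanded. Only afterwards is the variable replaced by its value, which gives the path ~$1/myfile.txt.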

In bash the order of expansions is: brace expansion, tilde expansion, parameter, variable and arithmetic expansion and command substitution (done in a left-to-right fashion), word splitting, and pathname expansion.

Arrays

# Array using compound assignment
my_array=(zero one two)

# or with explicit index numbers
my_array=([1]=10 [2]=20 [3]=30)

# or element by element (unset first to start from an empty array)
unset my_array
my_array[0]="zero"
my_array[1]="one"
my_array[2]="two"
my_array[5]="five"

echo "${my_array[1]}" # one
echo "${my_array[@]}" # zero one two five
echo "${my_array[*]}" # zero one two five
echo "${#my_array[@]}" # 4 (number of elements)
echo "${!my_array[@]}" # 0 1 2 5 (indices)

Let me stress it: Do not use commas to separate array values in shell scripting.

# add to array
my_array+=('foo')
declare -p my_array
# logs: declare -a my_array=([0]="zero" [1]="one" [2]="two" [5]="five" [6]="foo")

Reading command line arguments

When running my_script.sh hey you then in the script $1 contains hey, $2 contains you. 10th and following arguments need curly braces ${10}.

$* contains hey you and "$*" with double quotes returns “hey you”.

$0 holds the name of the script as it was called which can be different from the script file name:

  • If you run the script and it has to be looked up on the $PATH first, then the whole absolute path name is returned.
  • If you call it using a relative path, then the relative path is returned.
  • If you call it via a symlink, then the name of the symlink is returned.

$# contains the number of script arguments. $? contains exit status of the last command. To get the length of a string variable use ${#my_var}.

$@ contains all the arguments and is equivalent to $1 $2 ... $N. If you use "$@" (with double quotes), each argument is expanded as if it were individually double-quoted and can therefore contain spaces. This also means that my-command "first arg" second third would see 4 arguments if you use $@ or $* in your script, but 3 arguments if you use "$@", which is what you usually want.
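The difference is easy to check with a helper that only reports how many arguments it received. This is a sketch; count_args and demo are hypothetical names.

```shell
#!/bin/bash
# count_args is a hypothetical helper that prints its number of arguments.
count_args() { echo "$#"; }

# demo forwards its own arguments in four different ways.
demo() {
  echo "unquoted star: $(count_args $*)"
  echo "unquoted at:   $(count_args $@)"
  echo "quoted star:   $(count_args "$*")"
  echo "quoted at:     $(count_args "$@")"
}

demo "first arg" second third
```

With the default IFS this reports 4, 4, 1 and 3: the unquoted forms split "first arg" in two, "$*" joins everything into one word, and "$@" keeps the three original arguments.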

You can shift your arguments, so $2 -> $1, $3 -> $2 etc. This comes in handy when you want to process each argument one by one and don’t want to use a for loop for that. You can also shift by two or more positions at once, e.g. shift 2.
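A minimal sketch of such a shift loop could look like this; process_all is a hypothetical function name.

```shell
#!/bin/bash
# process_all is a hypothetical function that handles its arguments one by one.
process_all() {
  local n=0
  while (( $# > 0 )); do
    n=$(( n + 1 ))
    echo "argument $n: $1"
    shift   # drop $1; the former $2 becomes the new $1
  done
}

process_all alpha "beta gamma" delta
```

The loop ends when $# reaches 0, so it works for any number of arguments, including none.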

getopts

To read in command line options like my-command -a -b (in this case -a and -b are the options) you use getopts ab opt, where opt is the variable that receives the current option. To read an option argument, e.g. my-command -f myfile, add a colon after the option letter: getopts f: opt.

You can while-loop your options in your script:

while getopts ":ab:" opt; do
  case $opt in
    a)
      flag="whatever"
      ;;
    b)
      [[ ${OPTARG} =~ ^[0-9]+$ ]] || { echo "${OPTARG} is not a number" >&2; exit 1; }
      somenum="${OPTARG}"
      ;;
    :)
      echo "Option -${OPTARG} is missing an argument" >&2
      exit 1
      ;;
  esac
done

In mycommand -a -b fileA fileB, fileA will not be processed as an option because it does not start with a dash. fileA is just another argument, unless you define it as an option argument for -b by writing b: in the option string. The option -- (double dash) marks the end of options as well, e.g. mycommand -a -b -- -diff. This allows you to add arguments at the end of the command line that start with a dash but will not be processed by getopts.

The current option argument is held in ${OPTARG}. OPTIND holds the index of the next argument that getopts is going to process.

Let’s say you have mycommand -a hey -b you stop. Getopts does not handle stop because it is not an option, it is an argument. To access the value of stop in your script you can ‘shift away’ the options with shift $(( OPTIND -1 )). Now stop is accessible as $1.

By default getopts handles errors for you, for example if you run a command with an option that you did not specify in your script, then it will output an error. Additionally the option variable NAME contains "?".

If you want to process errors yourself, then you have to start your getopts with a colon getopts ":a". If there is an unknown option it will still add a "?" to variable NAME but also put the option in OPTARG. If there is a missing option argument then NAME contains ":".
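The following sketch puts the silent error mode together with OPTIND. The function name parse and the options -v and -f are assumptions made up for this illustration.

```shell
#!/bin/bash
# parse is a hypothetical wrapper so the option loop can be called repeatedly.
parse() {
  # A local OPTIND makes getopts start from the first argument on every call.
  local opt OPTIND=1 OPTARG
  local verbose="no" file=""
  # Leading colon: getopts stays silent and reports errors via "?" and ":".
  while getopts ":vf:" opt; do
    case $opt in
      v) verbose="yes" ;;
      f) file=$OPTARG ;;
      \?) echo "unknown option -$OPTARG"; return 1 ;;
      :) echo "option -$OPTARG needs an argument"; return 1 ;;
    esac
  done
  # Drop the processed options so that $1 is the first plain argument.
  shift $(( OPTIND - 1 ))
  echo "verbose=$verbose file=$file rest=$*"
}

parse -v -f notes.txt extra
parse -x || echo "(exit status $?)"
parse -f || echo "(exit status $?)"
```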

Reading user input

Use read my_var to get user input, which gets stored in my_var. If you omit my_var, the input is implicitly stored in the variable REPLY. You can specify more than one variable: the first word goes into the first variable, the second word into the second, and so on, with the last variable receiving the rest of the line. Instead of splitting on whitespace, you can set the IFS (Internal Field Separator) variable to choose another delimiter.

#!/bin/bash

IFS=";"
read a b
echo $a and $b

# if input is 'one;two;three'
# then output would be 'one and two three'
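Setting IFS globally also changes how later unquoted expansions are split. A possible alternative, sketched here with a fixed sample string instead of real user input, is to set IFS for the read command only:

```shell
#!/bin/bash
# Sample input; in a real script this would come from the user or a file.
line="one;two;three"

# The IFS=";" prefix applies to this read only; <<< feeds the string to stdin.
IFS=";" read -r first rest <<< "$line"

echo "first=$first"
echo "rest=$rest"
```

Here rest keeps the remaining text including its delimiter (two;three), because the variables are quoted when printed and IFS is back to its default.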

Use read -p "Enter your name: " my_var to display a prompt. Use -n to read at most a specified number of characters, stopping early at a newline, or use -N to read exactly that many characters, newlines included. -s will suppress output (useful for passwords), meaning you don’t see what you are typing in. You probably always want to use -r so that backslashes are read literally instead of being treated as escape characters or line continuations.

#!/bin/bash

echo -n "Are you sure (Y/N)? "

answered=
while [[ ! $answered ]]; do
  read -r -n 1 -s answer
  if [[ $answer = [yY] ]]; then
    answered="yes"
  elif [[ $answer = [Nn] ]]; then
    answered="no"
  fi
done

printf "\n%s\n" $answered

Debugging

To output every executed line you can debug your script with #!/bin/bash -x. To debug only certain lines, add a line set -x to start debugging, followed by set +x a few lines later to end debugging.

Conditional expressions

if [[  $str  ]]; then
  # Code to execute if testcode has code 0
elif [[ $foo ]]; then
  # Code
else
  # Code to execute if testcode has code other than 0
fi 

or on command line:

if testcode; then successcode; else failcode; fi

The double square brackets [[ and ]] are special syntax that works in bash. No quotes are needed around variables.

Single square brackets [ and ] also exist; [ is not special syntax but an ordinary command, equivalent to test. It is harder to use and makes mistakes more likely, for example when a variable is left unquoted. That’s why it should only be used for portability.

if [[ $str ]] is true if str is not empty. Note the mandatory spaces after [[ and before ]].

To check if variable is empty:

if [ -z "$var" ]
then
      echo "\$var is empty"
else
      echo "\$var is NOT empty"
fi

Simple check if variable is NOT empty:

[ "$var" ] && echo "not empty" || echo "empty"

To check using conditional and/or:

if [[ $1 = "staging" || $1 = "prod" ]]; then
    # ...
fi

if [[ $1 = "deploy" && $2 = "run" ]]; then
    # ... 
fi

Using -o instead of || and -a instead of && inside single brackets is deprecated and not recommended.

You can compare if $str holds a value with [[ $str = "something" ]]. Note that it is only one = and that it has spaces around it. Without the spaces, [[ $str="something" ]] is read as a single non-empty string, which is always true.

To negate add a ! like [[ ! $str ]] or [[ $str != "something" ]].

Use [[ -e $filename ]] to check if file exists and -d for directory.

To check for a pattern for example if a user input contains Y or y:

if [[ $answer = [yY] ]]; then
  answered="yes"
elif [[ $answer = [Nn] ]]; then
  answered="no"
fi

See all available comparisons by running help test.

Return codes

The value returned by your script upon exit can be between 0 and 255, where only 0 means success:

if [[ ! $1 ]]; then
  echo "Missing argument"
  exit 1
fi

Comparing numbers

Bash only handles integers. Use [[ 1 -eq 2 ]] for equality testing. Others are -ne (not equal), -lt (less than), -gt (greater than). Do not use =, >, < for numbers, because they work on strings only to compare alphabetical order!
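The pitfall shows up as soon as the numbers have a different number of digits. The sketch below compares 9 and 10 in three ways; the variable names are illustrative.

```shell
#!/bin/bash
a=9
b=10

# Inside [[ ]], < compares strings: "9" sorts after "10".
if [[ $a < $b ]]; then string_result="smaller"; else string_result="not smaller"; fi

# -lt compares integers.
if [[ $a -lt $b ]]; then test_result="smaller"; else test_result="not smaller"; fi

# (( )) is arithmetic context, where < is a numeric comparison as well.
if (( a < b )); then arith_result="smaller"; else arith_result="not smaller"; fi

echo "string comparison: 9 is $string_result than 10"
echo "-lt comparison:    9 is $test_result than 10"
echo "(( )) comparison:  9 is $arith_result than 10"
```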

Standard Streams

number  File descriptor  special file
0       stdin            /dev/stdin
1       stdout           /dev/stdout
2       stderr           /dev/stderr

There is also /dev/null which discards all data that was sent to it.

Redirecting streams

To discard errors from a command you redirect 2 (stderr) to /dev/null:

cmd 2> /dev/null

So more abstract: You use N> to redirect a stream (N is the stream number).

To redirect stdout (1, or leave it out because it is the default) to the stderr stream:

>&2 (or 1>&2)

Or to redirect all errors to standard output (e.g. to log error and normal output to a file):

2>&1

So more abstract: You use N>&M to redirect stream N to stream M (N and M are stream numbers). The shorthands &> and >& are bash-specific, so avoid them in scripts that need to be portable.
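Redirections are processed from left to right, which matters when several of them are combined. The sketch below captures the streams of a hypothetical function called noisy in three different ways.

```shell
#!/bin/bash
# noisy is a hypothetical command that writes one line to each stream.
noisy() {
  echo "normal output"
  echo "error output" >&2
}

# Discard stderr, keep stdout.
only_stdout=$(noisy 2>/dev/null)

# Send stderr to wherever stdout points, so both are captured.
both=$(noisy 2>&1)

# First point stderr at the captured stdout, then send stdout to /dev/null:
# only the error line is captured.
only_stderr=$(noisy 2>&1 >/dev/null)

printf '%s\n' "$only_stdout" "---" "$both" "---" "$only_stderr"
```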

Loops

The following while loop imitates the cat command, because it prints the contents of a given input file:

while read -r; do
  printf "%s\n" "$REPLY"
done <"$1"

until testcode; do
  # Code to execute as long as testcode has a code other than 0
done

for num in 1 2 3 4; do
  echo $num
done

#!/bin/bash
# Declare a static array
arr=(1 2 3 4 5)

# The loop iterates over the values until the list (arr) is exhausted
for i in "${arr[@]}"
do
    # access each element as $i
    echo $i
done

Using for loop with a wildcard * to iterate over all files of a certain file extension:

for f in *"txt"; do
  base=$(basename "$f" "txt")
  echo mv "$f" "${base}zip"
done

Assuming you have a.txt, b.txt and c.txt it will output:

mv a.txt a.zip
mv b.txt b.zip
mv c.txt c.zip

A C-style for loop is available as well:

for (( i=0; i<length; ++i )); do
  # ...
done

Of course you can use break and continue within loops.

Case

case $1 in
  cat)
    echo "meow";;
  dog)
    echo "woof";;
  bird|birds)
    echo "chirp";;
  *)
    echo "unknown animal"
esac

Command groups

{ cmd1; cmd2; cmd3; } groups commands together and regards them as one.

mkdir newdir && cd newdir will execute second statement (cd newdir) only if first statement was successful.

[[ $1 ]] || { echo "missing argument" >&2; exit 1; } will execute the group in braces only if the test before || failed.

Functions

The recommended and most compatible syntax to declare functions is name() { ... }, even though function name() { ... } and function name { ... } exist.

Functions are like scripts within your script, meaning that they can use redirection, obtain arguments and also – if named badly – can overwrite other existing commands, if you call your function ls for example.

Arguments that you pass to functions can be retrieved with $1, $2 and so on. The return value of a function is held in $?. Because it is an exit status, it is limited to the range 0 to 255.

sum() {
  return $(( $1 + $2 ))
}
sum 4 5
echo $?

Of course you can simply echo directly within the function, instead of returning a value.

sum() {
  echo $(( $1 + $2 ))
}

# invoke the function
sum 4 5

# or use it as command substitution
echo $(sum 4 5)

In this function we test if a function argument starts with an ‘a’:

starts_with_a() {
  [[ $1 == [aA]* ]];
  return $?
}

if starts_with_a ax; then
  echo "Yes"
else
  echo "No"
fi

If you don’t exit a function with return, then the return status of the last statement will be returned implicitly.

Bash variables are globally visible, even when they are first assigned inside a function, unless you declare them inside the function with the local keyword (or with declare).
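A short sketch of the scoping rule, with made-up variable and function names:

```shell
#!/bin/bash
counter="global"
leaked=""

# modify is a hypothetical function used only to show the scoping.
modify() {
  local counter="local"      # visible only inside modify and functions it calls
  leaked="set in function"   # no local: this changes the global variable
}

modify
echo "counter=$counter"
echo "leaked=$leaked"
```

After the call, counter still holds its global value, while leaked has been changed by the function.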

You can also export -f my_function to make the function available to sub processes.

A function that prints errors in case an argument is not a number might look like this:

#!/bin/bash

usage() {
cat <<END
Usage: my-script [-a] [-b]

 -a Use a to abort
 -b Use b to bark
END
}

error() {
  echo "Error: $1 "
  usage
  exit $2
} >&2

isnum() {
  [[ $1 =~ ^[0-9]+$ ]]
}

isnum "a" || error "Not a number" 1

Note that we redirect the output of the whole error function to stderr using >&2.

Function variables and piped commands

The following script does not work as expected. One could expect that the echo $count on the last line will be the same as the echo $count in the function, but it is not.

#!/bin/bash

declare -i count=0

count_lines () {
  while read -r; do
    ((++count))
  done
  # this line will count correctly
  echo $count
}

$* | count_lines
# this count will always be 0
echo $count

The reason is that when piping commands as we do with $* | count_lines, count_lines is executed in a separate subshell. The subshell gets its own copy of count, so the increments are visible inside the function but never reach the variable in the parent shell. A solution could be to store the output in a temporary file and read it in the main shell afterwards, or to avoid the pipe altogether.
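One way to avoid the pipe is process substitution, which keeps the function in the current shell. In this sketch printf stands in for whatever command produces the lines.

```shell
#!/bin/bash
declare -i count=0

count_lines() {
  while read -r; do
    ((++count))
  done
}

# < <(...) feeds the command's output to count_lines via stdin.
# count_lines itself runs in the current shell, so count is updated here;
# only the producer (printf, as a stand-in) runs in a separate process.
count_lines < <(printf '%s\n' one two three)

echo "$count"
```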

String manipulation

To get the length of a string variable use ${#my_var}.

Removing from beginning and end of a string

#!/bin/bash

my="/this/is/my/path/file.txt"

# remove a pattern from the beginning of the string (shortest possible match)
echo ${my#*/}
# Result: "this/is/my/path/file.txt"
# removes the first slash

# remove a pattern from the beginning of the string (longest possible match)
echo ${my##*/}
# Result: "file.txt"
# removes everything up to and including the last slash

# remove a pattern from the end of the string (shortest possible match)
echo ${my%.*}
# Result: "/this/is/my/path/file"
# removes the extension '.txt'

# remove a pattern from the end of the string (longest possible match)
echo ${my%%/is*}
# Result: "/this"
# removes everything from '/is' to the end
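These four operators are often combined to split a path into its parts. The following sketch uses an illustrative file name with a double extension to show the difference between the shortest and the longest match.

```shell
#!/bin/bash
path="/this/is/my/path/archive.tar.gz"

dir=${path%/*}      # shortest match of /* from the end: the directory
file=${path##*/}    # longest match of */ from the start: the file name
name=${file%%.*}    # longest match of .* from the end: name without any extension
ext=${file#*.}      # shortest match of *. from the start: the full extension

printf '%s\n' "$dir" "$file" "$name" "$ext"
```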

Search and replace

my_var="/this/is/my/path/and/my/file.txt"

# Replace first match
echo ${my_var/my/your}
# Output: /this/is/your/path/and/my/file.txt

# Replace all matches
echo ${my_var//my/your}
# Output: /this/is/your/path/and/your/file.txt

You can combine the syntax from begin (# or ##) or end (% or %%) of string with the search and replace. It looks like ${my_var/#pattern/string}.

Conditional expression patterns

The == and != operators in [[ ... ]] do pattern matching. == is the same as =. [[ $var == pattern ]] returns true when $var matches the pattern, for example [[ $filename == *.txt ]] returns true if the filename ends with .txt. If you want to match against a string instead of a pattern, then you need to use double quotes, e.g. [[ $var == "[0-9]*" ]] would match the string [0-9]*.

Regular expression matching

Use =~ for POSIX extended regular expressions.
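After a successful match, bash stores the matched groups in the BASH_REMATCH array. The sketch below extracts the parts of a made-up version string; the variable names are illustrative.

```shell
#!/bin/bash
version="release-2.14.7"

# Keep the pattern in a variable and leave it unquoted on the right of =~;
# quoting it would make bash match it as a literal string.
pattern='^release-([0-9]+)\.([0-9]+)\.([0-9]+)$'

if [[ $version =~ $pattern ]]; then
  # Index 0 holds the whole match, 1..n hold the parenthesized groups.
  major=${BASH_REMATCH[1]}
  minor=${BASH_REMATCH[2]}
  patch=${BASH_REMATCH[3]}
  echo "major=$major minor=$minor patch=$patch"
else
  echo "no match"
fi
```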

Different ways to run your shell script code

Usually you start the first line of your script with a hash-bang like #!/bin/bash, make it executable with chmod u+x myscript and run it with ./myscript, which starts it in a new process. Alternatively you can run it with . myscript (a dot followed by a space). This dot is an alias for the source command and it imports your code into the current shell process. This technique also allows you to move your commonly used (shared) functions into a separate file and then “import” that file in another script using source.

Another way is to run it like bash myscript, without the need to set permissions and with the advantage to pass additional options, for example to debug it with bash -x myscript.

You can run a command in the background by adding a & like myscript &. Your script will be disconnected from the interactive session and will suspend if it tries to read input from the terminal.

If you have a long-running script that you want to keep running even after you exit your terminal session, then you use nohup myscript &, which stands for no hangup. It makes sense to let your background script run with a lower priority by using nice myscript or combine everything to nohup nice myscript &.

Redirecting your script streams with exec

exec lets you redirect your standard or error output of your script to separate files.

#!/bin/bash

if [[ $1 == "-l" ]]; then
  exec >mylogfile.txt 2>errors.txt
fi

declare -i i=0
while true; do
  echo "still here $((++i))"
  sleep 1
done

You can run your script at a specific time using at -f myscript noon tomorrow. Cron can run your script according to a schedule (e.g. hourly, daily, weekly). On macOS you use launchd. On Ubuntu, older releases used Upstart; current releases use systemd.

Bash options

-x debugs your script by printing each command with its arguments to the console. -u gives an error when using an uninitialized variable and exits the script. -n read commands but does not execute them. -v will print each command as it is read. -e exits a script whenever a command fails (unless you check error codes yourself with if, while, until, ||, &&). Many of those options should not be used in a production environment.

shopt also sets bash behaviour with -s and unsets them with -u. For example shopt -s nocaseglob ignores case with pathname expansion. Or you could enable a more powerful extended pattern matching with shopt -s extglob. Run shopt -s dotglob if you want to include hidden files with pathname expansion.

Date and Time

# example: 2021-02-23 16:01:13
printf '%(%Y-%m-%d %H:%M:%S)T\n' -1 

About Author

Mathias Bothe

I am Mathias from Heidelberg, Germany. I am a passionate IT freelancer with 15+ years experience in programming, especially in developing web based applications for companies that range from small startups to the big players out there. I create Bosycom and initiated several software projects.