Course materials and documentation for DS2002
The goal of this activity is to familiarize you with scripting in bash. Bash scripting is essential for automating tasks, processing data, orchestrating workflows, and building reusable tools that can save time and reduce errors.
Note: Work through the examples below in your terminal (Codespace or local), experimenting with each command and its various options. If you encounter an error message, don’t be discouraged—errors are learning opportunities. Reach out to your peers or instructor for help when needed, and help each other when you can.
If the initial examples feel like a breeze, challenge yourself with activities in the Advanced Concepts section and explore the resource links at the end of this post.
All scripts should be written in a way that takes into account several factors:
| Error out gracefully | set -e / error codes | exit 1 |
A well-formatted bash script begins with a “shebang” line:
#!/bin/bash
that points to the full path of the bash shell. This may differ from one environment
to the next.
To make a script executable, run chmod 755 on it. This gives the owner read, write, and execute permission, and gives everyone else read and execute permission.
Any binary executables used in a shell script should be invoked using their full paths. This is to avoid any ambiguity and preempt any errors of a shell not being able to find the command.
For example, when invoking the aws command-line in a script you would normally call
/usr/local/bin/aws
To determine the full path of an executable in a given system, use the which command:
which aws
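As a rough sketch of doing this lookup inside a script rather than by hand, the snippet below resolves a full path once and stores it in a variable. It uses the shell built-in `command -v`, a close relative of `which`, and `ls` stands in for `aws` so the example runs anywhere. The helper name `resolve_bin` and the variable `LS_BIN` are assumptions for illustration.

```shell
#!/bin/bash

# resolve_bin is a hypothetical helper: print the full path of a command,
# or return non-zero if the command cannot be found on PATH.
resolve_bin() {
  local path
  path=$(command -v "$1") || return 1
  echo "$path"
}

# Resolve ls once, then invoke it by its full path from here on.
LS_BIN=$(resolve_bin ls)
echo "ls lives at: $LS_BIN"
"$LS_BIN" / > /dev/null
```

Resolving the path at the top of a script also gives you one obvious place to fail early if a required tool is missing.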
In most bash scripts you should put set -e near the top. This flag
tells bash to exit the script immediately when a command fails (returns a non-zero exit status).
This matters because continuing past an error can produce bad results or
unintended consequences.
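To see the effect of set -e without ending your own terminal session, one option is to run a tiny script in a child bash process and inspect what comes back. This is only a sketch; the variable names `output` and `status` are made up for the example.

```shell
#!/bin/bash

# Run a small script in a child bash so its exit does not end this shell.
# With set -e, the failing `false` stops the child before the second echo.
status=0
output=$(bash -c 'set -e; echo "step 1"; false; echo "step 2"') || status=$?

echo "output: $output"
echo "exit status: $status"
```

Only "step 1" is captured, and the exit status is non-zero, because the child script stopped at the first failing command.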
Another option is to use a conditional, so that when a specific line of the script
fails, the script exits with a non-zero code of your choosing. This can be useful
output for debugging.
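A minimal sketch of this pattern might look like the following. The function name `copy_or_fail` and the choice of code 2 are assumptions for illustration; the function uses `return` so it can be tried safely in a terminal, whereas a top-level script would typically use `exit` instead.

```shell
#!/bin/bash

# copy_or_fail is a hypothetical helper: try to copy a file and report
# a specific non-zero code if the copy fails.
copy_or_fail() {
  if ! cp "$1" "$2" 2>/dev/null; then
    echo "ERROR: could not copy $1 to $2" >&2
    return 2   # in a top-level script you would write `exit 2` here
  fi
  echo "Copied $1 to $2"
}
```

Calling `copy_or_fail /no/such/file /tmp/out || echo "failed with code $?"` prints the error message followed by the code, which tells you which step failed.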
If you need a deliberate pause in the middle of a script, use sleep: for example, sleep 5 gives a 5-second
pause. This may be especially useful in retry logic.
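One place a pause often shows up is in a loop that re-runs a command until it succeeds. The sketch below is one way to write that; the helper name `retry` and its argument order (attempts, delay, then the command) are assumptions, not a standard tool.

```shell
#!/bin/bash

# retry is a hypothetical helper: run a command up to $1 times,
# sleeping $2 seconds between failed attempts.
retry() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1
  until "$@"; do
    if (( attempt >= max_attempts )); then
      echo "Giving up after $attempt attempts" >&2
      return 1
    fi
    echo "Attempt $attempt failed; sleeping $delay seconds" >&2
    sleep "$delay"
    attempt=$(( attempt + 1 ))
  done
}
```

For example, `retry 3 5 ls /some/path` would try the command up to three times with a 5-second pause between failures.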
Remember that $0, $1, $2, etc. are reserved parameters bash understands as positional
arguments when invoking from the command-line:
$0 is the invoking script itself
$1 is the first parameter after the script name
$2 is the second parameter
…
positional-args.sh
#!/bin/bash
echo "$0 <-- invoking script"
echo "$1 <-- first parameter"
echo "$2 <-- second parameter"
returns the following output:
$ ./positional-args.sh bananas blueberries
./positional-args.sh <-- invoking script
bananas <-- first parameter
blueberries <-- second parameter
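Positional parameters work the same way inside a function, which makes it easy to experiment in a terminal. The sketch below also checks `$#`, the number of arguments received, before using them. The function name `greet` and its usage message are made up for illustration.

```shell
#!/bin/bash

# greet is a hypothetical function; inside a function, $1 and $2 refer to
# the function's own arguments, just as they do for a script.
greet() {
  if [[ $# -lt 2 ]]; then
    echo "Usage: greet <first> <second>" >&2
    return 1
  fi
  echo "first=$1 second=$2"
}

greet bananas blueberries
```

Checking the argument count up front means a missing parameter produces a clear usage message instead of a silent empty value.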
Start your if with a comparison, end with fi.
if [[ $VAR -gt 10 ]]
then
  echo "That number is greater than 10."
else
  echo "Your number is pretty small!"
  exit 0
fi
The env command lists all environment variables for your session. The environment may differ
for an unattended script (one that runs without you around).
Add environment variables in bash:
export VARIABLE=value-of-variable
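To illustrate what export changes, the sketch below sets one plain variable and one exported variable, then asks a child bash process what it can see. The variable names `DEMO_LOCAL` and `DEMO_EXPORTED` are hypothetical.

```shell
#!/bin/bash

# DEMO_LOCAL and DEMO_EXPORTED are hypothetical variable names.
DEMO_LOCAL="only in this shell"
export DEMO_EXPORTED="visible to child processes"

# A child bash process inherits only the exported variable;
# ${DEMO_LOCAL:-unset} prints "unset" when the variable is missing.
child_view=$(bash -c 'echo "local=[${DEMO_LOCAL:-unset}] exported=[${DEMO_EXPORTED:-unset}]"')
echo "$child_view"
```

The child reports the plain variable as unset, which is the same situation a program launched by your script would be in.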
Use full paths to your binaries to avoid your unattended script being unable to locate a binary. Just because you can run it by hand does not mean it can run without you around.
A simple-yet-valuable step in your scripting is to log. You can log every action taken by the script, or limit logging to successes or failures.
A common format for logging might be a snippet like this:
# First establish the datetime:
NOW=$(date +"%m-%d-%Y-%H:%M:%S")
# Then append one row to the log, quoting the variables so spacing is preserved:
echo "$NOW OK - Successfully processed $FILENAME" >> /var/log/output.log
The result is a single log file that grows by one row each time an event is logged.
Note the >> to append to a file instead of overwriting it!
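If a script logs in many places, the snippet can be wrapped in a small function. The sketch below is one possible shape: the helper name `log_msg`, the `LOG_FILE` variable, and the example file names are assumptions. It defaults to a file in the temp directory because writing to /var/log usually requires elevated permissions.

```shell
#!/bin/bash

# LOG_FILE is an assumed location; override it before calling log_msg
# if you want the rows written somewhere else.
LOG_FILE="${LOG_FILE:-${TMPDIR:-/tmp}/output.log}"

# log_msg is a hypothetical helper: append one timestamped row per call.
log_msg() {
  local level=$1
  shift
  local now
  now=$(date +"%m-%d-%Y-%H:%M:%S")
  echo "$now $level - $*" >> "$LOG_FILE"
}

log_msg OK "Successfully processed data.csv"
log_msg ERROR "Could not process missing.csv"
```

Keeping the timestamp format and the append in one place means every row in the log has the same layout, which makes the file easier to search later.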
One of the most useful habits you can develop as a programmer is adding comments to your code. Comments explain each chunk of code and can also justify why a particular choice was made. They will be invaluable when you come back to the code two years later, or when you share it with others.
bash that does two things:
bash script to retrieve the log file found in retrieve-file.sh.
python3 script to parse that file and write the output
python script that does both tasks. See the Python scripting section for Python-specific guidance.