Course materials and documentation for DS2002
The goal of this activity is to familiarize you with containerization using Docker and related technologies. Containers are essential for creating reproducible environments, packaging applications with their dependencies, and deploying software consistently across different systems.
Note: Work through the examples below in your terminal (Codespace or local), experimenting with each command and its various options. If you encounter an error message, don’t be discouraged—errors are learning opportunities. Reach out to your peers or instructor for help when needed, and help each other when you can.
Option 1: If you want to use Docker containers on your own computer, follow the setup guide in ../../setup/docker.md.
Option 2: Alternatively, spin up an Ubuntu Linux EC2 instance in AWS and install Docker with the convenience script:
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh ./get-docker.sh
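Before moving on, it can help to confirm that the installation worked. The sketch below is one way to do that; check_docker is an illustrative function name, not part of Docker.

```shell
# Hypothetical helper: report the Docker version, or explain that the CLI is missing.
check_docker() {
    if command -v docker >/dev/null 2>&1; then
        docker --version
    else
        echo "docker not found; re-run the install script" >&2
        return 1
    fi
}
```

A possible follow-up is `check_docker && sudo docker run --rm hello-world`, which runs Docker's small official test image. On a fresh Linux install your user is usually not yet in the docker group, which is why sudo is assumed here; `sudo usermod -aG docker $USER` followed by a new login removes that need.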
To pull a container image, find its name on Docker Hub or another registry. The pull command will look something like this:
docker pull godlovedc/lolcow
The pull command downloads the image from Docker Hub to your machine.
To run the default command of the image, execute:
docker run godlovedc/lolcow
Output:
 _____________________________________
/ Everything will be just tickety-boo \
\ today.                              /
 -------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
The LolCow container told you a joke (ran a process) and then exited immediately to return to your shell.
Let’s find a new image to explore how we can use a container interactively. Go to Docker Hub and search for an Ubuntu image. Take note of the image name (ubuntu) and choose a tag. The tag is the portion after the :.
To work with a container interactively, add the -it flags to the docker run command (-i keeps standard input open, -t allocates a terminal). Be sure to add a shell or some other executable program after the image name and replace <tag> with the actual tag you found:
docker run -it ubuntu:<tag> /bin/bash
Note how the prompt has changed to something like this:
root@4489de2c677f
You’re in a bash shell inside the container!
Now, run
cat /etc/os-release
To exit out of the interactive container shell, enter
exit
To view all images you have built or pulled to your computer, run:
docker images
The output may look like this (column names vary slightly by Docker/runtime version):
IMAGE                     ID             DISK USAGE   CONTENT SIZE   EXTRA
godlovedc/lolcow:latest   a692b57abc43   370MB        104MB          U
jekyll/jekyll:latest      400b8d1569f1   1.23GB       322MB
mysql:8.0                 99d774bf02a4   1.08GB       243MB          U
How to read this table:
IMAGE: repository and tag (for example, mysql:8.0) used to pull and run that image.
ID: unique image identifier; you can use this instead of the image name in commands like docker rmi.
DISK USAGE: total local storage used by the image, including shared layers.
CONTENT SIZE: size of this image’s own filesystem layers/content.
EXTRA: optional runtime-specific metadata. In some runtimes, U indicates an unpacked image. If this column is blank, that is normal.
To see all containers running locally:
docker ps
Only running containers are listed. The Ubuntu container from the previous step has already exited, so unless something else is running you will see just the header row:
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
To see all container instances, including those that have stopped, run this:
docker ps -a
CONTAINER ID   IMAGE                     COMMAND                  CREATED         STATUS                     PORTS   NAMES
ed9a3ade7cec   godlovedc/lolcow:latest   "/bin/sh -c 'fortune…"   3 seconds ago   Exited (0) 2 seconds ago           hardcore_napier
a57e5166fda7   ubuntu:latest             "/bin/bash"              5 minutes ago   Exited (0) 5 minutes ago           heuristic_pare
You can now refer to a specific container by using either the full name heuristic_pare or the first few characters of the container ID, such as a57e.
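When the list of containers grows, filtering and trimming the columns makes it easier to find the one you want. The sketch below combines the --filter and --format options of docker ps; containers_by_status is an illustrative name, not a Docker command.

```shell
# Hypothetical helper: list containers in a given state (for example exited or running),
# showing only the ID, name, and status columns.
containers_by_status() {
    docker ps -a --filter "status=$1" \
        --format 'table {{.ID}}\t{{.Names}}\t{{.Status}}'
}
```

For example, `containers_by_status exited` should list the two stopped containers shown above.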
To inspect the metadata of a container, such as its IP address or volume mounts, use the inspect command. It returns a JSON payload and works on stopped containers as well as running ones:
docker inspect a57e
Try to find the Cmd[] section. It describes the command that’s executed by default.
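Rather than scrolling through the whole JSON payload, you can ask inspect for a single field with a Go template. The following is a small sketch; default_cmd is an illustrative name.

```shell
# Hypothetical helper: print the default command (Config.Cmd) of a container or image as JSON.
default_cmd() {
    docker inspect --format '{{json .Config.Cmd}}' "$1"
}
```

Calling `default_cmd a57e` prints only the Cmd entry. Swapping .Config.Cmd for another path from the JSON output, such as .Config.Entrypoint, selects a different field.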
Each container image has its own filesystem. Let’s check this out by comparing host and container output:
pwd
docker run --rm ubuntu:latest pwd
The first command runs on the host in your active shell. If you’re in this repo’s practice directory it will show something like:
/home/mst3k/ds2002-course/practice/11-containers/
The second command runs pwd inside a temporary Ubuntu container and will show:
/
Similarly, compare the output of ls and docker run --rm ubuntu:latest ls.
To mount a directory from your local workstation into a container at launch, use the -v flag with a mapping of HOST_DIRECTORY:CONTAINER_DIRECTORY. An absolute host path such as "$PWD" (your current directory) works across Docker versions:
docker run -it -v "$PWD":/my_folder ubuntu:latest /bin/bash
Run ls.
bin boot dev etc home lib media mnt my_folder opt proc root run sbin srv sys tmp usr var
Note the my_folder directory inside the container. Run ls my_folder and you should see the contents of your current host directory, now mounted inside /my_folder.
Now run:
echo "hello from the container" > my_folder/hello.txt
Then exit.
Through this mechanism you can dynamically bring folders and files into the container. Because my_folder is your host directory, any files you add to it, such as hello.txt, are still there on the host after you exit the container. Pretty cool!
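If a container should only read your files, the mount can be made read-only by appending :ro to the mapping. The sketch below assumes the ubuntu:latest image used earlier; ls_readonly is an illustrative name.

```shell
# Hypothetical helper: list a host directory from inside a container through a read-only mount.
# $1 is the host directory (absolute path); it defaults to the current directory.
ls_readonly() {
    docker run --rm -v "${1:-$PWD}":/my_folder:ro ubuntu:latest ls /my_folder
}
```

Run without arguments, `ls_readonly` lists your current directory as the container sees it. Any attempt to write under /my_folder in such a container fails, which protects host data from accidental changes.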
To stop a container:
docker stop heuristic_pare
or
docker stop a57e
Note: An image can only be removed when no container, running or stopped, still refers to it. Remove such containers first with docker rm.
To delete an image, use the rmi (remove image) command with either the image name:tag or ID.
docker rmi image_name
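To delete stopped containers, unused networks, dangling images, and the build cache in one step (add -a to also delete every image that no container uses):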
docker system prune
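For a more targeted cleanup, you can remove a single stopped container and then its image. The following sketch chains the two standard commands; remove_container_and_image is an illustrative name.

```shell
# Hypothetical helper: remove one stopped container, then the image it was created from.
# $1 = container name or ID, $2 = image name:tag or ID.
remove_container_and_image() {
    docker rm "$1" && docker rmi "$2"
}
```

For example, `remove_container_and_image heuristic_pare ubuntu:latest`. The image removal still fails if another container uses the same image. `docker container prune` removes all stopped containers at once.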
This directory contains a few container examples. We focus on the mechanism of the build process rather than the specific implementation details underlying each project.
Let’s try the Fortune Teller. The Dockerfile is located in fortune/Dockerfile.
FROM ubuntu:18.04
RUN apt-get update && apt-get install -y --no-install-recommends \
fortune fortunes-min && \
rm -rf /var/lib/apt/lists/*
ENV PATH=/usr/games:${PATH}
ENTRYPOINT ["fortune"]
How this Dockerfile works:
FROM ubuntu:18.04 sets the base operating system image.
RUN ... uses the Ubuntu package manager apt-get to install fortune and fortunes-min, then removes cache files of the apt package manager to keep the image smaller.
ENV PATH=/usr/games:${PATH} adds the install location of fortune to the executable path.
ENTRYPOINT ["fortune"] sets the default startup command, so docker run fortune:latest executes the fortune command inside the container and prints a fortune immediately.
Let’s build it:
cd fortune
docker build -t fortune:latest .
Execute docker images to confirm the new fortune:latest image is ready.
And then run it:
docker run --rm fortune:latest
Run it a few more times for additional fortune telling.
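Because the image uses ENTRYPOINT, anything you type after the image name is passed to fortune as arguments, and the entrypoint itself can be swapped for a single run. The sketch below illustrates both; the function names are illustrative, and -s is the fortune option for short fortunes.

```shell
# Extra arguments after the image name are appended to the ENTRYPOINT,
# so this runs "fortune -s" (short fortunes only) inside the container.
short_fortune() {
    docker run --rm fortune:latest -s
}

# --entrypoint replaces the ENTRYPOINT for one run, here to open a shell for debugging.
fortune_shell() {
    docker run --rm -it --entrypoint /bin/bash fortune:latest
}
```

Opening a shell this way is a convenient method for checking what actually got installed in the image, for example with `ls /usr/games`.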
On many clusters (including UVA’s), you cannot run the Docker daemon as an ordinary user: shared systems avoid giving everyone root-equivalent features that Docker traditionally needed. Apptainer (formerly Singularity) is a common alternative: you run container images as yourself, and you typically execute immutable .sif image files instead of talking to a long-lived daemon.
Go to your home directory. On the HPC system you also need to load the apptainer software module.
cd ~
module load apptainer
The general form is apptainer pull <output.sif> <transport>://<image reference>. For images on Docker Hub, the transport is docker (for example docker://ubuntu:latest or docker://godlovedc/lolcow:latest).
apptainer pull lolcow-latest.sif docker://godlovedc/lolcow:latest
apptainer pull … docker://… downloads from a registry (often Docker Hub) and builds a local .sif file. This .sif file is self-contained and you can move it to other locations.
Pull a few more images (still in the directory where you want the .sif files):
apptainer pull ubuntu-latest.sif docker://ubuntu:latest
apptainer pull mysql-8.0.sif docker://mysql:8.0
apptainer run lolcow-latest.sif
apptainer run executes the container’s default entrypoint. In this case it will run the script that tells you a joke.
Alternatively you can use the apptainer exec command:
apptainer exec ubuntu-latest.sif cat /etc/os-release
When you use apptainer exec you need to specify the command to execute inside the container after the image filename, in this case cat /etc/os-release.
To open an interactive shell inside the container, use apptainer shell:
apptainer shell ubuntu-latest.sif
Similar to Docker volume mounts, Apptainer can bind host paths into the container for shell, exec, and run. Use --bind (short form -B) with host_path:container_path.
apptainer shell --bind .:/my_folder ubuntu-latest.sif
You can repeat --bind (or -B) for multiple mappings. See Apptainer bind paths in the official docs.
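As a sketch of repeated bind options, the helper below mounts two host directories, the second one read-only through the :ro option. The name ls_two_binds and the container path /reference are illustrative; ubuntu-latest.sif is the image pulled earlier.

```shell
# Hypothetical helper: list two host directories from inside the container.
# $1 is mounted at /my_folder (read-write), $2 at /reference (read-only via :ro).
ls_two_binds() {
    apptainer exec \
        --bind "$1":/my_folder \
        --bind "$2":/reference:ro \
        ubuntu-latest.sif ls /my_folder /reference
}
```

For example, `ls_two_binds "$PWD" /tmp` lists both directories as seen from inside the container.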
To run a container in detached mode, add the -d flag to the docker run command:
docker run -d --name mysql-dbhost -e MYSQL_ROOT_PASSWORD=my-secret-pw mysql:8.0
Detached mode means the container runs in the background and your terminal prompt returns immediately. Use this when you want a long-running service (such as MySQL) to stay up while you continue using the same terminal for other commands. For example, start MySQL in detached mode, then run docker ps to confirm status before connecting with a client.
To inject ENV variables into a container, add the -e flag with a Key-Value mapping when you run the container:
docker run -it -e MYKEY=myvalue ubuntu:latest /bin/bash
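A quick, non-interactive way to confirm that a variable reached the container is to print it from inside. This is a minimal sketch; print_env_in_container is an illustrative name.

```shell
# Hypothetical helper: start a throwaway container with one variable set
# and print that variable's value from inside it.
# $1 = variable name, $2 = value.
print_env_in_container() {
    docker run --rm -e "$1=$2" ubuntu:latest printenv "$1"
}
```

`print_env_in_container MYKEY myvalue` should print myvalue. For many variables, docker run also accepts --env-file with a file of KEY=value lines.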
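To map a container port to a port on your workstation, use the -p flag with a mapping of HOST_PORT:CONTAINER_PORT. This allows you to view/test a service listening on that port. Container names must be unique, so remove any existing mysql-dbhost container first (docker rm -f mysql-dbhost), then run: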
docker run -d --name mysql-dbhost -e MYSQL_ROOT_PASSWORD=my-secret-pw -p 6033:3306 mysql:8.0
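Once the container is up, you may want to confirm the mapping and try a connection. The sketch below uses docker port and the mysql client that ships inside the mysql image; the function names are illustrative.

```shell
# Hypothetical helper: show which host ports are mapped to a container's ports.
show_ports() {
    docker port "$1"
}

# Hypothetical helper: open a MySQL client inside the container.
# It prompts for the root password set through MYSQL_ROOT_PASSWORD.
mysql_shell() {
    docker exec -it "$1" mysql -uroot -p
}
```

`show_ports mysql-dbhost` should report container port 3306 mapped to host port 6033. MySQL can take a little while to initialize on first start, so a connection attempt made immediately may be refused.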
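To view the output logs from a running container, pass its name or the first few characters of its ID (2ad2 here stands in for the ID of your own container):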
docker logs 2ad2
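For a long-running service it is often more useful to watch only the most recent lines and keep following new output. A minimal sketch, with follow_logs as an illustrative name:

```shell
# Hypothetical helper: show the last N log lines of a container and keep following new output.
# $1 = container name or ID, $2 = number of lines (defaults to 20). Press Ctrl+C to stop.
follow_logs() {
    docker logs --follow --tail "${2:-20}" "$1"
}
```

For example, `follow_logs mysql-dbhost 50` starts from the last 50 lines.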
Finally, to “hop” into a container that is running in detached mode, use the exec -it command against the ID or name of the running container. Be sure to add a shell or other executable after the name of the container.
docker exec -it 2ad2 /bin/bash
whalesay
This is a famous demo container created by Docker to demonstrate an interactive container image that takes input from a user. To build it, cd into its directory:
docker build -t whalesay .
To run it, simply append a command or quote or joke at the end of the run command:
docker run whalesay Hello everyone!
convert
This is a simple Python ETL pipeline. You can build it locally by changing into its directory and running:
docker build -t converter .
To try running it on your own, just map a directory to the /data path of the container and pass the fictional ID 0987654321 as a parameter:
docker run -v ${PWD}:/data converter -i 0987654321
Multi-stage builds allow you to use multiple FROM statements in a Dockerfile, which helps create smaller final images by separating build dependencies from runtime dependencies:
# Build stage
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user -r requirements.txt
# Runtime stage
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY . .
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "app.py"]
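To see the effect of the two stages, you can build the final image and, separately, only the builder stage. The sketch below assumes the Dockerfile above sits in the current directory; myapp is a placeholder image name and the function names are illustrative.

```shell
# Hypothetical helpers; "myapp" is a placeholder image name, not one from this course.
# Build the whole Dockerfile: only the final (runtime) stage ends up in the tagged image.
build_runtime_image() {
    docker build -t myapp:latest .
}

# Build only up to the stage named "builder", which is handy for debugging dependency installs.
build_builder_stage() {
    docker build --target builder -t myapp:builder .
}
```

Comparing the two tags in `docker images` afterwards shows how much the runtime stage saves, if anything, for your project.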
Docker Compose allows you to define and run multi-container Docker applications using a YAML file. This is useful for orchestrating services that need to work together:
version: '3.8'
services:
  web:
    build: .
    ports:
      - "5000:5000"
  redis:
    image: redis:alpine
    ports:
      - "6379:6379"
Note the build: . statement for the web service. The . refers to the current directory and it is assumed that it contains a Dockerfile with the image build instructions. In contrast, the redis service will utilize an existing image redis:alpine from a public repository.
Run with: docker compose up
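A typical Compose session also checks status, reads logs, and tears everything down again. The sketch below wraps those standard subcommands in small functions; the function names are illustrative, and web is the service name from the YAML above.

```shell
# Start all services in the background (detached).
compose_start() {
    docker compose up -d
}

# Show the state of each service, then the logs of the web service.
compose_check() {
    docker compose ps && docker compose logs web
}

# Stop the services and remove their containers and the default network.
compose_stop() {
    docker compose down
}
```

Run these from the directory that contains the Compose file, since docker compose looks for it there by default.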
Prefer specific image tags over latest.
Use a .dockerignore file to exclude unnecessary files from the build context.
For production environments, consider container orchestration platforms.
If your host has NVIDIA GPUs and drivers available, Apptainer can expose them inside the container with the --nv flag.
apptainer exec --nv pytorch-latest.sif python -c "import torch; print(torch.cuda.is_available())"
You can also test GPU visibility with:
apptainer exec --nv pytorch-latest.sif nvidia-smi
If GPUs are configured correctly, these commands should report at least one CUDA device.
Apptainer images include a default runscript. If you mark the .sif file as executable, you can launch it directly instead of typing apptainer run each time:
chmod +x lolcow-latest.sif
./lolcow-latest.sif
This is functionally similar to:
apptainer run lolcow-latest.sif
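To see what the runscript will actually do before you execute an image, Apptainer can print it. A minimal sketch, with show_runscript as an illustrative name:

```shell
# Hypothetical helper: print the runscript stored in a .sif file.
show_runscript() {
    apptainer inspect --runscript "$1"
}
```

For example, `show_runscript lolcow-latest.sif`. For images pulled from Docker Hub, the runscript is generated from the Docker image's entrypoint and command settings.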