Linux containers are a way to build a self-contained environment that includes
software, libraries, and other tools. This
guide shows how to submit jobs that use Docker containers.
Typically, software in CHTC jobs is installed or compiled locally by
individual users and then brought along to each job, either using the
default file transfer or our SQUID web server. However, another option
is to use a container system, where the software is installed in a container
image. Using a container to handle software can be advantageous if the
software installation 1) has many dependencies, 2) requires installation to a specific
location, or 3) "hard-codes" paths into the installation.
CHTC (and the OSG) have capabilities to access and start
containers and run jobs inside them. This guide shows how to do this
for Docker containers.
In order to run your job inside a Docker container, you will need to:
- Find or prepare a Docker container image for your jobs to use
- Test the container locally
- Make a few changes to your submit file
1. Getting a Docker Container Image
To run a Docker job, you will first need access to a Docker container image that
has been built and placed onto the DockerHub website.
There are two primary ways to do this.
A. Pre-existing Images
The easiest way to get a Docker container image for running a job
is to use a public or pre-existing
image on DockerHub. You can find images by getting an account
on DockerHub and searching for the software you want to use.
An image supported by a group is typically updated regularly, with versions
indicated by "tags". We recommend choosing a specific tag (or
tags) of the container to use in CHTC, so that your jobs keep using the
same version of the software.
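One way to make the tag choice explicit is to check an image name before
using it. The sketch below is only an illustration: the function name
has_pinned_tag and the example image names are hypothetical, and it assumes
that a missing tag or the tag "latest" should count as unpinned.

```shell
# Hypothetical helper: succeed only if an image name ends in an explicit
# tag other than "latest" (for example username/imagename:1.2).
has_pinned_tag() {
    image_last_part="${1##*/}"     # part after the final slash
    case "$image_last_part" in
        *:latest) return 1 ;;      # "latest" can change over time
        *:?*)     return 0 ;;      # an explicit tag is present
        *)        return 1 ;;      # no tag given
    esac
}

for image in username/imagename:1.2 username/imagename username/imagename:latest; do
    if has_pinned_tag "$image"; then
        echo "$image: pinned"
    else
        echo "$image: not pinned"
    fi
done
```

Only the first of the three example names is reported as pinned.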
B. Build Your Own Image
You can also build your own Docker container image and upload it to DockerHub.
See the Docker
documentation for more information.
2. Testing the Container
The next step is to test the container
on your own computer before submitting a job to
CHTC. Note that all the steps below should be run on your
own computer, not in CHTC.
If you created your own container image on your computer, you can skip
steps A and B and start with C.
A. Install Docker on your computer
Download, install, and start the
Docker Community Edition for your operating system. It sometimes takes some
time for Docker to start, especially the first time.
B. "Pull" the container image that you're using
We need to have a local copy of the Docker container image in order to
test it. To do this, choose which image you want to use and the tag for
the version you want. The full container image name will have the form
username/imagename:tag. Pull a copy of this Docker container image to your
computer by running the following from either a Terminal (Mac/Linux) or
Command Prompt (Windows):
$ docker pull username/imagename:tag
C. Choose your executable
There are two ways to run software inside a Docker container:
- Use a script that you transfer into the container, using software
installed in the container.
- Use a script or executable program already inside the container.
Instructions for each of these use cases are below.
1. Using your own script (recommended)
Write a script that runs the steps of your job.
Unlike in many of our guides, this
script doesn't need to be written in a language
like bash; instead, it can use something like Python or R directly
from inside the container.
Note that any script run this way needs a header at the top
indicating what kind of script it is. Some common headers include:
#!/bin/bash
#!/usr/bin/env python3
#!/usr/bin/env Rscript
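As a minimal sketch of a script with such a header, the example below writes
a small bash script, marks it executable, and runs it. The file name exec.sh
matches the placeholder used later in this guide; the scratch folder and the
script body are purely illustrative assumptions.

```shell
# Create a scratch folder and write a tiny job script into it.
work_dir=$(mktemp -d)
cat > "$work_dir/exec.sh" <<'EOF'
#!/bin/bash
# The first line is the header: it says to run this file with bash.
echo "Hello from the job script"
EOF

# Make the script executable, then run it directly by its path.
chmod +x "$work_dir/exec.sh"
"$work_dir/exec.sh"
```

A Python or R script follows the same pattern, with a different header on
its first line.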
Do I need an executable script? If your job only needs to run
one command, you don't need a script to serve as the job's executable. See
option 2 below.
2. Using an executable already inside the container
If the executable is already in the container, you simply
need to know what command you need to run to use it.
D. Create a folder with job files
For testing, we need a folder on your computer to stand in for the directory that
HTCondor creates for running your job. Create a folder for this purpose
on your Desktop. The folder's name shouldn't include any spaces. Inside
this folder, put all of the files that are normally inside the working
directory for a single job -- data, scripts, etc. If you're using your own
executable script, this should be in the folder.
Open a Windows Command Prompt or Mac/Linux Terminal and move into that folder.
Mac/Linux users:
$ cd ~/Desktop/folder
Windows users:
$ cd %USERPROFILE%\Desktop\folder
Replace "folder" with the name of the folder you created.
E. Start the Docker container
We will start the desired Docker container
to see if it works. First make sure Docker is running. Then
run the command below to start the container. The command
can be run verbatim except for the username, image name, and
tag; these should be whatever you used
to pull or tag the container image.
Mac/Linux users:
$ docker run --user $(id -u):$(id -g) --rm=true -it \
  -v $(pwd):/scratch -w /scratch \
  username/imagename:tag /bin/bash
Windows users:
$ docker run --rm=true -it -v %CD%:/scratch -w /scratch username/imagename:tag /bin/bash
For Windows users, a window may pop up, asking for permission to share your
main drive with Docker. This is necessary for the files to be placed inside
the container.
F. Test the job
Your command line prompt should have changed to include a number (this
identifies the running container instance). We can now see if the job would
complete successfully! If you have an executable script, you can run it like so:
bob@12335:/scratch$ ./exec.sh
If your "executable" is software already in the container, run the appropriate
command to use it.
The following commands may not be necessary, but if you see messages
about "Permission denied" or a bash error about bad formatting, you may
want to try one (or both) of the following:
You may need to add executable
permissions to the script for it to run correctly:
bob@12335:/scratch$ chmod +x exec.sh
Windows users who are using a bash script may
also need to run the following two commands:
bob@12335:/scratch$ cat exec.sh | tr -d \\r > temp.sh
bob@12335:/scratch$ mv temp.sh exec.sh
Replace exec.sh with the name of your own executable.
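Both fixes can be combined into a small helper. This is a sketch only: the
function name prepare_script and the demonstration folder are hypothetical,
and it assumes the standard tr, mv, and chmod utilities are available. It
strips carriage returns first and sets permissions last, because the
rewritten file would not otherwise keep its executable permission.

```shell
# Hypothetical helper: strip Windows carriage returns from a script,
# then mark it executable.
prepare_script() {
    script="$1"
    tr -d '\r' < "$script" > "$script.tmp" || return 1
    mv "$script.tmp" "$script"
    chmod +x "$script"
}

# Demonstration with a script that has Windows (CRLF) line endings.
demo_dir=$(mktemp -d)
printf '#!/bin/bash\r\necho "it works"\r\n' > "$demo_dir/exec.sh"
prepare_script "$demo_dir/exec.sh"
"$demo_dir/exec.sh"
```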
When your test is done, type "exit" to leave the container:
bob@12335:/scratch$ exit
If the program didn't work, try searching for the cause of the error
messages, or email CHTC's Research Computing Facilitators.
If your local test did run successfully, you are now ready to set up
your Docker job to run on CHTC.
3. Submit File Customization
Jobs that run inside a Docker container will be almost exactly the
same as "vanilla" HTCondor jobs. There are three needed customizations
to the submit file: one to indicate which Docker container to use, one
to request the right operating system, and the usual list of your
particular executable and input files.
A. Using a Docker Image
Start with a usual CHTC submit file like the one shown in our
Hello World guide. Then, make the following changes:
- Change the universe from "vanilla" to "docker":
universe = docker
- Add a line to indicate which Docker image you want to use for running your job:
docker_image = user_name/image_name:tag
When your job starts, HTCondor will pull the indicated image from DockerHub,
and use it to run your job.
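Put together, a submit file with these changes might look like the sketch
below, written here as a shell snippet that creates the file. The file names
(docker.sub, exec.sh, input.txt, job.log, job.out, job.err), the resource
requests, and the image name are all placeholders to adapt, not CHTC
defaults.

```shell
# Write an example submit file; every value below is a placeholder.
cat > docker.sub <<'EOF'
# Run the job inside a Docker container
universe = docker
docker_image = user_name/image_name:tag

executable = exec.sh
transfer_input_files = input.txt

log = job.log
output = job.out
error = job.err

request_cpus = 1
request_memory = 1GB
request_disk = 1GB

queue
EOF

# Show the two Docker-specific lines.
grep -E '^(universe|docker_image)' docker.sub
```

The resulting file would then be submitted in the usual way, with
condor_submit docker.sub.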
B. Executable and Input Files
Your wrapper script from the test on your computer should be listed
as the job's
executable. The other needed files from your
test directory should be listed in