Linux containers are a way to build a self-contained environment that includes
software, libraries, and other tools. CHTC
currently supports running jobs inside Docker containers.
In order to run your job inside a Docker container, you will need to:
- Find or prepare a Docker container for your jobs to use
- Make a few changes to your submit file
1. Getting a Docker Image
To run a Docker job, you will first need access to a Docker image that
has been built and placed onto the DockerHub website.
There are two primary options for doing this.
A. Pre-existing Images
The easiest way to get a Docker image for running a job
is to use a public or pre-existing
image on DockerHub. You can find images by getting an account
on DockerHub and searching for the software you want to use.
B. Build Your Own Image
You can also build your own Docker image and upload it to DockerHub.
See the Docker
documentation for more information.
2. Submit File Customization
Jobs that run inside a Docker container are almost exactly the
same as "vanilla" HTCondor jobs. The submit file needs three additions:
one to indicate which Docker container to use, one
to request the right operating system, and
another to make sure that your job will use the right software.
A. Using a Docker Image
Start with a usual CHTC submit file like the one shown in our
Hello World guide. Then, make the following changes:
- Change the universe from "vanilla" to "docker":
universe = docker
- Add a line to indicate which Docker image you want to use for running your job:
docker_image = user_name/image_name
Here, user_name is the username on DockerHub and
image_name is the name of the image under that user's name.
When your job starts, HTCondor will pull the indicated image from DockerHub,
and use it to run your job.
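Put together, the two changes above might look like the following sketch of a complete submit file. The image name is the placeholder used above. The executable name, the log, output, and error file names, and the resource requests are illustrative assumptions, not CHTC recommendations, so replace them with values from your own job.

```
# Sketch of a Docker universe submit file; values below are placeholders.
universe = docker
docker_image = user_name/image_name

# Hypothetical script and file names, chosen for illustration only.
executable = my_job.sh
log = job.log
output = job.out
error = job.err

should_transfer_files = YES
when_to_transfer_output = ON_EXIT

# Illustrative resource requests; adjust to what your job needs.
request_cpus = 1
request_memory = 1GB
request_disk = 1GB

queue 1
```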
B. Using the Right Operating System
Docker jobs will run most successfully on our new operating system,
CentOS 7. To request computers that are running CentOS 7, add this line
to your submit file:
requirements = (OpSysMajorVer == 7)
For information on how to combine this requirement with other requirements
your job may have, see our CentOS 7 guide.
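As one illustration of combining requirements, HTCondor joins expressions with "&&". The second clause below, which asks for a 64-bit x86 computer, is only an assumed example of an additional requirement. The CentOS 7 guide remains the reference for what CHTC recommends.

```
# Both conditions must be true for a computer to match this job.
requirements = (OpSysMajorVer == 7) && (Arch == "X86_64")
```

Keeping all conditions on one requirements line avoids a later requirements line replacing an earlier one.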
C. Using Your Desired Software
There are two ways to run software inside a Docker job:
- Use a script or executable program already inside the container.
- Use a script that you transfer into the container, using software
installed in the container.
Instructions for each of these use cases are below.
1. Using an executable already inside the container
List the full path to this executable in your submit file.
executable = /usr/bin/Rscript
If, as in the example above, the executable is a program like R or
Python, you will want to list the name of your script as an argument,
and make sure the script is also listed in the "transfer_input_files"
line of the submit file.
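For instance, running an R script with the Rscript executable above might look like this fragment. The script name my_script.R is hypothetical, and the other lines of the submit file stay as they are.

```
# Rscript already exists inside the container, so give its full path.
executable = /usr/bin/Rscript

# Hypothetical script name: passed as an argument and transferred with the job.
arguments = my_script.R
transfer_input_files = my_script.R
```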
2. Using your own script
Give the name of your script as the "executable", just as you
would for a normal job submission. Unlike in many of our guides, this
script doesn't need to be a "wrapper" script written in a language
like bash; instead, it can use something like Python or R directly
from inside the container.
executable = run_tensorflow.py
Note that any script run this way
needs a header at the top indicating what kind of script it is. Some
common headers include: