Docker Jobs

Overview

Linux containers are a way to build a self-contained environment that includes software, libraries, and other tools. CHTC currently supports running jobs inside Docker containers.

In order to run your job inside a Docker container, you will need to:

  1. Find or prepare a Docker container for your jobs to use
  2. Make a few changes to your submit file

1. Getting a Docker Image

To run a Docker job, you will first need access to a Docker image that has been built and placed onto the DockerHub website. There are two primary options for doing this.

A. Pre-existing Images

The easiest way to get a Docker image for running a job is to use a public or pre-existing image on DockerHub. You can find images by getting an account on DockerHub and searching for the software you want to use.

Sample images:

B. Build Your Own Image

You can also build your own Docker image and upload it to DockerHub. See the Docker documentation for more information.

2. Submit File Customization

Jobs that run inside a Docker container will be almost exactly the same as "vanilla" HTCondor jobs. There are three needed additions to the submit file: one to indicate which Docker container to use, one to request the right operating system, and another to make sure that your job will use the right software.

A. Using a Docker Image

Start with a usual CHTC submit file like the one shown in our Hello World guide. Then, make the following two changes:
  1. Change the universe from "vanilla" to "docker":
    universe = docker
  2. Add a line to indicate which Docker image you want to use for running your job:
    docker_image = user_name/image_name
    where user_name is the username on DockerHub and image_name is the name of the image under that user's name.

When your job starts, HTCondor will pull the indicated image from DockerHub, and use it to run your job.
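
Put together, a Docker submit file might look like the sketch below. Only the universe and docker_image lines come from the steps above; the file names (hello.sh, hello.log, and so on) and the resource requests are placeholder assumptions, so substitute the values from your own submit file.

```
# Sketch of a Docker submit file. File names and resource
# requests are illustrative placeholders, not CHTC defaults.
universe = docker
docker_image = user_name/image_name

executable = hello.sh
log = hello.log
output = hello.out
error = hello.err

should_transfer_files = YES
when_to_transfer_output = ON_EXIT

request_cpus = 1
request_memory = 1GB
request_disk = 1GB

queue
```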

B. Using the Right Operating System

Docker will run most successfully on our new operating system, CentOS7. To request computers that are running CentOS7, add this line to your submit file:

requirements = (OpSysMajorVer == 7)
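
If your submit file already has a requirements line, keep in mind that HTCondor uses only one requirements expression per submit file (a later line replaces an earlier one), so the conditions need to be combined with && rather than listed on separate lines. A sketch, where the architecture check is only an assumed example of a condition you might already have:

```
# Combine all conditions in a single requirements line.
# (Arch == "X86_64") stands in for whatever condition you already had.
requirements = (OpSysMajorVer == 7) && (Arch == "X86_64")
```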

C. Using Your Desired Software

There are two ways to run software inside a Docker job:

  1. Use a script or executable program already inside the container.
  2. Use a script that you transfer into the container, using software installed in the container.

Instructions for each of these use cases are below.

1. Using an executable already inside the container

List the full path to this executable in your submit file.

executable = /usr/bin/R
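
Command-line options for the program go on a separate arguments line. The sketch below assumes a hypothetical R script, analysis.R, that is transferred with the job and handed to R through its --file option; the script name is illustrative only.

```
# /usr/bin/R is provided by the container image.
executable = /usr/bin/R
# analysis.R is a hypothetical script submitted along with the job.
arguments = --no-save --file=analysis.R
transfer_input_files = analysis.R
```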

2. Using your own script

Give the name of your script as the "executable", just as you would for a normal job submission. Unlike in many of our guides, this script doesn't need to be a "wrapper" script written in a language like bash; instead, it can use something like Python or R directly from inside the container.

executable = run_tensorflow.py
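
For this to work, the script has to be runnable on its own, which in practice means its first line names an interpreter that exists inside the container. A sketch of the relevant submit file lines, assuming the image provides python3 on its PATH and that the script reads a hypothetical input file, data.csv:

```
# run_tensorflow.py is transferred into the container and run directly.
# Its first line should be an interpreter line such as:
#   #!/usr/bin/env python3
executable = run_tensorflow.py
# data.csv is a hypothetical input file; list your own inputs here.
transfer_input_files = data.csv
arguments = data.csv
```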