Useful Options for condor_q
The condor_q command can be used for much more than just checking
whether your jobs are running! Read on to learn how you can use
condor_q to answer many common questions about running jobs.
- View all of your jobs (old condor_q output).
- View jobs from all users.
- Determine why jobs are on hold.
- Find out where jobs are running.
- View jobs by DAG.
- View all details about a job.
- View specific details about a job using auto-format.
- View only specific types of jobs using a constraint.
1. Default condor_q output
As of July 19, 2016, the default condor_q output shows
a single user's jobs, grouped in "batches", as shown below:
OWNER BATCH_NAME SUBMITTED DONE RUN IDLE HOLD TOTAL JOB_IDS
alice CMD: sb 6/22 13:05 _ 32 _ _ _ 14297940.23-99
alice DAG: 14306351 6/22 13:47 27 113 65 _ 205 14306411.0 ...
alice CMD: job.sh 6/22 13:56 _ _ 12 _ _ 14308195.6-58
alice DAG: 14361197 6/22 16:04 995 1 _ _ 1000 14367836.0
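HTCondor will automatically group jobs into "batches" for this display.
However, it's also possible for you to specify groups of jobs as a
"batch" yourself. You can either:
- Add the line batch_name = CoolJobs to your submit file, or
- Use the -batch-name option when submitting: condor_submit submit_file_name -batch-name CoolJobs
Either option will create a batch of jobs with the label "CoolJobs".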
2. View all jobs.
To display more detailed condor_q output (where each job is
listed on a separate line), combine the -nobatch option with the
batch name or any existing grouping constraint (a ClusterId or other
"-constraint" options; see below for more on constraints).
Looking at a batch of jobs with the same ClusterId would look like this:
[alice@submit]$ condor_q -nobatch 195
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
195.10 alice 6/22 13:00 0+00:00:00 H 0 0.0 job.sh
195.14 alice 6/22 13:00 0+00:01:44 R 0 0.0 job.sh
195.16 alice 6/22 13:00 0+00:00:26 R 0 0.0 job.sh
195.39 alice 6/22 13:00 0+00:00:05 R 0 0.0 job.sh
195.40 alice 6/22 13:00 0+00:00:00 I 0 0.0 job.sh
195.41 alice 6/22 13:00 0+00:00:00 I 0 0.0 job.sh
195.53 alice 6/22 13:00 0+00:00:00 I 0 0.0 job.sh
195.57 alice 6/22 13:00 0+00:00:00 I 0 0.0 job.sh
195.58 alice 6/22 13:00 0+00:00:00 I 0 0.0 job.sh
9 jobs; 0 completed, 0 removed, 5 idle, 3 running, 1 held, 0 suspended
Other uses of -nobatch, for a DAG or a batch name, would look like this:
[alice@submit]$ condor_q -nobatch -dag 123457
[alice@submit]$ condor_q -nobatch -constraint 'JobBatchName == "mybatchname"'
This was the default view for condor_q from January 2016 until July 2016.
3. View jobs from all users.
By default, condor_q shows information only about your own jobs.
To get information about all jobs in the queue, type:
[alice@submit]$ condor_q -all
This will show a list of all job batches in the queue.
To see a list of all jobs (individually, not in batches)
for all users, combine the -all and -nobatch options with
condor_q. This was the default view for condor_q before January 2016.
4. Determine why jobs are on hold.
If your jobs have gone on hold, you can see the hold reason by running
one of the following:
[alice@submit]$ condor_q -hold
[alice@submit]$ condor_q -hold JobId
The first will show you the hold reasons for all of your jobs that
are on hold; the second will show you the hold reason for a specific
job. If you aren't sure what your hold reason means, see our
documentation or email firstname.lastname@example.org.
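When many jobs are held, it can help to group them by reason rather than read the list job by job. The sketch below is one possible way to do that, using the -constraint and -af options covered in sections 8 and 9. The function name count_hold_reasons is a hypothetical helper, not part of HTCondor; it only tallies identical lines read from standard input.

```shell
# Hypothetical helper: count identical hold reasons read from stdin
# and print "count<TAB>reason", most frequent first.
count_hold_reasons() {
    awk '{ count[$0]++ }
         END { for (r in count) printf "%d\t%s\n", count[r], r }' |
        sort -rn
}

# Assumed usage on a submit node (requires condor_q):
#   condor_q -constraint 'JobStatus == 5' -af HoldReason | count_hold_reasons
```

Because the helper only reads text, it can also be run on hold reasons saved to a file earlier.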
5. Find out where jobs are running.
To see which computers your jobs are running on, use:
[alice@submit]$ condor_q -nobatch -run
428.0 alice 6/22 17:27 0+00:07:17 email@example.com
428.1 alice 6/22 17:27 0+00:07:11 firstname.lastname@example.org
428.2 alice 6/22 17:27 0+00:07:16 email@example.com
428.3 alice 6/22 17:27 0+00:07:16 firstname.lastname@example.org
428.5 alice 6/22 17:27 0+00:07:16 email@example.com
428.7 alice 6/22 17:27 0+00:07:16 firstname.lastname@example.org
428.8 alice 6/22 17:27 0+00:07:16 email@example.com
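To see how running jobs are spread across machines, one option is to tally the RemoteHost job attribute, which HTCondor records as slot@machine for a running job. This is a minimal sketch under that assumption: jobs_per_host is a hypothetical helper name, and the host names in the usage comment are made up for illustration.

```shell
# Hypothetical helper: read "slot@machine" values on stdin, drop the
# slot prefix, and print "count<TAB>machine", busiest machine first.
jobs_per_host() {
    awk '{ sub(/^[^@]*@/, "", $1); count[$1]++ }
         END { for (h in count) printf "%d\t%s\n", count[h], h }' |
        sort -rn
}

# Assumed usage on a submit node (requires condor_q):
#   condor_q -constraint 'JobStatus == 2' -af RemoteHost | jobs_per_host
# Illustrative input such as
#   slot1@e001.example.org
#   slot2@e001.example.org
#   slot1@e002.example.org
# would list e001.example.org first with a count of 2.
```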
6. View jobs by DAG.
If you have submitted multiple DAGs to the queue, it can be hard to
tell which jobs belong to which DAG. The -dag option to
condor_q will sort your queue output by DAG:
[alice@submit]$ condor_q -nobatch -dag
ID OWNER/NODENAME SUBMITTED RUN_TIME ST PRI SIZE CMD
460.0 alice 11/18 16:51 0+00:00:17 R 0 0.3 condor_dagman -p 0
462.0 |-0 11/18 16:51 0+00:00:00 I 0 0.0 print.sh
463.0 |-1 11/18 16:51 0+00:00:00 I 0 0.0 print.sh
464.0 |-2 11/18 16:51 0+00:00:00 I 0 0.0 print.sh
461.0 alice 11/18 16:51 0+00:00:09 R 0 0.3 condor_dagman -p 0
465.0 |-0 11/18 16:51 0+00:00:00 I 0 0.0 print.sh
466.0 |-1 11/18 16:51 0+00:00:00 I 0 0.0 print.sh
467.0 |-2 11/18 16:51 0+00:00:00 I 0 0.0 print.sh
8 jobs; 0 completed, 0 removed, 6 idle, 2 running, 0 held, 0 suspended
7. View all details about a job.
Each job you submit has a series of attributes that are tracked
by HTCondor. You can see the full set of attributes for a single
job by using the "long" option for condor_q like so:
[alice@submit]$ condor_q -l JobId
Iwd = "/home/alice/analysis/39909"
JobPrio = 0
RequestCpus = 1
JobStatus = 1
ClusterId = 19997268
JobUniverse = 5
RequestDisk = 10485760
RequestMemory = 4096
DAGManJobId = 19448402
Attributes that are often useful for checking on jobs are:
- Iwd: the job's submission directory on the submit node
- UserLog: the log file for a job
- RequestMemory, RequestDisk: how much memory and disk you've requested per job
- MemoryUsage: how much memory the job has used so far
- JobStatus: numerical code indicating whether a job is idle, running, or held
- HoldReason: why a job is on hold
- DAGManJobId: for jobs managed by a DAG, this is the JobId of the parent DAG
8. View specific details about a job using auto-format
If you would like to see specific attributes (see above) for a job or group of
jobs, you can use the "auto-format" (-af) option to condor_q, which will print out
only the attributes you name for a single job or group of jobs.
For example, if I would like to see the amount of memory and disk I've
requested for all of my jobs, and how much memory is currently being used,
I can run:
[alice@submit]$ condor_q -af RequestMemory RequestDisk MemoryUsage
1 325 undefined
1 325 undefined
2000 1000 245
2000 1000 220
2000 1000 245
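Because auto-format prints one plain line per job, its output is easy to pass to other tools. As a rough sketch, the function below totals the first and third columns of the output shown above and skips jobs whose MemoryUsage is still undefined. The name summarize_memory is hypothetical, and the function assumes the attributes were requested in exactly the order RequestMemory RequestDisk MemoryUsage.

```shell
# Hypothetical helper: read "RequestMemory RequestDisk MemoryUsage"
# lines on stdin and print totals for requested and used memory.
# Jobs that have not reported usage yet show "undefined" and are skipped.
summarize_memory() {
    awk '{ requested += $1 }
         $3 != "undefined" { used += $3; reporting++ }
         END { printf "requested=%d used=%d reporting=%d\n",
                      requested, used, reporting }'
}

# Assumed usage on a submit node (requires condor_q):
#   condor_q -af RequestMemory RequestDisk MemoryUsage | summarize_memory
```

For the five jobs listed above, this would print requested=6002 used=710 reporting=3.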
9. Constraining the output of condor_q.
If you would like to find jobs that meet certain conditions, you can use
condor_q's "constraint" option. For example, suppose you want
to find all of the jobs associated with the DAGMan Job ID "234567". You
can search using:
[alice@submit]$ condor_q -constraint "DAGManJobId == 234567"
To use a name (for example, a batch name) as a constraint, you'll
need to use multiple sets of quotation marks:
[alice@submit]$ condor_q -constraint 'JobBatchName == "MyJobs"'
One common use of constraints is to find all
jobs that are running, held, or idle. To do this, use
a constraint with the JobStatus attribute and the appropriate
status number; the status codes can be found in
Appendix A of the HTCondor Manual.
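If the status numbers are hard to remember, a small wrapper can translate names into codes. The sketch below uses the standard JobStatus values (1 idle, 2 running, 3 removed, 4 completed, 5 held); the function names job_status_code and q_by_state are hypothetical and are not part of HTCondor.

```shell
# Hypothetical helper: map a state name to its numeric JobStatus code.
job_status_code() {
    case "$1" in
        idle)      echo 1 ;;
        running)   echo 2 ;;
        removed)   echo 3 ;;
        completed) echo 4 ;;
        held)      echo 5 ;;
        *) echo "unknown state: $1" >&2; return 1 ;;
    esac
}

# Hypothetical wrapper: list your jobs in the named state, one per line.
# Requires condor_q, so it only works on a submit node.
q_by_state() {
    code=$(job_status_code "$1") || return 1
    condor_q -nobatch -constraint "JobStatus == $code"
}
```

With these defined, q_by_state held would run condor_q -nobatch -constraint "JobStatus == 5".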
Remember condor_q -hold from before?
In the background, the -hold option is constraining the list of jobs
to jobs that are on hold (using the JobStatus attribute) and then
printing out the HoldReason attribute. Try running:
[alice@submit]$ condor_q -constraint "JobStatus == 5" -af ClusterId ProcId HoldReason
You should see something very similar to running