Reviewing Job Information Using SLURM
The HPC Cluster uses SLURM to manage jobs on the HPC Cluster. This page describes how to review job performance and monitor other job information.
Table of Contents
The following assumes that you have been granted access to the HPC cluster 
and can log into the head node spark-login.chtc.wisc.edu. If this is not
the case, please see the CHTC account application page or email
the facilitation team at chtc@cs.wisc.edu.
View Job Performance with seff
The seff command will print out a summary of usage and efficiency metrics for 
a specific job. The usage and output looks like this:
[alice@login]$ seff 79950
Job ID: 79950
Cluster: spark_el9
User/Group: alice/alice
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 16
CPU Utilized: 00:00:08
CPU Efficiency: 1.61% of 00:08:16 core-walltime
Job Wall-clock time: 00:00:31
Memory Utilized: 876.69 MB (estimated maximum)
Memory Efficiency: 2.74% of 31.25 GB (1.95 GB/core)
View Job Information with sacct
SLURM saves jobs information in a database that can be queried using the sacct command.
If you are having trouble viewing output from
saccttry running this command first[alice@login]$ sacct --start=2018-01-01
How To Select Jobs
By default sacct shows only your jobs, that ran or were submitted on the current 
date. See the following list for different ways to select groups of jobs to review. Some of the options – especially the time and user options – can both be added to the same query.
- To display information about a specific job or list of jobs use -jor--jobsfollowed by a job number or comma separated list of job numbers.
[alice@login]$ sacct --jobs job1,job2,job3
- To select information about jobs in a certain date range use --startand--endWithout it,sacctwill only return jobs from the current day.
[alice@login]$ sacct --start=YYYY-MM-DD
- To select information about jobs in a certain time range use --starttimeand--endtimeThe default start time is 00:00:00 of the current day, unless used with-j, then the default start time is Unix Epoch 0. The default end time is time of running the command. Valid time formats areHH:MM[:SS] [AM|PM] MMDD[YY] or MM/DD[/YY] or MM.DD[.YY] MM/DD[/YY]-HH:MM[:SS] YYYY-MM-DD[THH:MM[:SS]]
[alice@login]$ sacct --starttime 08/23 --endtime 08/24
- To display another user’s jobs use --user
[alice@login]$ sacct --user BuckyBadger
Displaying Specific Fields
sacct can display different fields about your jobs. You can use the --helpformat flag to get a full list.
[alice@login]$ sacct --helpformat
Once you know what fields to display, the format flag will allow you to list the ones you want to see:
[alice@login]$ sacct --format=JobId,Partition,NCpus,NNodes,State,Elapsed
Recommended Fields
When looking for information about your jobs CHTC recommends using these fields
elapsed
end
exitcode
jobid
ncpus
nnodes
nodelist
ntasks
partition
start
state
submit
user
Other Useful Options
To only show statistics relevant to the job allocation itself, not taking steps into consideration, use -X. This can be useful when trying to figure out which part of a job errored out.
[alice@login]$ sacct -X
A Sample sacct Query
For example to view all of your jobs since January 1, 2024, printing out which partition you used, how many nodes, and what the final status of the job was, use:
[alice@login]$ sacct -X --start=2024-01-01 --format=jobid,partition,nnodes,state