Tracking jobs in Slurm
Here are some useful commands for tracking current and historical jobs on a high-performance computing (HPC) system.
The Slurm Workload Manager, formerly known as Simple Linux Utility for Resource Management, or simply Slurm, is a free and open-source job scheduler for Linux and Unix-like kernels, used by many of the world’s supercomputers and computer clusters.
I use Slurm daily to check the status of my submitted jobs and to review past ones.
Viewing Current Jobs
To view my current jobs, I use the following command:
squeue -u $USER -o "%.18i %.9P %.35j %.8u %.8T %.10M %.10l %.6D %20R %80Z"
This lists each of my queued and running jobs, with the following columns:
- Job ID
- Partition
- Name
- User
- State
- Time Used
- Time Limit
- Number of Nodes
- Node List (for running jobs) or Reason (for pending jobs)
- Working Directory
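When many jobs are queued, a per-state summary can be quicker to scan than the full listing. The following is a minimal sketch rather than a definitive recipe: count_job_states is a hypothetical helper name (not part of Slurm), and it relies only on the squeue options -h (no header) and -o "%T" (full state name).

```shell
# Hypothetical helper: count my current jobs by state.
# -h suppresses the header row; %T prints the state in full (e.g. RUNNING).
count_job_states() {
    squeue -u "$USER" -h -o "%T" | sort | uniq -c
}
```

Running count_job_states prints one line per state, with the number of jobs in that state in the first column.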
Viewing Historical Jobs
To view historical jobs, use the following command:
sacct -u $USER \
--starttime=$(date -d "3 months ago" +%Y-%m-%d) \
--format=JobID,JobName%30,Partition,AllocCPUS,State,ExitCode,Elapsed,Start,End \
-X
This displays the following information for all of my jobs from the past 3 months, one line per job (the -X flag omits individual job steps):
- Job ID
- Job Name
- Partition
- Number of Allocated CPUs
- State
- Exit Code
- Time Elapsed
- Start Time
- End Time
If a longer timespan is required, change the date passed to the --starttime option.
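To avoid editing the command each time, the window can be turned into a parameter. This is a sketch under stated assumptions: history_start and job_history are hypothetical helper names, and the date arithmetic assumes GNU date (the -d option), as in the command above.

```shell
# Hypothetical helper: print the date N months before today as YYYY-MM-DD.
# Defaults to 3 months; requires GNU date for the -d option.
history_start() {
    months="${1:-3}"
    date -d "${months} months ago" +%Y-%m-%d
}

# Hypothetical wrapper: the same sacct query, with an adjustable window.
job_history() {
    sacct -u "$USER" \
        --starttime="$(history_start "$1")" \
        --format=JobID,JobName%30,Partition,AllocCPUS,State,ExitCode,Elapsed,Start,End \
        -X
}
```

For example, job_history 12 would show the past year, while job_history with no argument keeps the 3-month default.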