The basic command to submit a job is sbatch.
```
test@login01:~$ sbatch sleep.sh
Submitted batch job 175
```
Once the job is registered, the scheduler returns its job ID. Keep this value handy: if something goes wrong you will need it to debug what happened.
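If you need the job ID inside a script (for instance to build file names or job dependencies), sbatch's --parsable flag prints only the ID. A minimal sketch, reusing the sleep.sh example from above:

```bash
# --parsable makes sbatch print just the job ID instead of the
# "Submitted batch job NNN" message, so it is easy to capture.
jobid=$(sbatch --parsable sleep.sh)
echo "Submitted job ${jobid}"
```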
Although you can specify options directly when calling sbatch, it is strongly advised to provide all of them inside a job definition file (in these examples the file is called "job.sh"). This file contains the command you wish to execute together with the SLURM resource request options you need.
```bash
#!/bin/bash
#SBATCH -J prova_uname10
#SBATCH -p short
#SBATCH -N 1
#SBATCH -n 2
#SBATCH --chdir=/homedtic/test/slurm_jobs
#SBATCH --time=2:00
#SBATCH -o %N.%J.out # STDOUT
#SBATCH -e %N.%j.err # STDERR
uname -a >> /homedtic/test/uname.txt
```
The "-J" option sets the name of the job. This name is used to create the output log files for the job. We recommend using a capital letter for the job name is order to distinguish these log files from the other files in your working directory. This makes it easier to delete the log files later.
The "-p" option requests the queue in which the job should run.
The "-N" option Request that a minimum of nodes be allocated to this job.
The "-n" option Request the number of tasks per node.
The “–time” option specify a a limit on the total run time of the job allocation.
The “–chdir” option Set the working directory of the batch script to directory before it is executed.
The "-o" option instruct Slurm to connect the batch script's standard output directly to the file name specified in the "filename pattern".
The "-e" option instruct Slurm to connect the batch script's standard error directly to the file name specified in the "filename pattern".
We can monitor how our job is doing with the scontrol show job command:
```
test@login01:~$ scontrol show job 173
JobId=173 JobName=prova_uname10
   UserId=test(1039) GroupId=info_users(10376) MCS_label=N/A
   Priority=6501 Nice=0 Account=info QOS=normal
   JobState=PENDING Reason=Resources Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=08:00:00 TimeMin=N/A
   SubmitTime=2017-11-27T16:37:47 EligibleTime=2017-11-27T16:37:47
   StartTime=2017-11-27T16:50:21 EndTime=Unknown Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=medium AllocNode:Sid=node009:5743
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null) SchedNodeList=node[004,020]
   NumNodes=2-2 NumCPUs=4 NumTasks=4 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=4,mem=4096,node=2
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryCPU=1024M MinTmpDiskNode=0
   Features=(null) Gres=(null) Reservation=(null)
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/homedtic/test/sleep.sh
   WorkDir=/homedtic/test
   StdErr=/homedtic/test/slurm.%N.173.err
   StdIn=/dev/null
   StdOut=/homedtic/test/slurm.%N.173.out
   Power=
```
The scontrol show job output shows, among other things, the ID of our job (JobId), its priority (Priority), who launched it (UserId), the state of the job (JobState), when it was submitted and to which partition (SubmitTime and Partition), and how many CPUs it is requesting (NumCPUs and TRES).
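For a quicker overview, squeue lists only your pending and running jobs, and the scontrol output can be filtered down to the fields you usually care about. A small sketch (the grep fields are just a suggestion):

```bash
# List all of your own jobs with their current state and reason.
squeue -u $USER

# Extract only the most relevant lines for job 173.
scontrol show job 173 | grep -E 'JobState|RunTime|TimeLimit|NodeList'
```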
Each queue has specific policies that enforce how long jobs are allowed to execute: short* queues allow up to 2 hours, medium* queues allow up to 8 hours, and high* queues have no runtime limit. When you submit a job, you indicate, either implicitly or explicitly, how long the job is expected to run. The maximum runtime is indicated with the "--time" option, as in the following script:
```bash
#!/bin/bash
#SBATCH -J prova_uname10
#SBATCH -p short
#SBATCH --time=2:00
ps -ef | grep slurm
srun -n8 uname -a >> $HOME/uname.txt
```
This is an example of a time limit: the script will be stopped when it reaches 2 minutes of execution. If the requested time limit exceeds the partition's time limit, the job will be left in a PENDING state (possibly indefinitely).
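The --time option accepts the standard Slurm time formats; a few examples are sketched below (only one --time line would appear in a real script):

```bash
#SBATCH --time=2:00        # 2 minutes (minutes:seconds)
#SBATCH --time=30          # 30 minutes
#SBATCH --time=1:30:00     # 1 hour 30 minutes (hours:minutes:seconds)
#SBATCH --time=1-12:00:00  # 1 day and 12 hours (days-hours:minutes:seconds)
```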
By default, if not specified otherwise, the scheduler redirects the output of any job you launch to a file in your current directory named slurm-<jobid>.out. After a few executions and tests, your $HOME will probably fill up with these files.
As a general rule, you are advised to use the following flags to redirect the output and error files:
Flag | Request | Comment |
---|---|---|
-e <path>/<filename> | Redirect error file | The system will create the given file on the path specified and will redirect the job's error file here. If name is not specified, default name will apply. |
-o <path>/<filename> | Redirect output file | The system will create the given file on the path specified and will redirect the job's output file here. If name is not specified, default name will apply. |
--chdir <path> | Change the working directory | The batch script is executed in the given directory, so output and error files with relative paths are created there instead of in the directory from which 'sbatch' was called. |
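The <filename> part may contain replacement patterns that Slurm expands when the job starts; the ones used in the next example are sketched below:

```bash
#SBATCH -o slurm.%N.%j.%u.out   # %N = node name, %j = job ID, %u = user name
#SBATCH -e slurm.%N.%j.%u.err   # same pattern for the error file
```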
Of course, we can put these options in our job definition file:
```bash
#!/bin/bash
#SBATCH -J prova_dani_uname10
#SBATCH -p short
#SBATCH -N 1
#SBATCH -n 2 # number of cores
#SBATCH --chdir=/homedtic/test/slurm_jobs
#SBATCH --time=2:00
#SBATCH -o slurm.%N.%J.%u.out # STDOUT
#SBATCH -e slurm.%N.%J.%u.err # STDERR
ps -ef | grep slurm
```
When launching the job, we will see the output files created in /homedtic/test/slurm_jobs. If no error is reported, the error file is still created, but left empty.
```
test@node009:~/slurm_jobs$ ls
slurm.node005.206.test.err  slurm.node005.206.test.out
test@node009:~/slurm_jobs$ cat slurm.node005.206.test.out
root      1138     1  0 Nov24 ?        00:00:00 /usr/sbin/slurmd
root      6714     1  0 20:04 ?        00:00:00 slurmstepd: [206]
test      6719  6714  0 20:04 ?        00:00:00 /bin/bash /var/spool/slurmd/job00206/slurm_script
test      6724  6719  0 20:04 ?        00:00:00 grep slurm
test@node009:~/slurm_jobs$
```
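Once a job has finished it disappears from the queue, but, assuming accounting is enabled on the cluster, its record can still be inspected with sacct. A minimal sketch for job 206:

```bash
# Final state, exit code and elapsed time of job 206.
sacct -j 206 --format=JobID,JobName,Partition,State,ExitCode,Elapsed
```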
We can modify the requirements of a job while it is waiting to be processed. Once it is running, the only option left is to cancel it:
Command | Request | Comment |
---|---|---|
scancel <job_id> | Delete the job | The system will remove the job and all its dependencies from the queues and the execution hosts |
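For example, a pending job's time limit or partition can be changed with scontrol update, and scancel removes it from the queue. A sketch using the job ID from the submission example above:

```bash
# Change the requested time limit and partition of a job that is still pending.
scontrol update JobId=175 TimeLimit=4:00 Partition=medium

# Remove the job from the queue.
scancel 175
```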
Below is a table of job reason codes:
Job Reason Code | Description |
---|---|
AssociationJobLimit | The job's association has reached its maximum job count. |
AssociationResourceLimit | The job's association has reached some resource limit. |
AssociationTimeLimit | The job's association has reached its time limit. |
BadConstraints | The job's constraints can not be satisfied. |
BeginTime | The job's earliest start time has not yet been reached. |
BlockFreeAction | An IBM BlueGene block is being freed and can not allow more jobs to start. |
BlockMaxError | An IBM BlueGene block has too many cnodes in error state to allow more jobs to start. |
Cleaning | The job is being requeued and still cleaning up from its previous execution. |
Dependency | This job is waiting for a dependent job to complete. |
FrontEndDown | No front end node is available to execute this job. |
InactiveLimit | The job reached the system InactiveLimit. |
InvalidAccount | The job's account is invalid. |
InvalidQOS | The job's QOS is invalid. |
JobHeldAdmin | The job is held by a system administrator. |
JobHeldUser | The job is held by the user. |
JobLaunchFailure | The job could not be launched. This may be due to a file system problem, invalid program name, etc. |
Licenses | The job is waiting for a license. |
NodeDown | A node required by the job is down. |
NonZeroExitCode | The job terminated with a non-zero exit code. |
PartitionDown | The partition required by this job is in a DOWN state. |
PartitionInactive | The partition required by this job is in an Inactive state and not able to start jobs. |
PartitionNodeLimit | The number of nodes required by this job is outside of its partition's current limits. Can also indicate that required nodes are DOWN or DRAINED. |
PartitionTimeLimit | The job's time limit exceeds its partition's current time limit. |
Priority | One or more higher priority jobs exist for this partition or advanced reservation. |
Prolog | Its PrologSlurmctld program is still running. |
QOSJobLimit | The job's QOS has reached its maximum job count. |
QOSResourceLimit | The job's QOS has reached some resource limit. |
QOSTimeLimit | The job's QOS has reached its time limit. |
ReqNodeNotAvail | Some node specifically required by the job is not currently available. The node may currently be in use, reserved for another job, in an advanced reservation, DOWN, DRAINED, or not responding. Nodes which are DOWN, DRAINED, or not responding will be identified as part of the job's "reason" field as "UnavailableNodes". Such nodes will typically require the intervention of a system administrator to make available. |
Reservation | The job is waiting for its advanced reservation to become available. |
Resources | The job is waiting for resources to become available. |
SystemFailure | Failure of the Slurm system, a file system, the network, etc. |
TimeLimit | The job exhausted its time limit. |
QOSUsageThreshold | Required QOS threshold has been breached. |
WaitingForScheduling | No reason has been set for this job yet. Waiting for the scheduler to determine the appropriate reason. |
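To see which of these reason codes applies to one of your pending jobs, ask squeue for the reason column. A minimal sketch (175 is the job ID from the earlier example):

```bash
# %i = job ID, %j = job name, %T = state, %R = reason (or node list when running).
squeue -j 175 -o "%.10i %.20j %.10T %.30R"
```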