HPC High Performance Computing: 4.5. Submitting interactive jobs

Submitting interactive jobs

Sometimes users need more computational power to run commands from the command line. As a general rule, it is NOT a good idea to launch such commands on the login node: the login node is a server designed to manage user sessions, it has limited resources, and it does not handle CPU-intensive workloads well. Additionally, intensive CPU usage on the login node reduces session speed and responsiveness for all other users.

To meet this need, users can request an interactive job on a compute node with the "interactive" command.

test@login01:~$ interactive -p high -J prova_doc
salloc: Granted job allocation 76153
test@node021:~$ 

test@node021:~$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
             76153      high prova_do     test  R       0:19      1 node021

These are the options of the interactive command:

 -A: account (non-default account)
 -p: partition (default: )
 -a: architecture (default: ; values: hsw=Haswell, skl=SkyLake, wsw=Warsaw)
 -n: number of tasks (default: 1)
 -c: number of CPU cores (default: 1)
 -m: amount of memory (GB) per core (default: 1 GB)
 -e: email address to which the session-start notification is sent
 -r: specify a reservation name
 -w: target node
 -J: job name
 -x: binary that you want to run interactively
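For example, several of the options above can be combined to request four cores with 2 GB of memory per core on the high partition (the job name here is purely illustrative):

test@login01:~$ interactive -p high -c 4 -m 2 -J mytest
salloc: Granted job allocation ...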

 

Another command similar to interactive is "salloc":

test@login01:~/parallel$ salloc --partition=high --nodes=2 --time=00:30:00
salloc: Granted job allocation 818
test@node009:~$ source /etc/profile.d/lmod.sh
test@node009:~$ source /etc/profile.d/easybuild.sh
test@node009:~$ module av
---------------------------------------- /soft/easybuild/lmod/lmod/modulefiles/Core --------------------------


test@node005:~/parallel$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               818      high       sh  test  R       0:05      2 node[005-006]
test@node005:~/parallel$ exit
exit
salloc: Relinquishing job allocation 818
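Note that inside an salloc session, commands typed at the prompt run only on the first allocated node. To execute a command on every node of the allocation, launch it with srun; here hostname is used purely as an illustration, with the two-node allocation from the example above:

test@node005:~/parallel$ srun hostname
node005
node006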

 

If we ask for a resource that is not available at that time, Slurm will indicate that the job is queued and waiting for free resources:

test@node009:~$ salloc --partition=high --nodes=2 --time=00:30:00 --gres=gpu:1
salloc: Pending job allocation 76162
salloc: job 76162 queued and waiting for resources
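If you prefer not to wait, the pending request can be cancelled by pressing Ctrl+C in the salloc session, or from another shell with scancel (using the job ID from the example above):

test@login01:~$ scancel 76162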

 

Another way to create an interactive session is using "srun":

test@node009:~# srun --nodes=1 --gres=gpu:1 --pty bash -i 
test@node024:~# squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
             21271      high GPU_burn  test PD       0:00      1 (Resources)
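As with salloc, typing exit in the interactive shell ends the session and releases the allocated resources (prompt and node names follow the example above):

test@node024:~# exit
exit
test@node009:~#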

 

IMPORTANT INFORMATION:

If you use the salloc command, the shell started on the compute node does not initialize the module system automatically, so you must run:

source /etc/profile.d/lmod.sh

source /etc/profile.d/easybuild.sh