HPC High Performance Computing: 9.3. Running Docker Containers Using GPU

Interactive use

To run Docker containers that use GPUs, three conditions must be met:

1. Reserve the GPU resources our container needs through the queue manager

2. Use an image that is compatible with GPU use

3. Load the socker module
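
As a rough illustration, conditions 1 and 3 can be checked from a shell before starting a container. This is only a sketch: the function name check_gpu_container_prereqs is hypothetical, and it assumes the site's Slurm configuration exports CUDA_VISIBLE_DEVICES when GPUs are reserved, which may not hold on every cluster. Condition 2 (image compatibility) cannot be checked this way.

```shell
# Hypothetical helper: reports which prerequisites appear to be missing.
# Assumes Slurm sets SLURM_JOB_ID inside a job and CUDA_VISIBLE_DEVICES
# when GPUs were reserved (depends on the site's GPU configuration).
check_gpu_container_prereqs() {
    status=0
    if [ -z "${SLURM_JOB_ID:-}" ]; then
        echo "missing: not inside a Slurm job allocation"
        status=1
    fi
    if [ -z "${CUDA_VISIBLE_DEVICES:-}" ]; then
        echo "missing: no GPU reserved for this job"
        status=1
    fi
    if ! command -v socker >/dev/null 2>&1; then
        echo "missing: socker not found (try: module load socker)"
        status=1
    fi
    [ "$status" -eq 0 ] && echo "all prerequisites met"
    return "$status"
}
```

Calling check_gpu_container_prereqs on the login node would normally list every item as missing, since no allocation exists there yet.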

To do that in an interactive session:

First, we run the interactive command from the login node so that we are sent to a compute node. For example, if we need two GPUs on an Intel node, we specify the architecture with the "-a" parameter and the number of GPUs with the "-g" parameter:

test@login01:~$ interactive -a intel -g 2
salloc: Pending job allocation 459683
salloc: job 459683 queued and waiting for resources
salloc: job 459683 has been allocated resources
salloc: Granted job allocation 459683
test@node031:~$

 

On the other hand, if we need an AMD node with one GPU, we run:

test@node032:~$ interactive -a amd -g 1
salloc: Pending job allocation 459629
salloc: job 459629 queued and waiting for resources
salloc: job 459629 has been allocated resources
salloc: Granted job allocation 459629
test@node025:~$
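
Once on the compute node, one way to confirm that the allocation matches the "-g" value is to count the entries in CUDA_VISIBLE_DEVICES. The sketch below assumes the site's Slurm setup exports that variable as a comma-separated list for reserved GPUs; the function name count_allocated_gpus is hypothetical.

```shell
# Hypothetical helper: counts comma-separated GPU entries.
# Uses the first argument if given, otherwise CUDA_VISIBLE_DEVICES.
count_allocated_gpus() {
    devices=${1-${CUDA_VISIBLE_DEVICES-}}
    if [ -z "$devices" ]; then
        echo 0
        return 0
    fi
    old_ifs=$IFS
    IFS=,
    # Split the list on commas into positional parameters.
    set -- $devices
    IFS=$old_ifs
    echo "$#"
}

# Example: a list such as "0,1" counts as two GPUs.
count_allocated_gpus "0,1"
```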

 

The next step is to load the socker module:

test@node028:~$ module load socker
test@node028:~$

 

Then we can run commands using the image:

test@node028:~$ module load socker
test@node028:~$ socker run nvidia/cuda:8.0-runtime-ubuntu14.04 nvidia-smi
Tue Jan 29 18:49:28 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.90                 Driver Version: 384.90                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN X (Pascal)    On   | 00000000:04:00.0 Off |                  N/A |
| 25%   45C    P8    17W / 250W |      2MiB / 12189MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
test@node028:~$
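
For scripted checks, the table above is awkward to parse. A possible alternative is nvidia-smi -L, which prints one line per GPU; the sketch below counts those lines and compares the count with the number of GPUs requested. The function name check_gpu_count is hypothetical, and it is an assumption that socker passes the extra "-L" argument through to the container command.

```shell
# Hypothetical helper: reads `nvidia-smi -L` style output on stdin and
# compares the number of "GPU <n>: ..." lines with the expected count.
check_gpu_count() {
    expected=$1
    # grep -c exits non-zero when it counts 0 lines, hence the "|| true".
    found=$(grep -c '^GPU [0-9]' || true)
    if [ "$found" -eq "$expected" ]; then
        echo "OK: $found GPU(s) visible"
    else
        echo "MISMATCH: expected $expected, found $found"
        return 1
    fi
}

# Intended use on a compute node (not run here):
#   socker run nvidia/cuda:8.0-runtime-ubuntu14.04 nvidia-smi -L | check_gpu_count 1
```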

sbatch use

In sbatch mode we have to prepare a script that requests the GPU resources (the same resources as in interactive mode, but specified in the script):

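#!/bin/bash
#SBATCH -J prova13            # job name
#SBATCH -p high               # partition (queue)
#SBATCH -N 1                  # number of nodes
#SBATCH --gres=gpu:1          # reserve one GPU
#SBATCH -C amd                # require a node with the amd feature
#SBATCH --workdir=/homedtic/dvaldes/Scripts/gpu_maxwell  # working directory
#SBATCH -o slurm.%N.%J.%u.out # STDOUT
#SBATCH -e slurm.%N.%J.%u.err # STDERR

# Load the modules, then run nvidia-smi inside the container
module load CUDA/8.0.61
module load socker
socker run nvidia/cuda:8.0-runtime-ubuntu14.04 nvidia-smi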
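
To show how the interactive parameters map onto the script, here is a minimal sketch that prints a job script for a given architecture and GPU count. The function name make_gpu_job is hypothetical, and it is an assumption that the constraint names accepted by "-C" match the values used with "interactive -a" (for example intel); the partition, job name and working directory are left out and would need to be added for a real job.

```shell
# Hypothetical generator: prints a minimal GPU job script to stdout.
# $1 = node feature passed to -C (assumed: intel or amd)
# $2 = number of GPUs passed to --gres
make_gpu_job() {
    arch=$1
    gpus=$2
    cat <<EOF
#!/bin/bash
#SBATCH -N 1
#SBATCH --gres=gpu:${gpus}
#SBATCH -C ${arch}

module load socker
socker run nvidia/cuda:8.0-runtime-ubuntu14.04 nvidia-smi
EOF
}

# Example: the batch counterpart of "interactive -a intel -g 2".
make_gpu_job intel 2
```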
                                                                            

We submit the script with sbatch and follow its output file:


test@login01:~/Scripts/gpu_maxwell$ sbatch gpu_maxwell2.sh
Submitted batch job 454235
test@login01:~/Scripts/gpu_maxwell$ tail -f slurm.node028.454235.dvaldes.out
Tue Jan 29 19:04:46 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.90                 Driver Version: 384.90                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN X (Pascal)    On   | 00000000:04:00.0 Off |                  N/A |
| 25%   45C    P8    17W / 250W |      2MiB / 12189MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

 

and we get the same results as in interactive mode.
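
The submit-and-follow steps can also be scripted. The sketch below extracts the job ID from the "Submitted batch job" line and builds a glob for the STDOUT file, since the node name in "slurm.%N.%J.%u.out" is not known until the job starts. Both function names are hypothetical, and the glob assumes the "-o" pattern used in the script above.

```shell
# Hypothetical helper: reads sbatch output on stdin, prints the job ID.
job_id_from_sbatch() {
    sed -n 's/^Submitted batch job \([0-9][0-9]*\).*$/\1/p'
}

# Hypothetical helper: glob matching "-o slurm.%N.%J.%u.out" for a job ID,
# with the node name and user name left as wildcards.
output_glob() {
    printf 'slurm.*.%s.*.out\n' "$1"
}

# Intended use on the login node (not run here):
#   job_id=$(sbatch gpu_maxwell2.sh | job_id_from_sbatch)
#   tail -f $(output_glob "$job_id")
```

The output file only appears once the job has started, so the tail command may need to be retried while the job is still queued.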