HPC High Performance Computing: 9.2. Running Docker containers on Slurm

Process to deploy an image.

Introduction

This section shows how to run a container from an existing image. The software used for this task, socker, is a wrapper around Docker that fits the container inside the resources allocated to the current Slurm job. By default, Slurm creates cgroups for its jobs and deletes them automatically.

The main features of the wrapper:

  • Enforce the execution of containers as non-root users.
  • Fit the container in the Linux Control Group partition assigned by the Slurm job. 
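Both points can be checked from inside a job. The sketch below is only an illustration and is not part of socker: it prints the numeric user ID and the control groups of the current process by reading /proc/self/cgroup, a standard Linux interface. The helper name show_identity is hypothetical, and the final socker call is only attempted where the module has been loaded.

```shell
#!/bin/bash
# show_identity: print the numeric UID and the cgroups of the current process.
# (Hypothetical helper name; /proc/self/cgroup is only available on Linux.)
show_identity() {
    echo "uid=$(id -u)"
    if [ -r /proc/self/cgroup ]; then
        cat /proc/self/cgroup
    fi
}

# On the compute node, outside any container.
show_identity

# Inside a container: only attempted when socker is on the PATH.
if command -v socker >/dev/null 2>&1; then
    socker run centos:latest id -u
fi
```

A UID other than 0 means the process is not running as root.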

Important information before you start:

  • This software is experimental; please report any issues that arise.
  • Batch and interactive sessions are supported.
  • X sessions are not possible.
  • You can't write outside of HOME or /datasets.
  • Containers and images are destroyed after every execution.
  • Images must be authorized by IT support; request access by opening a ticket.
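Because containers are destroyed after every execution and writing is limited to HOME and /datasets, any result worth keeping has to be saved in one of those locations. A minimal sketch, assuming a hypothetical helper save_date and a hypothetical directory name (socker-results) that is not a site convention:

```shell
#!/bin/bash
# save_date DIR: write the current date into DIR/date.txt, creating DIR if needed.
# (Hypothetical helper; the directory name used below is an assumption.)
save_date() {
    mkdir -p "$1" && date > "$1/date.txt"
}

# Keep the result under HOME so that it outlives the container.
save_date "${HOME}/socker-results"
```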

1. Use of socker command line

socker is installed as a module and must be loaded before use (remember that the module is only available on compute nodes):

module load socker

socker has many options for interacting with containers; the most important are:

    -v, --verbose
        run in verbose mode
    images
        List the authorized Docker images (found in socker-images)
    run IMAGE COMMAND
        start a container from IMAGE executing COMMAND as the user
    runti IMAGE SHELL
        start an interactive container from IMAGE executing SHELL as the user

1. List authorized images

nodeXXX $ socker images
debian:latest
centos:latest
ubuntu:latest
bvlc/caffe:gpu
bvlc/caffe:cpu
nvidia/cuda:latest
registry.sb.upf.edu/info/centos:7
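Since only authorized images can be started, a job script may want to check the list before running. The sketch below assumes the output format shown above (one image name per line); is_authorized is a hypothetical helper, not a socker subcommand.

```shell
#!/bin/bash
# is_authorized IMAGE: succeed if IMAGE matches a whole line read from stdin.
is_authorized() {
    grep -Fxq -- "$1"
}

image="centos:latest"

# Only query socker where the module has been loaded.
if command -v socker >/dev/null 2>&1; then
    if socker images | is_authorized "$image"; then
        echo "$image is authorized"
    else
        echo "$image is not authorized; open a ticket with IT support" >&2
    fi
fi
```

Matching whole lines (-x) with fixed strings (-F) avoids treating centos as a match for centos:latest.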

2. Batch mode

Run a CentOS container and print the system release:

$ socker run <docker-image> <command>
$ socker run centos:latest cat /etc/system-release

3. Interactive mode

$ socker runti <docker-image> <shell>
$ socker runti centos:latest bash

4. Enabling verbose mode for steps 2 or 3

$ socker -v run <docker-image> <command>
$ socker -v run centos:latest date

2. [Example] Prepare the environment

An example of a program that the container can run; in this case, a bash script that calls date will be executed on a compute node:

login01 $ vi my_script.sh 

#!/bin/bash
date
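The later examples invoke the script directly by path (./my_script.sh), which requires the execute bit to be set. As a sketch, the file can also be created without an editor and checked locally before it is used in a container:

```shell
# Create the script non-interactively (same content as above).
cat > my_script.sh <<'EOF'
#!/bin/bash
date
EOF

# Invoking a script by its path requires execute permission.
chmod +x my_script.sh

# Quick local check before using it in a container.
./my_script.sh
```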

3. [Example] batch task

Write a Slurm file in accordance with the needs of the job to deploy a CentOS container:

login01 $ vi Scripts/04-slurm_docker_date.sh 

#!/bin/bash

#SBATCH --job-name="date-container"
#SBATCH -n 1 # Number of tasks (one core per task by default)
#SBATCH -N 1 # Ensure that all tasks run on one machine
#SBATCH -t 0-01:00 # Runtime: 1 hour
#SBATCH -p high # Partition to submit to
#SBATCH -o logs/%x-%j.out # File to which STDOUT will be written
#SBATCH -e logs/%x-%j.err # File to which STDERR will be written

module load socker
socker run centos:latest "<path_to_my_file>/my_script.sh"
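The -o and -e paths in the script point into a logs directory. Slurm does not create missing output directories, so the directory has to exist before the job is submitted. A minimal sketch, to be run from the directory where the job will be submitted:

```shell
# Create the directory referenced by the #SBATCH -o and -e lines
# (mkdir -p does nothing if it already exists).
mkdir -p logs
```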

Submit the task as usual:

login01 $ sbatch Scripts/04-slurm_docker_date.sh

Read the log file with the results:

login01 $ cat logs/date-container-101902.*

Fri Oct 26 10:42:09 CEST 2018
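In the -o and -e patterns, %x expands to the job name and %j to the job ID, which is how the log file name above is formed. The sketch below rebuilds both names for a given job; log_paths is a hypothetical helper, and the job ID is the one from the example.

```shell
#!/bin/bash
# log_paths JOB_NAME JOB_ID: print the stdout and stderr log paths that
# result from the patterns logs/%x-%j.out and logs/%x-%j.err.
log_paths() {
    printf 'logs/%s-%s.out\n' "$1" "$2"
    printf 'logs/%s-%s.err\n' "$1" "$2"
}

log_paths date-container 101902
```

When submitting from a script, sbatch --parsable prints the job ID instead of the usual message, so it can be captured and passed to such a helper.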

4. [Example] interactive task

Request an interactive session to log into a compute node:

login01 $ interactive

Load the socker module:

nodeXXX $ module load socker

Two options:

1) Batch container in an interactive Slurm job:

nodeXXX $ socker run centos:latest "./my_script.sh"

Fri Oct 26 12:32:39 CEST 2018

2) Interactive session in an interactive Slurm job (please ignore the Warning! message):

nodeXXX $ socker runti centos:latest bash

I have no name!@a001ec57ed4a:~$ ./my_script.sh 
Fri Oct 26 12:38:34 CEST 2018
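The "I have no name!" prompt is what bash shows when the current UID has no entry in the user database. Here this is presumably because the image's /etc/passwd does not list the UID the container runs as (an assumption about the images, not something stated for socker). The sketch below checks only /etc/passwd, and has_passwd_entry is a hypothetical helper:

```shell
#!/bin/bash
# has_passwd_entry UID: succeed if /etc/passwd has a line whose third field is UID.
has_passwd_entry() {
    awk -F: -v uid="$1" '$3 == uid { found = 1 } END { exit !found }' /etc/passwd
}

if has_passwd_entry "$(id -u)"; then
    echo "UID $(id -u) has a name in /etc/passwd"
else
    echo "UID $(id -u) has no name in /etc/passwd"
fi
```

The missing name is cosmetic: file permissions are evaluated against the numeric UID, so the script still runs as shown above.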