Slurm

Slurm is a workload management system for high-performance computing (HPC) jobs. REANA supports it as a compute backend alongside Kubernetes, the primary job execution backend, and HTCondor.

Authentication

To use the CERN Slurm cluster for your REANA jobs, you first need access to the CERN Slurm Linux HPC resource. The CERN Linux HPC resources documentation explains how to request access.

The REANA cluster authenticates to the CERN Slurm compute backend using Kerberos. Please generate your Kerberos keytab and upload it to REANA as a secret before running your workflow.

Specifying compute backend

You can decide which steps of your workflow run on Kubernetes and which are dispatched to the CERN Slurm cluster by setting the compute_backend workflow hint for each step. Set the hint to slurmcern to dispatch a job to the CERN Slurm compute backend. See the Examples section below for how to do this in CWL, Serial, Snakemake, and Yadage workflows.
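
As a minimal sketch of mixing backends in one workflow, the Serial fragment below sends gendata to Slurm and keeps a second step on Kubernetes. The fitdata step, its command, and the image tag are illustrative assumptions modelled on the RooFit demo, and the value kubernetes for the hint is assumed to be the name of the Kubernetes backend:

```yaml
  ...
  steps:
    # Dispatched to the CERN Slurm compute backend.
    - name: gendata
      environment: 'docker.io/reanahub/reana-env-root6:6.18.04'
      compute_backend: slurmcern
      commands:
        - mkdir -p results && root -b -q 'code/gendata.C(${events},"${data}")'
    # Hypothetical second step kept on Kubernetes
    # (the value "kubernetes" is an assumption, see above).
    - name: fitdata
      environment: 'docker.io/reanahub/reana-env-root6:6.18.04'
      compute_backend: kubernetes
      commands:
        - root -b -q 'code/fitdata.C("${data}","${plot}")'
```

Because the hint is set per step, only the jobs that need HPC resources have to leave the Kubernetes cluster.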

Specifying environment

Slurm jobs run in a containerised compute environment based on Singularity container technology. There are three ways to specify the computing environment for your job:

  • If your workflow step uses a Docker container image, REANA automatically converts the image into a Singularity SIF image before submitting the job. The conversion requires no action on your part.
  • You can also specify your own Singularity SIF image. Upload it to your workspace before starting the workflow.
  • You can also use Singularity images from the CVMFS unpacked images area.

The Examples section below gives a concrete example of each of these techniques.

Specifying Slurm parameters

To select a specific Slurm partition or a specific Slurm time limit, set the additional workflow hints slurm_partition (default: inf-short) and slurm_time (default: 60 minutes). The available Slurm partitions are listed on the CERN Linux HPC resources documentation page. Please see the Examples section below for some concrete examples.
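
As a rough sketch, the Serial fragment below sets both hints explicitly to the default values quoted above. Placing them as step-level keys next to compute_backend is an assumption made by analogy with that hint, the image tag is illustrative, and slurm_time is written as a quoted number of minutes:

```yaml
  ...
  steps:
    - name: gendata
      environment: 'docker.io/reanahub/reana-env-root6:6.18.04'
      compute_backend: slurmcern
      # Assumed placement: step-level hints alongside compute_backend.
      slurm_partition: 'inf-short'  # default partition named in the text
      slurm_time: '60'              # time limit in minutes (the default)
      commands:
        - mkdir -p results && root -b -q 'code/gendata.C(${events},"${data}")'
```

Omitting both keys should give the same behaviour, since these are the documented defaults. Set them only when a job needs a different partition or a longer or shorter time limit.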

Examples

The following CWL workflow specification will dispatch the first gendata step of the RooFit demo example to the CERN Slurm compute backend, using a regular Docker image for the job environment:

  ...
  steps:
    gendata:
      hints:
        reana:
          compute_backend: slurmcern
      run: gendata.cwl
      in:
        gendata_tool: gendata_tool
        events: events
      out: [data]

The following Serial workflow specification will dispatch the first gendata step of the RooFit demo example to the CERN Slurm compute backend, using a custom Singularity image called myimage_1_0.sif from the workspace:

  ...
  inputs:
    files:
      ...
      - myimage_1_0.sif
  ...
  steps:
    - name: gendata
      environment: 'myimage_1_0.sif'
      compute_backend: slurmcern
      commands:
        - mkdir -p results && root -b -q 'code/gendata.C(${events},"${data}")'

The following Snakemake workflow specification will dispatch the first gendata step of the RooFit demo example to the CERN Slurm compute backend, using a particular Singularity image from the CVMFS unpacked images area:

  ...
  rule gendata:
      input:
          gendata_tool=config["gendata"]
      output:
          "results/data.root"
      params:
          events=config["events"]
      container:
          "/cvmfs/unpacked.cern.ch/registry.hub.docker.com/rootproject/root:6.26.00-ubuntu20.04"
      resources:
          compute_backend="slurmcern"
      shell:
          "mkdir -p results && root -b -q '{input.gendata_tool}({params.events},\"{output}\")'"

The following Yadage workflow specification will dispatch the first gendata step of the RooFit demo example to the CERN Slurm compute backend, using a custom Slurm partition called photon and a custom Slurm timeout of five minutes:

  ...
  stages:
    - name: gendata
      dependencies: [init]
      scheduler:
        ...
        step:
          ...
          environment:
            environment_type: 'docker-encapsulated'
            image: 'docker.io/reanahub/reana-env-root6'
            imagetag: '6.18.04'
            resources:
              - compute_backend: slurmcern
              - slurm_partition: 'photon'
              - slurm_time: '5'