Kubernetes

REANA supports Kubernetes as a primary and default job execution backend alongside HTCondor and Slurm.

If step does not contain compute_backend specification, it will be executed on the default backend.

  # Serial example
  ...
  steps:
    - name: reana_demo_helloworld_htcondorcern
      environment: 'docker.io/library/python:2.7-slim'
      compute_backend: kubernetes
      commands:
        - python "${helloworld}"

Custom memory limit

When jobs exceed the cluster memory limits, you will get out of memory (OOM) error and job gets OOMKilled. To avoid OOM error, you can increase the memory limits per workflow step to make an efficient use of the available resources. Note that you can also decrease the memory limit to increment the chances of jobs being scheduled earlier if the cluster is busy.

To set the memory limit of a job you can specify kubernetes_memory_limit in the specification of each workflow step.

Read more about the expected memory values on Kubernetes official documentation

You can configure the steps in the respective specifications for Serial, Yadage, CWL and Snakemake workflow engines.

For Serial, you can set kubernetes_memory_limit in every step of workflow specification:

  ...
  steps:
    - name: reana_demo_helloworld_memory_limit
      environment: 'docker.io/library/python:2.7-slim'
      compute_backend: kubernetes
      kubernetes_memory_limit: '8Gi'
      commands:
        - python helloworld.py

For Yadage, you can set kubernetes_memory_limit in every step under environment.resources:

  ...
  stages:
    - name: reana_demo_helloworld_memory_limit
      dependencies: [init]
      scheduler:
        scheduler_type: 'singlestep-stage'
        parameters:
          helloworld: {step: init, output: helloworld}
        step:
          process:
            process_type: 'string-interpolated-cmd'
            cmd: 'python "{helloworld}"'
          environment:
            environment_type: 'docker-encapsulated'
            image: 'docker.io/library/python'
            imagetag: '2.7-slim'
            resources:
              - compute_backend: kubernetes
              - kubernetes_memory_limit: '8Gi'

For CWL, you can set kubernetes_memory_limit in every step under hints.reana:

  ...
  steps:
    first:
      hints:
        reana:
          compute_backend: kubernetes
          kubernetes_memory_limit: '8Gi'
      run: helloworld_memory_limit.tool
      in:
        helloworld: helloworld_memory_limit
      out: [result]

For Snakemake, you can set kubernetes_memory_limit in every rule under resources:

  ...
  rule helloworld:
    input:
        helloworld=config["helloworld"],
        inputfile=config["inputfile"],
    params:
        sleeptime=config["sleeptime"]
    output:
        "results/greetings.txt"
    resources:
        compute_backend="kubernetes",
        kubernetes_memory_limit="8Gi"
    container:
        "docker://docker.io/library/python:2.7-slim"
  ...

Custom job timeouts

When a job exceeds the specified time limit, it will be terminated by Kubernetes and marked as failed.

To set the job timeout, you can declare kubernetes_job_timeout in the specification of each workflow step. Time is measured in seconds.

Read more about the job run time deadline limits in the Kubernetes official documentation.

You can configure the steps in the respective specifications for Serial, Yadage, CWL and Snakemake workflow engines.

For Serial, you can set kubernetes_job_timeout in every step of workflow specification:

  ...
  steps:
    - name: reana_demo_helloworld_job_timeout
      environment: 'docker.io/library/python:2.7-slim'
      compute_backend: kubernetes
      kubernetes_job_timeout: 60
      commands:
        - python helloworld.py

For Yadage, you can set kubernetes_job_timeout in every step under environment.resources:

  ...
  stages:
    - name: reana_demo_helloworld_job_timeout
      dependencies: [init]
      scheduler:
        scheduler_type: 'singlestep-stage'
        parameters:
          helloworld: {step: init, output: helloworld}
        step:
          process:
            process_type: 'string-interpolated-cmd'
            cmd: 'python "{helloworld}"'
          environment:
            environment_type: 'docker-encapsulated'
            image: 'docker.io/library/python'
            imagetag: '2.7-slim'
            resources:
              - compute_backend: kubernetes
              - kubernetes_job_timeout: 60

For CWL, you can set kubernetes_job_timeout in every step under hints.reana:

  ...
  steps:
    first:
      hints:
        reana:
          compute_backend: kubernetes
          kubernetes_job_timeout: 60
      run: helloworld_job_timeout.tool
      in:
        helloworld: helloworld_job_timeout
      out: [result]

For Snakemake, you can set kubernetes_job_timeout in every rule under resources:

  ...
  rule helloworld:
    input:
        helloworld=config["helloworld"],
        inputfile=config["inputfile"],
    params:
        sleeptime=config["sleeptime"]
    output:
        "results/greetings.txt"
    resources:
        compute_backend="kubernetes",
        kubernetes_job_timeout=60
    container:
        "docker://docker.io/library/python:2.7-slim"
  ...