Rucio

About Rucio

Rucio is a scientific data management system used in LHC particle physics and related scientific domains. It provides access for large volumes of data spread across facilities at multiple institutions.

If your workflow needs to access some researched data managed by a Rucio instance, you can use the Rucio authentication technique described below.

Dependencies

Currently, the Rucio authentication technique relies on the VOMS proxy authentication, meaning that you must set up your VO user certificates and proxy. Please see the VOMS proxy documentation page for more information.

Uploading secrets

In order to create the Rucio configuration for your workflow jobs, you will have to upload your VOMS proxy secrets as well as your Rucio username, for example:

$ reana-client secrets-add --env VONAME=atlas \
                           --env VOMSPROXY_FILE=x509up_u1000 \
                           --file /tmp/x509up_u1000 \
                           --env RUCIO_USERNAME=johndoe

The above configuration will automatically connect the user to a Rucio host based on the VONAME value. You can also override the default Rucio host detection and connect to any Rucio instance by providing two more environment variables:

$ reana-client secrets-add --env RUCIO_RUCIO_HOST=https://myrucio-server.example.org \
                           --env RUCIO_AUTH_HOST=https://myrucio-auth.example.org

Configuring your workflows

You are now ready to declare that some steps of your workflow need to access Rucio data. This can be achieved by setting the workflow hints voms_proxy and rucio for those steps. The examples below show how to specify this hint for your CWL, Serial, Snakemake and Yadage workflows.

CWL workflow example:

steps:
  first:
    hints:
      reana:
        voms_proxy: true
        rucio: true
    run: rucio get my_rucio_scope:my_rucio_file

Serial workflow example:

workflow:
  type: serial
  specification:
    steps:
      - environment: docker.io/reanahub/reana-auth-rucio:1.1.1
        voms_proxy: true
        rucio: true
        commands:
          - rucio get my_rucio_scope:my_rucio_file

Snakemake example:

rule mystep:
  container:
    "docker://docker.io/reanahub/reana-auth-rucio:1.1.1"
  resources:
    voms_proxy=True,
    rucio=True
  shell:
    "rucio get my_rucio_scope:my_rucio_file"

Yadage example:

step:
  process:
    process_type: "string-interpolated-cmd"
    cmd: 'rucio get my_rucio_scope:my_rucio_file'
  publisher:
    publisher_type: "frompar-pub"
    outputmap:
      outputfile: outputfile
  environment:
    environment_type: "docker-encapsulated"
    image: "docker.io/reanahub/reana-auth-rucio"
    imagetag: "1.1.1"
    resources:
      - voms_proxy: true
      - rucio: true

The voms_proxy and rucio workflow hints are fully sufficient to instruct REANA to set up the Rucio configuration for your jobs. You don't have to modify the logic of your workflow steps in any other way besides providing the above one-line workflow hint declarations.

Creating your job environment images

In the above examples, we have used the reana-auth-rucio:1.1.1 as an example of a job environment container image that can be used at runtime to access some Rucio-managed data files. When REANA will orchestrate the execution of this job, it will automatically create a sidecar container that will perform the necessary Rucio configuration beforehand, using the secrets you uploaded. The environment image of your choice must simply contain the rucio-clients package installed.