Using GPU within Nextflow

Luca Cozzuto
3 min read · Oct 4, 2019


Nextflow is a powerful workflow manager for scientific computational pipelines (https://www.nextflow.io/), written in the Groovy language. It is based on the concepts of dataflow, in which single blocks of code, aka “Processes”, are linked together with “Channels”, i.e. queues of data. In brief, you can consider a whole pipeline as a directed acyclic graph (DAG) where processes are the nodes and channels are the edges:

Example of a DAG representing a pipeline execution

Nextflow allows the transparent use of Linux containers, such as Singularity and Docker, to achieve pipeline reproducibility in different environments. It also allows a clear separation between the resources needed/requested and the command execution. You can define a number of configurations with predefined resources in a separate file called nextflow.config and assign them to given processes with a directive.

The name of each configuration is defined with the withLabel selector. In the example below we have:

  • the name of the container for every process
  • the resources needed for the processes marked with the “with_cpus” label.
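A minimal sketch of such a nextflow.config (the container name and resource values here are illustrative, not from the original post):

```groovy
// nextflow.config — container name and resources are illustrative
process {
    // container used by every process
    container = 'myrepo/mytool:latest'

    // resources for processes carrying the "with_cpus" label
    withLabel: with_cpus {
        cpus = 4
        memory = '8 GB'
    }
}
```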

This configuration can be assigned to one or more processes defined in the Nextflow pipeline by using the label directive. Here is an example of a minimal process in which we define:

  • the input channel
  • a block code to be executed
  • the expected output
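A sketch of such a minimal process (the process name, file names, and command are illustrative):

```groovy
// A minimal process: input channel, code block, expected output
process runTool {
    label 'with_cpus'

    input:
    path input_file

    output:
    path 'results.txt'

    script:
    """
    mytool ${input_file} > results.txt
    """
}
```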

If you want to make a process able to run with either CPUs or GPUs, you can specify a parameter whose presence assigns a different label to that process. If you want to know more about GPUs and containers, @lpryszcz wrote a nice post about it here. Here is the implementation:
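A sketch of that idea, assuming a params.GPU flag that matches the launch commands used in this post:

```groovy
// Default to CPU; launching with --GPU ON switches the label
params.GPU = "OFF"

process runTool {
    label (params.GPU == "ON" ? 'with_gpus' : 'with_cpus')
    ...
}
```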

The label for GPU computing must then account for the different container engines used. Both Docker and Singularity can access the host GPU drivers, so you don’t need to pack them within the image, but you do need to specify some custom parameters:

  • For Singularity: --nv
  • For Docker: --gpus all
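As a side note, these flags can also be set globally for the chosen engine via the standard docker and singularity scopes in nextflow.config (this applies the flag to every process, not just GPU-labeled ones):

```groovy
// nextflow.config — pass the GPU flag to whichever engine is used
docker.runOptions      = '--gpus all'
singularity.runOptions = '--nv'
```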

Nextflow can be launched indicating the container engine to be used at runtime in the following way:

nextflow run main.nf -with-docker

or

nextflow run main.nf -with-singularity

If your Docker image is on Docker Hub, it will be automatically downloaded for you on each node that will be used. If you choose Singularity, Nextflow will convert the Docker image into a Singularity one and store it locally.

At this point we want to capture information about the container engine in use and choose the parameter accordingly. This can be done using the workflow property “containerEngine”.

Here is an example of the “with_gpus” label that takes into account the parameters for the different engines:
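A sketch of that label block in nextflow.config (resource values are illustrative; workflow.containerEngine and the containerOptions directive are standard Nextflow features):

```groovy
// nextflow.config — resources for GPU-labeled processes
process {
    withLabel: with_gpus {
        cpus = 2
        memory = '8 GB'
        // choose the GPU flag matching the engine selected at runtime
        containerOptions = { workflow.containerEngine == "singularity" ? '--nv' : '--gpus all' }
    }
}
```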

Finally, you can launch your pipeline in this way:

nextflow run main.nf --GPU ON -with-docker

or

nextflow run main.nf --GPU ON -with-singularity

I hope this can be useful to people who are struggling with adapting Nextflow pipelines to GPU use!


Written by Luca Cozzuto

Biotechnologist and bioinformatician. Spain, Italy.