Mixing Nextflow Executors for Hybrid Workflows

hpc, sdsc, nextflow

Author: Andrea Zonca

Published: November 15, 2025

This post demonstrates how to combine different Nextflow executors within a single workflow, a powerful pattern for optimizing resource usage in scientific computing. By running lightweight tasks locally and offloading computationally intensive processes to HPC clusters like Expanse (as discussed in my previous tutorial on Nextflow on Expanse), users can accelerate development and ensure scalability. This tutorial provides a minimal example of a Nextflow pipeline leveraging both local and Slurm executors.

1. Project Layout

expanse_nextflow/
├── greetings.csv
├── main_local_slurm.nf
├── nextflow.config
└── scripts/

greetings.csv is a CSV file whose first column holds greetings (a sample file appears after the list below). main_local_slurm.nf defines three processes:

  • sayHello writes each greeting to its own file (local executor).
  • convertToUpper converts each file to uppercase (Slurm executor).
  • collectGreetings concatenates the uppercase files (local executor).
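
For reference, a minimal greetings.csv could look like this (the greetings themselves are illustrative; only the first column is read):

    Hello
    Bonjour
    Holà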

2. Key Nextflow Concepts

  • Profiles (in nextflow.config) are global bundles of settings activated via -profile. They affect the entire run.
  • Executors describe how a process gets executed (local, slurm, awsbatch, …). You can override the executor per process with the executor directive inside the process block.
  • Process directives such as queue, cpus, memory, clusterOptions, and beforeScript can also be set per process.

This approach keeps everything inside the workflow file so individual processes specify exactly how they should run—no profile juggling required.
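
To make the contrast concrete, here is a minimal profile sketch for nextflow.config (the profile name mirrors the previous tutorial; the settings are illustrative):

// nextflow.config: activate with `nextflow run main.nf -profile slurm_debug`
profiles {
    slurm_debug {
        process.executor = 'slurm'   // every process in the run goes to Slurm
        process.queue    = 'debug'
    }
}

The per-process executor directive shown in the next section overrides any such global setting for that process alone.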

3. Walkthrough of main_local_slurm.nf

process sayHello {
    publishDir 'results', mode: 'copy'
    executor 'local'
    input:
        val greeting
    output:
        path "${greeting}-output.txt"
    script:
    """
    echo '$greeting' > '$greeting-output.txt'
    """
}

Runs directly on the login node; no Slurm queue involved.

process convertToUpper {
    publishDir 'results', mode: 'copy'
    container 'oras://ghcr.io/mkandes/ubuntu:22.04-amd64'
    executor 'slurm'
    queue 'debug'
    time '00:10:00'
    cpus 2
    memory '4 GB'
    clusterOptions '--account=sds166 --nodes=1 --ntasks=1 -o logs/%x-%j.out -e logs/%x-%j.err'
    scratch true
    beforeScript '''
      set -euo pipefail
      export SCRATCH_DIR="/scratch/$USER/job_${SLURM_JOB_ID}"
      mkdir -p "$SCRATCH_DIR/tmp"
      export TMPDIR="$SCRATCH_DIR/tmp"
      export NXF_OPTS="-Djava.io.tmpdir=$TMPDIR"
      echo "[NF] Using node-local scratch at: $SCRATCH_DIR"
    '''
    input:
        path input_file
    output:
        path "UPPER-${input_file}"
    script:
    """
    tr '[a-z]' '[A-Z]' < '$input_file' > 'UPPER-${input_file}'
    """
}

Runs on Slurm with the same queue, time limit, resources, and scratch preparation that the slurm_debug profile provided in the previous tutorial, but scoped to this one process.
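
While convertToUpper tasks are in flight, you can watch them from a second terminal with a standard Slurm query (the output format string is just one choice among many):

    squeue -u $USER -o "%.10i %.9P %.20j %.2t %.10M %R"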

process collectGreetings {
    publishDir 'results', mode: 'copy'
    executor 'local'
    input:
        path input_files
        val batch_name
    output:
        path "COLLECTED-${batch_name}-output.txt" , emit: outfile
        val count_greetings , emit: count
    script:
        count_greetings = input_files.size()
    """
    cat ${input_files} > 'COLLECTED-${batch_name}-output.txt'
    """
}

Back to the local executor for the final aggregation and summary.
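
For completeness, these three processes need a workflow block to wire them together. A minimal sketch follows; the channel names and the hard-coded batch label are assumptions, so the original file may differ in detail:

workflow {
    // one greeting per row of greetings.csv, first column only
    greeting_ch = Channel.fromPath('greetings.csv')
        .splitCsv()
        .map { row -> row[0] }

    sayHello(greeting_ch)
    convertToUpper(sayHello.out)

    // gather every uppercase file into a single final task
    collectGreetings(convertToUpper.out.collect(), 'batch')
    collectGreetings.out.count
        .view { "There were $it greetings in this batch" }
}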

4. Running the Workflow

  1. Place greetings.csv in the project root (first column only is fine).

  2. Launch the workflow. Note that there is no need to use a profile (e.g., -profile slurm_debug) because the executor for each process is configured separately within the workflow itself. This approach is particularly useful for orchestrating jobs across different execution environments, such as mixing AWS cloud resources with an HPC cluster.

    nextflow run main_local_slurm.nf -ansi-log false
  3. You should observe output similar to this:

    N E X T F L O W  ~  version 25.10.0
    Launching `main_local_slurm.nf` [disturbed_babbage] DSL2 - revision: 719cae3431
    [81/795962] Submitted process > sayHello (2)
    [1a/95d170] Submitted process > sayHello (1)
    [82/03b4e9] Submitted process > sayHello (3)
    [b2/9ef9d9] Submitted process > convertToUpper (1)
    [be/37f3e7] Submitted process > convertToUpper (2)
    [58/871526] Submitted process > convertToUpper (3)
    [56/d2737f] Submitted process > collectGreetings
    There were 3 greetings in this batch
  4. To specifically observe which processes are submitted to Slurm and which run locally, you can inspect the .nextflow.log file. For instance, using grep -i submitted .nextflow.log will show:

    grep -i submitted .nextflow.log
    Nov-15 10:09:05.950 [Task submitter] INFO  nextflow.Session - [81/795962] Submitted process > sayHello (2)
    Nov-15 10:09:05.993 [Task submitter] INFO  nextflow.Session - [1a/95d170] Submitted process > sayHello (1)
    Nov-15 10:09:06.133 [Task submitter] INFO  nextflow.Session - [82/03b4e9] Submitted process > sayHello (3)
    Nov-15 10:09:06.393 [Task submitter] DEBUG nextflow.executor.GridTaskHandler - [SLURM] submitted process convertToUpper (1) > jobId: 44200682; workDir: /home/zonca/p/software/expanse_nextflow/work/b2/9ef9d9a6d062048fd8915225ca2609
    Nov-15 10:09:06.393 [Task submitter] INFO  nextflow.Session - [b2/9ef9d9] Submitted process > convertToUpper (1)
    Nov-15 10:09:06.660 [Task submitter] DEBUG nextflow.executor.GridTaskHandler - [SLURM] submitted process convertToUpper (2) > jobId: 44200683; workDir: /home/zonca/p/software/expanse_nextflow/work/be/37f3e78a4e3c27f4f63b71da31aa2f
    Nov-15 10:09:06.660 [Task submitter] INFO  nextflow.Session - [be/37f3e7] Submitted process > convertToUpper (2)
    Nov-15 10:09:06.759 [Task submitter] DEBUG nextflow.executor.GridTaskHandler - [SLURM] submitted process convertToUpper (3) > jobId: 44200684; workDir: /home/zonca/p/software/expanse_nextflow/work/58/871526bef14c129bae9f85c07edd15
    Nov-15 10:09:06.759 [Task submitter] INFO  nextflow.Session - [58/871526] Submitted process > convertToUpper (3)
    Nov-15 10:11:20.782 [Task submitter] INFO  nextflow.Session - [56/d2737f] Submitted process > collectGreetings
    Nov-15 10:11:25.676 [main] DEBUG n.trace.WorkflowStatsObserver - Workflow completed > WorkflowStats[succeededCount=7; failedCount=0; ignoredCount=0; cachedCount=0; pendingCount=0; submittedCount=0; runningCount=0; retriesCount=0; abortedCount=0; succeedDuration=1m 10s; failedDuration=0ms; cachedDuration=0ms;loadCpus=0; loadMemory=0; peakRunning=2; peakCpus=4; peakMemory=8 GB; ]

    This log shows that sayHello and collectGreetings appear only as plain Submitted process INFO entries (they ran on the local executor), while each convertToUpper task also produces a [SLURM] submitted process DEBUG entry carrying its Slurm job ID and work directory.

  5. Finally, observe the behavior:

    • sayHello and collectGreetings execute on the local executor (e.g., login node).
    • convertToUpper submits to Slurm (executor > slurm) with the queue, resources, and clusterOptions defined directly in its process block. You can check logs/<process>-<jobid>.{out,err} for scheduler output; a quick accounting check is sketched after this list.
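
As a post-run accounting check, you can ask Slurm how each convertToUpper job fared (the job ID below is taken from the sample log above; substitute your own):

    sacct -j 44200682 -o JobID,JobName,Partition,State,Elapsed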

5. Tips for More Complex Pipelines

  • Group processes by executor: Keep the per-process directives close to the code that depends on them. Readers immediately understand where work runs.
  • Extract shared directives: If multiple processes share the same Slurm settings, group them under a withName: selector inside the process { } scope of nextflow.config (see the sketch after this list), or create small helper functions in the script to set environment variables, keeping things DRY without new profiles.
  • Respect login-node limits: Only keep very quick, lightweight tasks on executor 'local'. Anything CPU-intensive should go to the scheduler.
  • Container parity: Use containers (Singularity/Apptainer or Docker) so local and Slurm executions rely on the same software stack.
  • Logging: Route clusterOptions output to a logs/ folder for easier debugging, as done above.
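
As a sketch of the shared-directives tip, in nextflow.config (the second process name is hypothetical; the account matches the example above):

process {
    withName: 'convertToUpper|anotherHeavyStep' {
        executor       = 'slurm'
        queue          = 'debug'
        clusterOptions = '--account=sds166 --nodes=1 --ntasks=1'
    }
}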

6. Troubleshooting Checklist

  • Processes stuck on Slurm – verify that the requested queue, time, and resources fit the partition's limits; adjust time or cpus if the scheduler rejects the jobs or leaves them pending.
  • Scratch errors locally – ensure you only set scratch true when Slurm allocates node-local storage. Local processes should skip those directives.
  • Environment differences – leverage beforeScript to load modules or set $PATH consistently before each Slurm job starts (a minimal example follows this list).
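
As an example of the last point, here is a minimal Slurm-bound process that prepares its environment before running (the module name is site-specific and illustrative):

process checkToolchain {
    executor 'slurm'
    queue 'debug'
    // executed before the task script, inside the Slurm job
    beforeScript 'module load gcc'

    output:
        stdout

    script:
    """
    gcc --version
    """
}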

With these practices you can confidently mix execution backends, keeping orchestration simple while still tapping into the scheduler for the heavy lifting.