Pipelines tool
Information about the Sandbox Pipelines tool
Introduction
To handle the multiple steps required for FinnGen analyses, from data preparation to final output, we created streamlined analysis pipelines. By combining validation, parallel processing, and output generation into one workflow, we can easily manage complex GWAS, fine-mapping, and genetic correlation analyses across thousands of endpoints. We have made these workflows available to all users to make performing these analyses faster and easier. These analysis workflows are available in the Pipelines tool.
Analyses submitted through the Pipelines tool run in Google Cloud rather than on your local virtual machine, which allows for larger and more parallelized analyses.
Accessing the Pipelines tool
To open the Pipelines tool, open the sandbox "start menu" by clicking Applications (usually in the top-left corner of the screen). From that menu, click Sandbox to open a second menu, and then click Pipelines. This opens the Pipelines page in the Firefox web browser.

Navigating the Pipelines tool
On the Pipelines tool front page, you will see three different options:
FinnGen workflows, where you can access and submit pre-written workflows for the most common analyses performed in the sandbox.
  Green workflows: Also known as "unmodifiable" workflows. Here you can access a list of pre-built analysis workflows with no customization allowed and limited input options. On completion, results are automatically uploaded to the green library and the user-results PheWeb.
  Red workflows: Also known as "modifiable" workflows. Here you can access a list of pre-built analysis workflows that also allow you to edit the workflow (.wdl file) itself. As a consequence, the results are not automatically made available in the green library.
  Edit draft: Here you can access draft workflows that you have saved from previous Pipelines sessions.
User-defined workflows, where you can submit your own custom-designed workflows. This option is for more advanced use, as it requires you to know how to write code in Workflow Description Language (WDL).
  Create your own workflow: This allows you to write your own WDL code and input options (.json file) for a fully customizable analysis that can be run in the cloud.
  Edit draft: Click here to access draft workflows that you have saved.
Submitted jobs, where you can see the status of analyses you have submitted through the Pipelines tool.
  Show pipelines jobs: See a list of all jobs that you have run or are running, regardless of their status. You can also change the filters to see analyses submitted by other users within the same sandbox.
  Running: See a list of your jobs that are currently running.
  Succeeded: See a list of your jobs that have completed successfully.
  Failed: See a list of your jobs that have failed.

Using the Pipelines tool
We have a separate page dedicated to instructions on using the Pipelines tool to submit analyses and check their status. Please see How to use the Pipelines tool.
Pipelines are written in WDL and managed using Cromwell
In FinnGen, we have opted to write pipelines in Workflow Description Language (WDL), an open-source standard for "describing data processing workflows with a human-readable and writeable syntax" that is commonly used in bioinformatics. WDL separates scientific logic from infrastructure, allowing complex tasks to be parallelized and run both on HPC platforms and in cloud environments. To execute the WDL code, we use Cromwell, a workflow management system that handles the computational side of analyses, such as creating virtual machines in the cloud, logging workflows, and sending and receiving data to and from cloud storage. See Pipelines is based on Cromwell and WDL for more information about Cromwell.
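To give a feel for what a WDL workflow looks like, here is a minimal sketch. It is not one of the FinnGen workflows: the workflow name `count_lines`, the task name `line_count`, the input `input_files`, and the container image `ubuntu:22.04` are all hypothetical and chosen only for illustration. Whether a given image is available inside the sandbox is an assumption you would need to check. The sketch shows how a `scatter` block describes work that the workflow engine may run in parallel, and how the `runtime` section keeps infrastructure settings separate from the analysis logic.

```wdl
version 1.0

# Hypothetical example: all names and the container image are illustrative
# and are not taken from any FinnGen workflow.
workflow count_lines {
  input {
    Array[File] input_files
  }

  # scatter calls the task once per file; the engine (for example Cromwell)
  # is free to run these calls in parallel.
  scatter (f in input_files) {
    call line_count { input: infile = f }
  }

  output {
    # Collecting a scattered task output yields an array, one value per file.
    Array[Int] counts = line_count.n_lines
  }
}

task line_count {
  input {
    File infile
  }

  # The scientific logic: count the lines in one file.
  command <<<
    wc -l < ~{infile}
  >>>

  output {
    Int n_lines = read_int(stdout())
  }

  # The infrastructure side: container image and machine resources.
  runtime {
    docker: "ubuntu:22.04"   # assumed image, for illustration only
    cpu: 1
    memory: "1 GB"
  }
}
```

Inputs for a workflow like this are supplied in a separate .json file whose keys are the fully qualified input names, here `count_lines.input_files`, mapped to a list of file paths.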
Additional Information
See the following documentation about Pipelines in the FinnGen Sandbox: