# Pipelines tool

## Introduction

To handle the multiple steps required for FinnGen analyses, from data preparation to final output, we created streamlined analysis pipelines. By combining validation, parallel processing, and output generation into one workflow, we can manage complex GWAS, fine-mapping, and genetic correlation analyses across thousands of endpoints. To make these analyses faster and easier to perform, we have made the workflows available to all users in the **Pipelines tool**.

Analyses submitted in the Pipelines tool run in Google Cloud rather than on your local virtual machine, which allows for larger and more parallelized analyses.

## Accessing the Pipelines tool

To open the Pipelines tool, open the sandbox "start menu" by clicking **Applications** (usually in the top-left corner of the screen). From that menu, click **Sandbox** to open a second menu and then click **Pipelines**. This will open the Pipelines page in the Firefox web browser.

<figure><img src="/files/Le08jybDhCCwi73EdXhA" alt="A screenshot showing the location of the Pipelines tool in the sandbox Applications menu"><figcaption></figcaption></figure>

## Navigating the Pipelines tool

On the Pipelines tool front page, you will see three different options:

* **FinnGen workflows**, where you can access and submit pre-written workflows for the most common analyses performed in the sandbox.
  * <mark style="color:green;">Green workflows</mark>: Also known as "unmodifiable" workflows. Here you can access a list of pre-built analysis workflows with no customization allowed and limited input options. On completion, results are automatically uploaded to the green library and the user-results PheWeb.
  * <mark style="color:red;">Red workflows</mark>: Also known as "modifiable" workflows. Here you can access a list of pre-built analysis workflows that allow you to also edit the workflow (.wdl file) itself. As a consequence, the results are not automatically made available in the green library.
  * <mark style="color:$info;">Edit draft</mark>: Here you can access draft workflows that you have saved from previous Pipelines sessions.
* **User-defined workflows**, where you can submit your own custom-designed workflows. This option is for more advanced use, as it requires you to know how to write code in [Workflow Description Language](https://openwdl.org/) (WDL).
  * <mark style="color:red;">Create your own workflow</mark>: This allows you to write your own WDL code and input options (.json file) for a fully customizable analysis that can be run in the cloud.
  * <mark style="color:$info;">Edit draft</mark>: Click here to access draft workflows that you have saved.
* **Submitted jobs**, where you can see the status of analyses you have submitted through the Pipelines tool.
  * <mark style="color:$primary;">Show pipelines jobs</mark>: See a list of all jobs that you have run or are running, regardless of their status. You can also change the filters so that you can see analyses submitted by other users within the same sandbox.
  * <mark style="color:$primary;">Running</mark>: See a list of your jobs that are currently running.
  * <mark style="color:$primary;">Succeeded</mark>: See a list of your jobs that have successfully completed.
  * <mark style="color:$primary;">Failed</mark>: See a list of your failed jobs.

<figure><img src="/files/HYRcDNcRb9BuAzZVvWIf" alt="A screenshot showing the different options available in the Pipelines tool"><figcaption></figcaption></figure>

## Using the Pipelines tool

We have a separate page dedicated to instructions on using the Pipelines tool to submit analyses and check their status. Please see [How to use the Pipelines tool](/working-in-the-sandbox/running-analyses-in-sandbox/pipelines-tool-instructions/how-to-use-the-pipelines-area.md).

## Pipelines are written in WDL and managed using Cromwell

In FinnGen, we have opted to write pipelines in [Workflow Description Language](https://openwdl.org/) (WDL), an open-source standard for "describing data processing workflows with a human-readable and writeable syntax" that is commonly used in bioinformatics. WDL separates scientific logic from infrastructure, allowing complex tasks to be parallelized and run both on HPC platforms and in cloud environments. To execute the WDL code, we use Cromwell, a workflow management system that handles the computational side of analyses, such as creating virtual machines in the cloud, logging workflows, and transferring data to and from cloud storage. See [Pipelines is based on Cromwell and WDL](/working-in-the-sandbox/running-analyses-in-sandbox/pipelines-tool-instructions/pipelines-is-based-on-cromwell-and-wdl.md) for more information about Cromwell.
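
To illustrate how WDL separates scientific logic from infrastructure, the following is a minimal sketch of a WDL 1.0 workflow. It is not one of the FinnGen workflows: the task name `count_lines`, the workflow name `count_lines_workflow`, the input names, and the Docker image are all hypothetical placeholders chosen for illustration, and the images permitted inside the sandbox may differ.

```wdl
version 1.0

# Hypothetical task: the "scientific logic" lives in the command section.
task count_lines {
  input {
    File input_file
  }

  command <<<
    wc -l < ~{input_file} > line_count.txt
  >>>

  output {
    Int line_count = read_int("line_count.txt")
  }

  # The "infrastructure" request: the execution engine (here Cromwell)
  # uses these values when provisioning the machine that runs the task.
  runtime {
    docker: "ubuntu:22.04"  # placeholder image, an assumption
    cpu: 1
    memory: "1 GB"
  }
}

workflow count_lines_workflow {
  input {
    Array[File] input_files
  }

  # scatter runs one independent copy of the task per input file,
  # which is how WDL expresses parallel execution.
  scatter (f in input_files) {
    call count_lines { input: input_file = f }
  }

  output {
    Array[Int] line_counts = count_lines.line_count
  }
}
```

In the accompanying inputs file (.json), each key is the workflow name followed by the input name, so this sketch would expect a key named `count_lines_workflow.input_files`.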

## Additional information

See the following resources for more about Pipelines in the FinnGen sandbox:

* [Sandbox training / Pipeline module](https://www.finngen.fi/en/members/recordings/finngen-sandbox-training-pipeline-module-19th-march-2019)
* [FinnGen Users' Meeting 9th February 2021](https://www.finngen.fi/en/members/recordings/finngen-data-users-meeting-9th-feb-2021)
* [FinnGen Users' Meeting 6th April 2021](https://www.finngen.fi/en/members/recordings/users-meeting-recording-6th-apr-2021)

