# Running analyses in your IVM vs. Pipelines

Sandbox Pipelines are used for large-scale analysis and enable parallelization and custom-sized virtual machines. There are several ready-made pipeline templates (WDL code and an example input JSON) for common human genomics analyses on FinnGen data. These templates are divided into green pipelines and red pipelines. Green pipelines are recommended: they are curated by the FinnGen analysis team and automatically export summary-level data to the green library bucket.

### Overview

* It is recommended to use the ready-made <mark style="color:$success;">green workflows</mark> available on the Pipelines page.
* In green pipelines, the user cannot edit the WDL code but can change the input JSON. The results are automatically exported to the green library, and the user can download them without a separate download request.
* From the green library, green pipeline GWAS results are automatically loaded to the [PheWeb browser](https://results.finngen.fi/).
* The main reasons why a green pipeline fails:
  * The user's input data in the JSON (files, strings, etc.) does not match the format expected by the workflow code. You can check the expected format from a successful job.
  * The phenolist and phenofile do not match, i.e. the phenolist contains phenotypes that are not available in the phenofile.
* Users should be cautious when using <mark style="color:red;">red pipelines</mark>. These are templates for custom workflows, and the user can freely edit the WDL code and input JSON.
* The red pipeline templates are not updated regularly, and the user is expected to be able to edit and fix the WDL code and input data.
* The output data is not automatically exported and can be found inside the Sandbox in the /finngen/pipeline folder.
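
The two failure reasons listed above can often be caught before submitting a job. The following is a minimal Python sketch of such a pre-flight check, not an official FinnGen tool. It assumes that the phenolist is a plain-text file with one phenotype name per line and that the phenofile is tab-delimited with phenotype names in its header row; the function names, phenotype names, and column names are hypothetical and should be adapted to your own files.

```python
def missing_phenotypes(phenolist_text, phenofile_header_line, delimiter="\t"):
    """Return phenotypes named in the phenolist but absent from the phenofile header.

    Assumption: one phenotype per phenolist line, delimited phenofile header.
    """
    wanted = [line.strip() for line in phenolist_text.splitlines() if line.strip()]
    columns = set(phenofile_header_line.rstrip("\r\n").split(delimiter))
    return [name for name in wanted if name not in columns]


def input_mismatches(my_inputs, reference_inputs):
    """Compare an input JSON (as a dict) with one taken from a successful job.

    Reports keys that are missing, unexpected, or whose value type differs.
    A missing key is only a hint: the input may be optional in the workflow.
    """
    problems = []
    for key, ref_value in reference_inputs.items():
        if key not in my_inputs:
            problems.append(f"missing key: {key}")
        elif type(my_inputs[key]) is not type(ref_value):
            expected = type(ref_value).__name__
            got = type(my_inputs[key]).__name__
            problems.append(f"type differs for {key}: expected {expected}, got {got}")
    for key in my_inputs:
        if key not in reference_inputs:
            problems.append(f"unexpected key: {key}")
    return problems


if __name__ == "__main__":
    # Hypothetical example data; real files would be read from disk,
    # e.g. with open(...) and json.load(...).
    phenolist = "PHENO_A\nPHENO_B\n"
    header = "FINNGENID\tPHENO_A\tSEX\n"
    print(missing_phenotypes(phenolist, header))
```

An empty list from either function means the check found no obvious mismatch; it does not guarantee that the workflow will succeed.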

Reasons to use Pipelines for data processing instead of the Interactive Virtual Machine (IVM) with FinnGen data, and points to keep in mind:

* Ready-made analysis workflows are available.
* With the scatter function, a call (the call section in the WDL) can be run on as many VMs as are available on Google Cloud.
* The VM size can be customized in the runtime settings in the WDL.
* It is possible to submit multiple pipeline runs simultaneously.
* Workflows are encoded using the Workflow Description Language (WDL).
* It is possible to run calls in a parallel or serial manner.
* Tasks in each call are defined in the task section of the WDL.
* Pipeline jobs are always batch jobs, so the entire pipeline must be coded in a single workflow.
* Pipelines cannot be used interactively.

### Differences

Although pipeline and IVM usage differ in many ways, the **underlying commands** used for analysis **are essentially the same.**

In a pipeline, the commands are simply translated from the workflow language into bash or another programming language.

Localization of inputs to, and delocalization of outputs from, the VM called by the WDL is encoded with the special variable type `File`.

Details about the WDL language are available at:

[Terra support / WDL Documentation](https://support.terra.bio/hc/en-us/sections/360007274612-WDL-Documentation)

You can launch Pipelines from the Sandbox menu Applications > FinnGen > Pipelines or via the command line.

