# How to run genome-wide association studies (GWAS)

## How do I run my analysis?

For all GWAS analyses, the recommended approach is to use the **Unmodifiable pipelines** in the Pipelines app. This allows the results to be exported to the green library without an export request.

For most analyses, the unmodifiable pipeline should be sufficient. However, in some cases it is not the right approach. For example:

* You do NOT want your results in the green library
* You need to change more than the analysis type (binary/quantitative, additive/dominant/recessive), the phenotype, or the covariates
* Your use case requires modifying the pipeline itself, for example adding preprocessing or postprocessing, or passing custom data or arguments to the analysis software

If any of these apply, use the **Modifiable pipelines** instead. You can find them in the Pipelines app and run them either from there or from finngen-cli.
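The decision above can be sketched as a small helper. This is only an illustration of the rule described in this section: the `AnalysisNeeds` class, its field names, and `choose_pipeline` are hypothetical and are not part of the Pipelines app or finngen-cli.

```python
from dataclasses import dataclass


@dataclass
class AnalysisNeeds:
    """Hypothetical summary of what an analysis requires (illustrative only)."""
    results_in_green_library: bool = True       # results should land in the green library
    custom_pre_or_postprocessing: bool = False  # pipeline steps must be added or changed
    custom_software_arguments: bool = False     # custom data/arguments for the GWAS software


def choose_pipeline(needs: AnalysisNeeds) -> str:
    """Return 'modifiable' if any exception listed above applies, else 'unmodifiable'."""
    if (
        not needs.results_in_green_library
        or needs.custom_pre_or_postprocessing
        or needs.custom_software_arguments
    ):
        return "modifiable"
    return "unmodifiable"
```

Changing only the analysis type, phenotype, or covariates leaves every field at its default, so the helper returns `"unmodifiable"` in that case.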

## Which software to use?

There are several programs available in Sandbox for performing [GWAS](/background-reading/gwas-analysis.md) on your phenotype of interest, depending on your needs. Currently (from DF7 and DF9 onwards), the core and [custom](/working-in-the-sandbox/which-tools-are-available/untitled.md) GWAS analyses are performed using [REGENIE](/working-in-the-sandbox/running-analyses-in-sandbox/how-to-run-genome-wide-association-studies-gwas/how-to-run-gwas-using-regenie.md), and unless you specifically need some other tool, REGENIE is generally recommended.

Here is a flowchart to help you choose which software to use:

![](/files/W87Bsy8jOi2o0Bq2F8FT)

## Logistic or linear model?

It is crucial to your analysis results that you choose the correct model (logistic/linear) for your analysis. This depends on your phenotype: for *binary* phenotypes, use a *logistic* model, and for *continuous* phenotypes use a *linear* model.

![](/files/7YJYKYzNAi5Cjw8gjWC0)
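As a quick sanity check before launching an analysis, the rule above can be applied to the phenotype values themselves. The sketch below assumes binary phenotypes are coded as 0 (control) and 1 (case) and that missing values appear as `None` or NaN. The function name `choose_model` is hypothetical, and the coding convention should be checked against the tool you plan to use.

```python
import math
from typing import Iterable, Optional


def choose_model(values: Iterable[Optional[float]]) -> str:
    """Suggest 'logistic' for 0/1-coded phenotypes and 'linear' otherwise.

    Assumes binary traits are coded 0/1 and missing values are None or NaN.
    """
    # Keep only observed (non-missing) values.
    observed = {v for v in values if v is not None and not math.isnan(v)}
    if not observed:
        raise ValueError("phenotype has no non-missing values")
    # A phenotype whose observed values are all 0 or 1 is treated as binary.
    if observed <= {0, 1}:
        return "logistic"
    return "linear"
```

For example, `choose_model([0, 1, 1, None, 0])` suggests a logistic model, while `choose_model([172.5, 168.0, 181.2])` suggests a linear one. A continuous trait that happens to contain only the values 0 and 1 would be misclassified, so treat the result as a hint rather than a decision.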

All software in Sandbox that is designed to perform GWAS (REGENIE, SAIGE and plink2) can fit both logistic and linear models; the exception is GATE, which performs survival modelling. The way to define the type of your model or phenotype differs between programs. Please see the detailed instructions for [REGENIE](/working-in-the-sandbox/running-analyses-in-sandbox/how-to-run-genome-wide-association-studies-gwas/how-to-run-gwas-using-regenie.md#logistic-or-linear), [SAIGE](/working-in-the-sandbox/running-analyses-in-sandbox/how-to-run-genome-wide-association-studies-gwas/how-to-run-gwas-using-saige.md#logistic-or-linear) or [plink2](/working-in-the-sandbox/running-analyses-in-sandbox/how-to-run-genome-wide-association-studies-gwas/how-to-run-gwas-using-plink2-for-unrelated-individuals-only.md), depending on which one you plan to use.

**Note!** The easiest way to conduct a GWAS is to use the [custom GWAS tools](/working-in-the-sandbox/which-tools-are-available/untitled.md). From Sandbox v10.2 onwards, the Custom GWAS CLI is available for both [binary](/working-in-the-sandbox/which-tools-are-available/untitled/custom-gwas-command-line-cli-tool/custom-gwas-cli-binary-mode.md) and [quantitative](/working-in-the-sandbox/which-tools-are-available/untitled/custom-gwas-command-line-cli-tool/custom-gwas-cli-quantitative-mode.md) phenotypes, using the REGENIE pipeline. In addition to the additive model, recessive and dominant analyses are also available in the Custom GWAS CLI.

## REGENIE or SAIGE?

REGENIE and SAIGE both fit logistic and linear mixed models (meaning related individuals can be included in the analysis). For binary traits, both use the saddlepoint approximation (SPA) to keep test statistics calibrated when case-control ratios are unbalanced. In practice, REGENIE can be treated as an improved SAIGE, with two major advantages:

1\) it is *faster* than SAIGE and

2\) when working with binary traits, the Firth correction used in REGENIE provides *much* more reasonable effect-size estimates and standard errors when the minor allele count is low, compared to SAIGE.

For FinnGen releases 1-6, the core GWAS were performed using SAIGE. Therefore, if you want to run GWAS comparable to those releases for your phenotypes, please use [SAIGE](/working-in-the-sandbox/running-analyses-in-sandbox/how-to-run-genome-wide-association-studies-gwas/how-to-run-gwas-using-saige.md). Otherwise, it is recommended that you use [REGENIE](/working-in-the-sandbox/running-analyses-in-sandbox/how-to-run-genome-wide-association-studies-gwas/how-to-run-gwas-using-regenie.md).
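The recommendations stated in the text on this page can be summarised in a few lines. This is a simplified sketch and does not reproduce the full flowchart shown earlier. The function `choose_gwas_software` and its parameter names are hypothetical.

```python
def choose_gwas_software(
    survival_analysis: bool = False,
    match_releases_1_to_6: bool = False,
) -> str:
    """Return a suggested tool name based on the guidance on this page."""
    if survival_analysis:
        # GATE is the tool on this page that performs survival modelling.
        return "GATE"
    if match_releases_1_to_6:
        # Core GWAS for FinnGen releases 1-6 were run with SAIGE.
        return "SAIGE"
    # Default recommendation for logistic and linear GWAS.
    return "REGENIE"
```

Called with no arguments, the helper returns `"REGENIE"`, which matches the general recommendation.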

