# How to run GWAS using REGENIE survival models

## What?

Pipeline for running survival GWAS using REGENIE.

## Introduction

The REGENIE survival pipeline performs single-variant association tests (GWAS) for **time-to-event endpoints** (i.e. a survival model). It is similar to the regular REGENIE pipeline, except that it tests the association of each variant with the *time to event* instead of with *having the disease*.

In addition to a binary variable (0/1) stating the ***case-control status***, you'll need a column in your phenotype-covariate file giving the ***survival time***. **Note:** this information is required for *both* cases and controls. For cases it is the time until the event, and for controls you can set it as the time until death or end of follow-up (both age at death and end-of-follow-up information can be found in each release's phenotype-covariate file).

**Note:** You need to name the 2 columns (case/control status and survival time) `[pheno]` and `[pheno]_survTime` for the pipeline to recognize them as such. (For example: `Statin_MI` and `Statin_MI_survTime`.)

**Note:** Survival time must be >0 for all individuals.

**Note:** Age at end of follow-up or death (AGE\_AT\_DEATH\_OR\_END\_OF\_FOLLOWUP) should **NOT** be used as a covariate in survival models. The age at end of follow-up for controls is already modelled inherently in survival analysis. If you are interested in modelling survival after a certain event, say retinopathy after a Type 2 diabetes (T2D) diagnosis, you could add age at T2D diagnosis as a covariate to control for differences in onset age between individuals.
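The column requirements above can be illustrated with a minimal Python sketch. It assumes that you already know, for each individual, the age at which follow-up starts, the age at the event (if any) and the age at death or end of follow-up. The function name `add_survival_columns` and its arguments are hypothetical and not part of the pipeline; only the `[pheno]` and `[pheno]_survTime` naming comes from this page.

```python
def add_survival_columns(row, pheno, start_age, event_age, end_age):
    """Return a copy of `row` with `[pheno]` and `[pheno]_survTime` added.

    start_age: age when follow-up starts (e.g. first statin purchase)
    event_age: age at the event, or None if the event never occurred
    end_age:   age at death or end of follow-up
    (All three arguments are illustrative assumptions.)
    """
    is_case = event_age is not None
    # Cases are followed until the event, controls until death/end of follow-up.
    stop_age = event_age if is_case else end_age
    surv_time = stop_age - start_age
    if surv_time <= 0:
        raise ValueError(f"survival time must be > 0, got {surv_time}")
    out = dict(row)
    out[pheno] = 1 if is_case else 0
    out[f"{pheno}_survTime"] = surv_time
    return out


case = add_survival_columns({"FID": "FG1", "IID": "FG1"}, "Statin_MI", 50.0, 62.5, 80.0)
control = add_survival_columns({"FID": "FG2", "IID": "FG2"}, "Statin_MI", 55.0, None, 71.0)
print(case["Statin_MI"], case["Statin_MI_survTime"])        # 1 12.5
print(control["Statin_MI"], control["Statin_MI_survTime"])  # 0 16.0
```

Raising an error for non-positive survival times is one possible design choice; you may instead prefer to exclude such individuals before running the pipeline.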

## Example files for the REGENIE survival pipeline

Example files for running the REGENIE survival pipeline in the Sandbox can be found at: `/finngen/library-green/scripts/regenie_surv/`. The files you need from there are:

* `.wdl` file: `regenie_survival.wdl`
* sub-`.wdl` file: `regenie_survival_sub.zip`
* example input (`.json`) file: `regenie_survival_example.json`

This example `.json` file is for running REGENIE on survival time from the first purchase of statins until the I9\_MI\_STRICT endpoint in DF13.

The phenotype-covariate file, phenotype list file and genotype file list needed to run the example (DF6\_example) can be found at: `/finngen/shared/regenie_survival_example_files/20250916_064037/files/sruotsal/regenie_survival_examples/`:

* `Statin_MI_example_phenotype_regenie.txt.gz`: example **phenotype-covariate** file
* `Statin_MI_example_phenotypelist.txt`: example **phenotype list** file

To add covariates to your own phenotype file, see the instructions [here](/working-in-the-sandbox/running-analyses-in-sandbox/how-to-run-genome-wide-association-studies-gwas/adding-new-covariates-in-gwas-using-regenie-and-saige.md). Note that the phenotype-covariate file requires two FINNGENID columns, named FID and IID, as in the regular REGENIE pipeline.
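Before submitting, it can help to check your own phenotype-covariate file against the requirements on this page (FID and IID columns, `[pheno]` and `[pheno]_survTime` columns, survival time >0). The following is a minimal sketch, not part of the pipeline. It assumes a tab-delimited file with `NA` as the missing-value code, and it assumes that rows where both columns are missing are individuals left out of that phenotype. The function name `check_pheno_file` is hypothetical.

```python
import csv
import gzip


def check_pheno_file(path, pheno, delimiter="\t", missing="NA"):
    """Return a list of problems found in a gzipped phenotype-covariate file.

    The delimiter and missing-value code are assumptions; adjust as needed.
    """
    status_col, time_col = pheno, f"{pheno}_survTime"
    problems = []
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        header = reader.fieldnames or []
        absent = [c for c in ("FID", "IID", status_col, time_col) if c not in header]
        if absent:
            return [f"missing column(s): {', '.join(absent)}"]
        # Data rows start on line 2, after the header.
        for line_no, row in enumerate(reader, start=2):
            status, time = row[status_col], row[time_col]
            if status == missing and time == missing:
                continue  # assumed: individual not included for this phenotype
            if status not in ("0", "1"):
                problems.append(f"line {line_no}: status {status!r} is not 0/1")
            try:
                if float(time) <= 0:
                    problems.append(f"line {line_no}: survival time {time} is not > 0")
            except (TypeError, ValueError):
                problems.append(f"line {line_no}: survival time {time!r} is not numeric")
    return problems
```

For example, `check_pheno_file("my_phenotype_regenie.txt.gz", "Statin_MI")` (a hypothetical file name) returns an empty list when no problems are found.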

## Prepare your files for REGENIE survival pipeline

Inputs for the REGENIE survival pipeline are very much the same as for the regular REGENIE pipeline, except that you don't define whether your phenotype is binary, since this pipeline is for time-to-event phenotypes only. Once you have downloaded the example files, you need to edit the `regenie_survival_example.json` input file. The parts you may need to edit are:

* `regenie_survival.phenolist`: the path to a `.txt` file listing the phenotype(s) to run, one per row. For example, for a single phenotype:\
  `LIBRARY_SHARED/regenie_survival_example_files/20250916_064037/files/sruotsal/regenie_survival_examples/Statin_MI_example_phenotypelist.txt`

```
Statin_MI
```

* `regenie_survival.cov_pheno`: the path to a phenotype-covariate file in gzipped `.txt` format. Example:\
  `LIBRARY_SHARED/regenie_survival_example_files/20250916_064037/files/sruotsal/regenie_survival_examples/Statin_MI_example_phenotype_regenie.txt.gz`
  * **Remember**, you need to name the 2 columns (case/control status and survival time) `[pheno]` and `[pheno]_survTime` for the pipeline to recognize them - in this example, they should read `Statin_MI` and `Statin_MI_survTime`.
* `regenie_survival.covariates`: a list of covariates to be used in the model, separated by `,`.
* `regenie_survival.sub_step1.step1.grm_bed`: the path to the `.bed` file from the GRM file. Edit it according to the release you plan to use. In this example, R13 is used. For your own analyses, we strongly recommend using the most recent (and updated) data release.
* `regenie_survival.sub_step2.bgenlist`: the path to a `.txt` file listing the `.bgen` (8-bit) genotype files. Edit it according to your release (see above). In this example, R13 is used:\
  `LIBRARY_RED/red/finngen_R13/bgen_1.0/finngen_R13_bgen_list.txt`.
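If you prefer to edit the input file programmatically rather than by hand, the keys listed above can be overridden with a small script. This is a minimal sketch using only the Python standard library. The function name `customise_inputs` and the paths and covariate names in the usage comment are hypothetical; only the JSON key names come from this page.

```python
import json


def customise_inputs(example_json, out_json, overrides):
    """Copy the example input JSON, replacing only the keys in `overrides`."""
    with open(example_json) as handle:
        inputs = json.load(handle)
    # Refuse keys that are absent from the example, to catch typos early.
    unknown = sorted(set(overrides) - set(inputs))
    if unknown:
        raise KeyError(f"not in the example JSON: {unknown}")
    inputs.update(overrides)
    with open(out_json, "w") as handle:
        json.dump(inputs, handle, indent=2)
    return inputs


# Hypothetical paths and covariate names, replace with your own:
# customise_inputs(
#     "regenie_survival_example.json",
#     "my_regenie_survival.json",
#     {
#         "regenie_survival.phenolist": "/path/to/my_phenotypelist.txt",
#         "regenie_survival.cov_pheno": "/path/to/my_phenotype_regenie.txt.gz",
#         "regenie_survival.covariates": "SEX_IMPUTED,PC1,PC2",
#     },
# )
```

Writing to a new file leaves the original example untouched, so you can always compare your inputs against it.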

## Submit your REGENIE survival job

### Using Pipelines

See [How to use the Pipelines area](/working-in-the-sandbox/running-analyses-in-sandbox/pipelines-tool-instructions/how-to-use-the-pipelines-area.md) for how to submit your job. Note especially that the pipeline input files are run from the `/finngen/red/` folder.

If you need further information on the pipeline/job system, see section [Pipelines is based on Cromwell and WDL](/working-in-the-sandbox/running-analyses-in-sandbox/pipelines-tool-instructions/pipelines-is-based-on-cromwell-and-wdl.md).

Once your `.json` file is ready, you can submit your REGENIE survival run via the command:

```
finngen-cli request-workflow --wdl /path/to/regenie_survival.wdl \
    --input /path/to/your_regenie_survival.json \
    --dependencies /path/to/regenie_survival_sub.zip
```

After submitting your job successfully, go to `Applications`->`Sandbox`->`Pipelines` to track your job, and **remember to save your job's workflow ID** for tracking and checking the results when your run has finished.

## Submit your job using modifiable workflows

You can also submit your REGENIE survival job using modifiable workflows.

To do that, in the Sandbox go to: Applications -> Sandbox -> Pipelines -> Modifiable workflow -> Regenie survival DF13 -> Create

Edit `Input JSON` at the bottom of the page according to your files -> `Submit`

## Output

Once your job displays the `Succeeded` state, you can find the results, as for the regular REGENIE pipeline, in `/finngen/pipeline/cromwell/workflows/regenie_survival/[WORKFLOW_ID]/`.

You can, for example, find the:

* **summary statistics** (`.gz` and `.regenie.gz`) in `/finngen/pipeline/cromwell/workflows/regenie_survival/[WORKFLOW_ID]/call-sub_step2*/shard*/sub.regenie_step2/*/call-gather/shard*/glob*/*gz` (if you have multiple phenotypes, the results for each phenotype go into their own sub-folders \[`shard-#`]).
* **Manhattan and QQ plots** in `/finngen/pipeline/cromwell/workflows/regenie_survival/[WORKFLOW_ID]/call-sub_step2*/shard*/sub.regenie_step2/*/call-gather/shard*/glob*/*png`.
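The wildcard paths above can also be expanded from Python. The sketch below simply applies the same patterns with the standard-library `glob` module. The function name `find_outputs` is hypothetical, and the workflow ID must be the one you saved when submitting.

```python
import glob
import os

RESULT_ROOT = "/finngen/pipeline/cromwell/workflows/regenie_survival"
# Same wildcard pattern as in the paths listed on this page.
GATHER = os.path.join("call-sub_step2*", "shard*", "sub.regenie_step2", "*",
                      "call-gather", "shard*", "glob*")


def find_outputs(workflow_id, root=RESULT_ROOT):
    """Return summary-statistics and plot paths for one finished workflow."""
    base = os.path.join(root, workflow_id, GATHER)
    return {
        "sumstats": sorted(glob.glob(os.path.join(base, "*gz"))),
        "plots": sorted(glob.glob(os.path.join(base, "*png"))),
    }
```

Calling `find_outputs("[WORKFLOW_ID]")` with your own workflow ID returns two sorted lists of paths; both are empty if the run has not finished or the ID is wrong.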

