# Adding new covariates in GWAS using REGENIE and SAIGE

**An example of how to make a covariate + phenotype file in R for a GWAS run using** [**REGENIE**](/working-in-the-sandbox/running-analyses-in-sandbox/how-to-run-genome-wide-association-studies-gwas/how-to-run-gwas-using-regenie.md) **and** [**SAIGE**](/working-in-the-sandbox/running-analyses-in-sandbox/how-to-run-genome-wide-association-studies-gwas/how-to-run-gwas-using-saige.md)

```
# Read the covariate file from library-red into R
library(data.table)
cov_pheno = fread("path/finngen_R8_cov_1.0.txt.gz")
# `cases` is assumed to be a data frame with a FINNGENID column listing your cases.
# If a FinnGen ID is in your list of cases, mark it as 1; mark all other rows as 0
cov_pheno$CASES = is.element(cov_pheno$FID, cases$FINNGENID)*1
```

If you are using all samples in the covariate file, this is enough: every sample that is not a case is treated as a control, and your `cov_pheno` file is ready.
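The snippet above takes a `cases` object with a `FINNGENID` column as given. As a minimal sketch of how the 1/0 flag behaves, the example below uses made-up IDs and a toy base R data frame in place of the real covariate file; `toy_cov`, `missing_ids` and all IDs are hypothetical names chosen for illustration.

```r
# Toy stand-in for the covariate file (the real one is read with fread())
toy_cov <- data.frame(FID = c("FG001", "FG002", "FG003", "FG004"),
                      stringsAsFactors = FALSE)

# Hypothetical case list: a data frame with a FINNGENID column
cases <- data.frame(FINNGENID = c("FG002", "FG004", "FG999"),
                    stringsAsFactors = FALSE)

# is.element() returns TRUE/FALSE; multiplying by 1 turns it into a numeric 1/0 flag
toy_cov$CASES <- is.element(toy_cov$FID, cases$FINNGENID) * 1

# Case IDs that do not appear in the covariate file (for example, samples
# without QC-passed genotype data) are silently ignored by the flag above
missing_ids <- setdiff(cases$FINNGENID, toy_cov$FID)

print(toy_cov)
print(missing_ids)
```

Listing the unmatched IDs with `setdiff()` is one way to see why the number of flagged cases can be smaller than the length of your case list.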

If you are not using all samples in the covariate file, you can use the following code to mark controls as `0` and set the remaining samples to `NA`.

```
# If a FinnGen ID is in your list of controls, mark it as 1; mark all other rows as 0
# (`controls` is assumed to be a data frame with a FINNGENID column)
cov_pheno$CONTROLS = is.element(cov_pheno$FID, controls$FINNGENID)*1

# Set 1 for cases, 0 for controls and NA for the rest
cov_pheno$ASTHMA = ifelse(cov_pheno$CASES == 1, 1, ifelse(cov_pheno$CONTROLS == 1, 0, NA))

# Check that things have gone as expected. For instance, you may have a slightly smaller number
# of cases/controls if some samples have phenotype data but their genotype data did not pass QC
sum(cov_pheno$CASES)
sum(cov_pheno$CONTROLS)

# Remove the CASES and CONTROLS columns. fread() returns a data.table, where
# cov_pheno[, -which(...)] would evaluate to a plain vector, so delete by reference instead
cov_pheno[, c("CASES", "CONTROLS") := NULL]
```
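To make the 1 / 0 / `NA` coding concrete, here is a small self-contained sketch on toy data. It uses a base R data frame rather than the real covariate file, and the IDs and the names `toy`, `case_ids`, `control_ids` and `pheno_counts` are hypothetical.

```r
# Toy stand-in for the covariate file
toy <- data.frame(FID = c("FG001", "FG002", "FG003", "FG004", "FG005"),
                  stringsAsFactors = FALSE)

# Hypothetical case and control ID lists
case_ids    <- c("FG001", "FG002")
control_ids <- c("FG003", "FG004")

toy$CASES    <- is.element(toy$FID, case_ids) * 1
toy$CONTROLS <- is.element(toy$FID, control_ids) * 1

# Sanity check: no sample should be both a case and a control
stopifnot(!any(toy$CASES == 1 & toy$CONTROLS == 1))

# 1 for cases, 0 for controls, NA for everyone else (here FG005)
toy$ASTHMA <- ifelse(toy$CASES == 1, 1, ifelse(toy$CONTROLS == 1, 0, NA))

# Tabulate the phenotype, including the NA group
pheno_counts <- table(toy$ASTHMA, useNA = "ifany")
print(pheno_counts)
```

Tabulating with `useNA = "ifany"` shows the excluded samples alongside cases and controls, which complements the `sum()` checks above.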

Once your file is ready, save your covariate + phenotype file to your folder in `/home/ivm`:

```
write.table(cov_pheno, file=gzfile("/home/ivm/folder_name/cov_pheno_forASTHMA.txt.gz"),
            sep="\t", quote=FALSE, row.names=FALSE, col.names=TRUE, na="NA")
```
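Before copying the file onward, it can be worth reading it back to confirm that the header and the `NA` values survived the write. The sketch below does this round trip on a toy table; the temporary path stands in for your own file under `/home/ivm`, and `toy`, `out_path` and `check` are hypothetical names.

```r
# Toy table with a missing phenotype value
toy <- data.frame(FID = c("FG001", "FG002", "FG003"),
                  ASTHMA = c(1, 0, NA),
                  stringsAsFactors = FALSE)

# Temporary gzipped file used in place of the real output path
out_path <- tempfile(fileext = ".txt.gz")

# Same write.table() settings as in the snippet above
write.table(toy, file = gzfile(out_path),
            sep = "\t", quote = FALSE, row.names = FALSE, col.names = TRUE, na = "NA")

# Read the gzipped, tab-separated file back in
check <- read.table(gzfile(out_path), header = TRUE, sep = "\t",
                    stringsAsFactors = FALSE)

print(names(check))
print(sum(is.na(check$ASTHMA)))
```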

Pipelines read files from the "red" bucket. To make the covariate + phenotype file available to a [REGENIE](/working-in-the-sandbox/running-analyses-in-sandbox/how-to-run-genome-wide-association-studies-gwas/how-to-run-gwas-using-regenie.md) or [SAIGE](/working-in-the-sandbox/running-analyses-in-sandbox/how-to-run-genome-wide-association-studies-gwas/how-to-run-gwas-using-saige.md) pipeline, copy the file to /finngen/red/ following the instructions in [Sharing with your organization](/working-in-the-sandbox/quirks-and-features/sharing-individual-level-data-within-the-sandbox.md#sharing-within-your-organization).

An example script is available in the green library. Path to the example file in the Sandbox:

```
/finngen/library-green/scripts/code_snippets/Add_CustomPheno_to_COV.R
```

