# Analysis covariates

This page was last updated for R13.

### Sandbox directory

Analysis covariates are available in the following Sandbox directory:

`/finngen/library-red/finngen_R[RELEASE]/analysis_covariates`

### Data files

The analysis covariate file is a tab-separated, gzip-compressed text file that contains covariate and endpoint data for each sample. The file contains three sets of columns:

* column 1: Sample ID
* columns 2 to N: covariates, including principal components (\~200 columns in R13)
* columns N+1 to N+number of endpoints: each individual's phenotype status for every FinnGen endpoint

The covariate file does not include individuals with non-Finnish ancestry. For more complete phenotype data, see [the phenotype files](/finngen-data-specifics/red-library-data-individual-level-data/what-phenotype-files-are-available-in-sandbox-1.md).

Users often subset this file in R to run their own analyses, to add further analysis columns, or both.
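
As a minimal sketch of the layout described above, the following Python (standard library only) reads a tab-separated, gzip-compressed file, splits its header into the three column sets, and takes a small subset. The file it reads is a tiny synthetic stand-in written by the example itself. The file name, the `SEX` and `ENDPOINT_A` columns, the `NA` missing-value marker, and the value passed as `n_covariates` are assumptions for illustration and are not taken from the real file.

```python
import csv
import gzip
import os
import tempfile


def read_covariates(path):
    """Read a tab-separated, gzip-compressed file into (header, rows)."""
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        rows = list(reader)
        return reader.fieldnames, rows


def split_columns(header, n_covariates):
    """Split the header into ID column, covariate columns, endpoint columns."""
    id_col = header[0]
    covariates = header[1:1 + n_covariates]
    endpoints = header[1 + n_covariates:]
    return id_col, covariates, endpoints


# Synthetic stand-in for the real file (all names and values are made up).
demo_path = os.path.join(tempfile.mkdtemp(), "demo_covariates.txt.gz")
with gzip.open(demo_path, "wt", newline="") as handle:
    writer = csv.writer(handle, delimiter="\t")
    writer.writerow(["FINNGENID", "SEX", "PC1", "PC2", "ENDPOINT_A"])
    writer.writerow(["FG0001", "female", "0.01", "-0.02", "1"])
    writer.writerow(["FG0002", "male", "-0.03", "0.04", "0"])
    writer.writerow(["FG0003", "female", "0.02", "0.00", "NA"])

header, rows = read_covariates(demo_path)
id_col, covariates, endpoints = split_columns(header, n_covariates=3)

# Keep the ID, two PCs and one endpoint; drop samples whose endpoint is "NA"
# (the missing-value marker is an assumption of this sketch).
keep = [id_col, "PC1", "PC2", "ENDPOINT_A"]
subset = [{k: r[k] for k in keep} for r in rows if r["ENDPOINT_A"] != "NA"]
print(len(subset), subset[0])
```

For the real file, the number of covariate columns would have to be read off the header rather than hard-coded, since it changes between releases.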

#### Some column descriptions:

| **Column name**                        | **Description**                                                                                                      |
| -------------------------------------- | -------------------------------------------------------------------------------------------------------------------- |
| FINNGENID                              | Sample ID                                                                                                            |
| AGE\_AT\_DEATH\_OR\_END\_OF\_FOLLOWUP  | Age of the sample at death or end of follow-up                                                                       |
| batch                                  | Genotyping batch of the sample                                                                                       |
| n\_var                                 | Number of genotyped variants                                                                                         |
| chip                                   | Chip used for genotyping                                                                                             |
| IS\_AFFY                               | Whether the sample was genotyped using an Affymetrix chip                                                            |
| IS\_FINNGEN1\_CHIP                     | Whether the sample was genotyped using the FinnGen v1 chip                                                           |
| IS\_FINNGEN2\_CHIP                     | Whether the sample was genotyped using the FinnGen v2 chip                                                           |
| IS\_AFFY\_\*                           | Whether the chip genotypes were called using the specified version of the calling algorithm                          |
| AGE\_AT\_DEATH\_OR\_END\_OF\_FOLLOWUP2 | AGE\_AT\_DEATH\_OR\_FOLLOWUP\*AGE\_AT\_DEATH\_OR\_FOLLOWUP                                                           |
| BATCH\*                                | Whether the sample was part of that genotyping batch. Can be used to control for batch-specific effects in analysis. |
| PC\*                                   | Individual's score on that principal component                                                                       |
| \*\_IRN                                | Inverse rank-normalized quantitative endpoints                                                                       |

For other columns, refer to the [minimum extended phenotype](/finngen-data-specifics/red-library-data-individual-level-data/what-phenotype-files-are-available-in-sandbox-1/minumum-extended-phenotype-data.md) and the [endpoint data](/finngen-data-specifics/red-library-data-individual-level-data/what-phenotype-files-are-available-in-sandbox-1/endpoint-and-endpoint-longitudinal-data.md) pages.
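
Two of the derived columns in the table can be illustrated with a short sketch: a squared age term and an inverse rank-normalized quantitative value. This page does not state which rank-normalization variant is used for the `*_IRN` columns; the version below uses the Blom offset (0.375) with average ranks for ties, which is one common choice and an assumption here. The function name and the input values are hypothetical.

```python
from statistics import NormalDist


def inverse_rank_normalize(values, offset=0.375):
    """Map values to standard-normal quantiles by rank; ties share the average rank.

    offset=0.375 is the Blom constant, an assumption of this sketch.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # 1-based average rank of the run
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    normal = NormalDist()
    return [normal.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]


# Squared age term, as in AGE_AT_DEATH_OR_END_OF_FOLLOWUP2 (made-up ages).
ages = [34.5, 61.0, 47.2]
ages_squared = [a * a for a in ages]

# Made-up quantitative measurements with one tie.
measurements = [5.1, 2.3, 9.8, 2.3, 7.0]
normalized = inverse_rank_normalize(measurements)
print(ages_squared)
print(normalized)
```

The transform keeps the ordering of the input values while forcing their distribution toward a standard normal, which is why it is often applied to quantitative traits before association testing.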

### Further information

The covariate file is used for GWAS and other analyses. The following covariates are used in FinnGen's core GWAS analyses:

* Age
* Sex
* First 10 principal components
* Genotyping batch (FinnGen 1 or 2 chip and legacy genotyping batch)
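
The covariate list above can be turned into a column selection along the following lines. This is a sketch under stated assumptions, not FinnGen's pipeline code: the principal component columns are assumed to be named `PC1`, `PC2`, and so on (the table only documents the `PC*` pattern), the sex column name `SEX` and the `BATCH_DEMO_1` column are hypothetical, and this page does not state which age column or which specific batch indicator columns the core analyses use.

```python
def core_gwas_covariates(header, age_col, sex_col, n_pcs=10):
    """Pick age, sex, the first n_pcs PC columns and batch indicator columns.

    Column naming (PC1..PCn, BATCH* prefix) is an assumption of this sketch.
    """
    available = set(header)
    pcs = [f"PC{i}" for i in range(1, n_pcs + 1)]
    # Batch indicators: FinnGen chip flags plus any column starting with "BATCH".
    chip_flags = ("IS_FINNGEN1_CHIP", "IS_FINNGEN2_CHIP")
    batch = [c for c in header if c in chip_flags or c.startswith("BATCH")]
    wanted = [age_col, sex_col] + pcs + batch
    missing = [c for c in wanted if c not in available]
    if missing:
        raise KeyError(f"columns not found in header: {missing}")
    return wanted


# Hypothetical header for illustration only.
header = (
    ["FINNGENID", "SEX", "AGE_AT_DEATH_OR_END_OF_FOLLOWUP",
     "IS_FINNGEN2_CHIP", "BATCH_DEMO_1"]
    + [f"PC{i}" for i in range(1, 21)]
    + ["ENDPOINT_A"]
)
covars = core_gwas_covariates(
    header, age_col="AGE_AT_DEATH_OR_END_OF_FOLLOWUP", sex_col="SEX"
)
print(len(covars), covars)
```

Raising an error on missing columns, rather than silently skipping them, makes it obvious when a release names its columns differently from what the sketch assumes.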

**Note**: This file is usually released a little later than the phenotype files, because it can only be created once the PCA results are available.

