# Minimum extended phenotype data

This page was last updated for R12.

{% hint style="info" %}
The minimum extended phenotype data file was introduced in Data Release 11. It contains data previously released in three separate files: [minimum phenotype data](/finngen-data-specifics/red-library-data-individual-level-data/what-phenotype-files-are-available-in-sandbox-1/minimum-and-minimum-longitudinal-data.md), [cohort data](/finngen-data-specifics/red-library-data-individual-level-data/what-phenotype-files-are-available-in-sandbox-1/cohort-data.md) and the baseline data file.
{% endhint %}

### Sandbox directory

The minimum extended phenotype data file is available as a separate file in the following Sandbox directory:

`/finngen/library-red/finngen_R[RELEASE]/phenotype_1.0/`

### Data files

This data is available in the following file:

`data/finngen_R[RELEASE]_minimum_extended_1.0.txt.gz`
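As an illustration, the directory and file name patterns above could be combined and the file read with the Python standard library. This is a minimal sketch, not an official loader: it assumes that `[RELEASE]` is replaced by the release number (for example `12`) and that the file is tab-delimited with a header row. The function names `minimum_extended_path` and `read_minimum_extended` are hypothetical.

```python
import csv
import gzip
from pathlib import PurePosixPath

# Path patterns taken from this page; [RELEASE] is a placeholder.
SANDBOX_DIR = "/finngen/library-red/finngen_R[RELEASE]/phenotype_1.0/"
DATA_FILE = "data/finngen_R[RELEASE]_minimum_extended_1.0.txt.gz"


def minimum_extended_path(release):
    """Fill the [RELEASE] placeholder in both patterns and join them."""
    rel = str(release)
    directory = SANDBOX_DIR.replace("[RELEASE]", rel)
    file_name = DATA_FILE.replace("[RELEASE]", rel)
    return PurePosixPath(directory) / file_name


def read_minimum_extended(path, delimiter="\t"):
    """Yield one dict per sample, keyed by column name.

    Assumption: the file is tab-delimited and has a header row.
    """
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle, delimiter=delimiter)
```

For example, `read_minimum_extended(minimum_extended_path(12))` would iterate over the R12 file row by row without loading it fully into memory.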

The samples in the file are in the same order as in the genotype data files. The file contains the following columns:

{% hint style="info" %}
APPROX\_BIRTH\_DATE was first released in FinnGen data release 11. BMI, CURRENT\_SMOKER and EVER\_SMOKER were first released in FinnGen data release 12.
{% endhint %}

| **Column**            | **Description**                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| FINNGENID             | Sample ID                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| BL\_YEAR              | Year of DNA sample collection                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| BL\_AGE               | Age at DNA sample collection (years)                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| SEX                   | Gender (male/female/NA)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| HEIGHT                | Height (cm)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| HEIGHT\_AGE           | Age at height measurement (years)                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| WEIGHT                | Weight (kg)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| WEIGHT\_AGE           | Age at weight measurement (years)                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| BMI                   | Body mass index                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| SMOKE2                | Smoking status 2-categories (yes/no)                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| SMOKE3                | Smoking status 3-categories (current/former/never)                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| SMOKE5                | Smoking status 5-categories (current/occasional/quitter/former/never)                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| SMOKE\_AGE            | Age at the moment of the smoking survey (years)                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| CURRENT\_SMOKER       | cases: SMOKE3 variable category="current", controls: SMOKE3 variable category="never"                                                                                                                                                                                                                                                                                                                                                                                                                  |
| EVER\_SMOKER          | cases: SMOKE3 variable category="current"/"former", controls: SMOKE3 variable category="never"                                                                                                                                                                                                                                                                                                                                                                                                         |
| regionofbirth         | <p>Regional council numbers for the region of birth according to the Finnish Ministry of the Interior (21-categories)</p><p>From <a href="https://dvv.fi/en/individuals">Digital and Population Data Services Agency (DVV)</a></p> |
| regionofbirthname     | <p>Name of the region of birth (21-categories) (1-Uusimaa 2- Varsinais-Suomi 4-Satakunta 5-Kanta Häme 6-Pirkanmaa 7-Päijät Häme 8-Kymenlaakso 9-South Karelia 10-Etelä Savo 11-Pohjois Savo 12-North Karelia 13-Central Finland 14-South Ostrobothnia 15-Ostrobothnia 16-Central Ostrobothnia 17-North Ostrobothnia 18-Kainuu 19-Lapland 20-Åland 200-Abroad 9999-Region ceded to Soviet)</p><p>From <a href="https://dvv.fi/en/individuals">Digital and Population Data Services Agency (DVV)</a></p> |
| moveabroad            | <p>If the person has moved abroad 3-categories (yes/no/NA)</p><p>From <a href="https://dvv.fi/en/individuals">Digital and Population Data Services Agency (DVV)</a></p>                                                                                                                                                                                                                                                                                                                                |
| NUMBER\_OF\_OFFSPRING | <p>Number of biological children</p><p>From <a href="https://dvv.fi/en/individuals">Digital and Population Data Services Agency (DVV)</a></p>                                                                                                                                                                                                                                                                                                                                                          |
| COHORT                | Biobank collection name (for THL, it contains the THL cohorts as well)                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| FU\_END\_AGE          | Age at the end of follow-up: age when register follow-up ends, age at death if the individual has died, or age at emigration if the person has moved abroad. |
| DEATH                 | Death status: 1 = died by the end of the death registry follow-up, 0 = alive at the end of the death registry follow-up. |
| DEATH\_AGE            | Age at death: age at death if the individual has died, age at emigration if the person has moved abroad, or age when register follow-up ends. |
| APPROX\_DEATH\_DATE   | Year of death                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| APPROX\_BIRTH\_DATE   | Randomized birth day (within +/- 1-15 days)                                                                                                                                                                                                                                                                                                                                                                                                                                                            |

\*In the DF12 minimum extended data, the columns AGE\_AT\_DEATH\_OR\_END\_OF\_FOLLOW\_UP and DEATH\_FU\_AGE replaced the column FU\_END\_AGE; they contain the same information.
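The CURRENT\_SMOKER and EVER\_SMOKER definitions in the table can be sketched as a mapping from SMOKE3. This is an illustrative sketch only: the category strings are taken from the table, while the 1/0 coding, the use of `None` for individuals who are neither cases nor controls, and the function names are assumptions rather than the actual file encoding.

```python
def smoker_status(smoke3, case_categories):
    """Return 1 for cases, 0 for controls, None otherwise.

    Assumption: cases/controls are coded 1/0; anything that is neither
    (for example a missing SMOKE3 value) is returned as None.
    """
    if smoke3 in case_categories:
        return 1
    if smoke3 == "never":
        return 0
    return None


def current_smoker(smoke3):
    # Cases: SMOKE3 == "current"; controls: SMOKE3 == "never".
    return smoker_status(smoke3, {"current"})


def ever_smoker(smoke3):
    # Cases: SMOKE3 in {"current", "former"}; controls: SMOKE3 == "never".
    return smoker_status(smoke3, {"current", "former"})
```

Under these definitions a former smoker is a case for EVER\_SMOKER but is neither a case nor a control for CURRENT\_SMOKER.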

If biobanks do not know the exact DNA sample collection date, they have been instructed to estimate it using age and the birth date extracted from the Finnish personal identity code. If the sample collection day is missing, the date is estimated to be `15.mm.yyyy`. If the month is missing, the sample collection date is estimated to be `30.06.yyyy`. In some cases, the sample collection date is not available and is impossible to estimate reliably.
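The fallback rules for an incomplete sample collection date can be written out as a small function. This is a sketch of the rules as described above, not the biobanks' actual procedure; the function name and the choice to return `None` when the year is unknown are assumptions.

```python
from datetime import date


def estimate_collection_date(year, month=None, day=None):
    """Apply the fallback rules for incomplete sample collection dates.

    - month missing -> 30.06.yyyy
    - day missing   -> 15.mm.yyyy
    - year missing  -> None (assumption: the date cannot be estimated)
    """
    if year is None:
        return None
    if month is None:
        return date(year, 6, 30)
    if day is None:
        return date(year, month, 15)
    return date(year, month, day)
```

For instance, a sample known only to have been collected in March 2015 would be assigned 15.03.2015, and one known only to be from 2015 would be assigned 30.06.2015.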

Biobanks are instructed to report dates for height, weight, and smoking status. These dates are compared against values calculated from the DNA sample collection date and the birth date extracted from the Finnish personal identity code. If the values differ, the biobank is asked for clarification.

Biobanks have been asked to provide all available information about smoking. Some biobanks report only whether an individual is a current smoker, while others provide more detailed smoking information.

Many other quality checks are performed as well. For example, when height and weight are checked, BMI must be between 10 and 80 kg/m², and dates cannot be in the future. The sex reported by biobanks is compared against the personal identity code. If the values differ, the biobank is asked for clarification.
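Two of these checks can be illustrated with a short sketch. It assumes height in centimetres and weight in kilograms, as in the HEIGHT and WEIGHT columns, and uses the standard BMI formula (weight divided by height in metres squared). Treating the 10 and 80 kg/m² limits as inclusive is an assumption, and the function names are hypothetical.

```python
from datetime import date


def bmi(height_cm, weight_kg):
    """Body mass index in kg/m^2 from height in cm and weight in kg."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


def bmi_is_plausible(height_cm, weight_kg, low=10.0, high=80.0):
    """Check that BMI falls within the accepted range.

    Assumption: both bounds are inclusive.
    """
    return low <= bmi(height_cm, weight_kg) <= high


def date_is_not_in_future(value, today=None):
    """Check that a reported date is not later than today."""
    return value <= (today or date.today())
```

A record with a height of 180 cm and a weight of 300 kg, for example, gives a BMI above 80 kg/m² and would fail the plausibility check.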

### Further information

[Extracting minimum phenotype data by biobank](/finngen-data-specifics/red-library-data-individual-level-data/what-phenotype-files-are-available-in-sandbox-1/minumum-extended-phenotype-data/extraction-of-finngen-minimum-data-set-information-per-biobank.md)

[DNA isolation protocols by biobank](/finngen-data-specifics/red-library-data-individual-level-data/what-phenotype-files-are-available-in-sandbox-1/minumum-extended-phenotype-data/dna-isolation-protocols-per-biobank.md)

