# Detailed longitudinal data

This page was last updated for R13.

{% hint style="info" %}
[Service sector data](/finngen-data-specifics/red-library-data-individual-level-data/what-phenotype-files-are-available-in-sandbox-1/service-sector-data.md) contains detailed longitudinal data with additional columns.
{% endhint %}

### Sandbox directory

Detailed longitudinal data is available in the following Sandbox directory:

`/finngen/library-red/finngen_R[RELEASE]/phenotype_1.0/`

**Note: this data is also available in Atlas and in the OMOP common data model.**

### Data files

The data is available in the following file:

`data/finngen_R[RELEASE]_detailed_longitudinal_1.0.txt.gz`

The data file has eleven columns:

<table data-header-hidden><thead><tr><th width="284.0234541577825"></th><th></th></tr></thead><tbody><tr><td><strong>Column</strong></td><td><strong>Description</strong></td></tr><tr><td>FINNGEN ID</td><td>Sample ID</td></tr><tr><td>SOURCE</td><td>Register source</td></tr><tr><td>EVENT_AGE</td><td>Individual's age at the event to two decimals</td></tr><tr><td>APPROX_EVENT_DAY</td><td>A randomized event date: +/- 1-15 days are added to the <a href="/pages/-MhYMPpEhpltxJSdGXVw">confidential</a> exact event date</td></tr><tr><td>CODE1 - CODE4</td><td>Register source specific codes and other information</td></tr><tr><td>ICDVER</td><td>ICD-code version: ICD8/9/10, ICD-O-3</td></tr><tr><td>CATEGORY</td><td>Register code sets (vocabularies)</td></tr><tr><td>INDEX</td><td>Register index number. The same INDEX value within a register means that the codes have been given in the same hospital visit, or are, for example, from the same drug purchase event.</td></tr></tbody></table>

Detailed information about the columns is available in the following file:

`finngen_R[RELEASE]_detailed_longitudinal_readme_1.0.txt`
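As a minimal sketch of how the file could be read in the Sandbox, the example below builds the path from the directory and file name given above and streams the rows with the Python standard library. The helper names `longitudinal_path` and `iter_rows` are hypothetical, and the tab delimiter and UTF-8 encoding are assumptions; check the readme file for the actual layout and exact header names.

```python
import csv
import gzip


def longitudinal_path(release, root="/finngen/library-red"):
    """Build the file path for a release number, e.g. 13 for R13."""
    return (
        f"{root}/finngen_R{release}/phenotype_1.0/"
        f"data/finngen_R{release}_detailed_longitudinal_1.0.txt.gz"
    )


def iter_rows(path, delimiter="\t"):
    """Yield each row as a dict keyed by the header names.

    Assumption: the file is tab-delimited text with a header row.
    Rows are streamed one at a time, so the whole file is never
    held in memory.
    """
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle, delimiter=delimiter)
```

For example, `next(iter_rows(longitudinal_path(13)))` would return the first data row as a dictionary. Streaming is used here because a register file covering every event for every individual may be too large to load at once.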

This is the main register data in FinnGen and contains [health register codes](/finngen-data-specifics/finnish-health-registers-and-medical-coding/international-and-finnish-health-code-sets.md) from different sources. The data is called longitudinal because it contains several events/entries of codes for the same individual recorded at different times.
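The longitudinal structure can be illustrated with a short sketch that orders one individual's events by EVENT_AGE and collects the codes that share an INDEX value within a register. The rows below are made up for illustration only, the function names are hypothetical, and the ID column is assumed to be spelled `FINNGENID` in the file header. The individual's ID is included in the grouping key as a conservative choice, since this page does not state whether INDEX values are unique across individuals.

```python
from collections import defaultdict

# Made-up rows for illustration only; real rows come from the data file.
rows = [
    {"FINNGENID": "FG_MOCK_1", "SOURCE": "PURCH", "EVENT_AGE": "52.31",
     "CODE1": "C10AA05", "INDEX": "7"},
    {"FINNGENID": "FG_MOCK_1", "SOURCE": "INPAT", "EVENT_AGE": "45.10",
     "CODE1": "I10", "INDEX": "3"},
    {"FINNGENID": "FG_MOCK_1", "SOURCE": "INPAT", "EVENT_AGE": "45.10",
     "CODE1": "E11", "INDEX": "3"},
]


def timeline(rows):
    """Return {individual: events sorted by age at the event}."""
    by_person = defaultdict(list)
    for row in rows:
        by_person[row["FINNGENID"]].append(row)
    for events in by_person.values():
        events.sort(key=lambda r: float(r["EVENT_AGE"]))
    return dict(by_person)


def same_event_codes(rows):
    """Group CODE1 values by (individual, SOURCE, INDEX).

    Rows sharing a key come from the same visit or purchase event.
    """
    grouped = defaultdict(list)
    for row in rows:
        key = (row["FINNGENID"], row["SOURCE"], row["INDEX"])
        grouped[key].append(row["CODE1"])
    return dict(grouped)
```

With the mock rows, `same_event_codes(rows)` puts the two INPAT codes under one key because they share the same INDEX, while the PURCH row forms its own event.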

{% hint style="info" %}
Detailed longitudinal data was presented in the [FinnGen data users meeting on 12th January 2021](https://www.finngen.fi/en/members/recordings/finngen-data-users-meeting-12-jan-2021) and can be explored using [Atlas](/working-in-the-sandbox/which-tools-are-available/atlas/detailed-guide/atlas-data-sources.md) in the Sandbox.
{% endhint %}

The data is used as input for the Endpointter to determine which individuals are included in each of FinnGen's phenotypic endpoints:

![](/files/QyKLGh02P8QWipsl73R0)

A mock example of the detailed longitudinal data file is shown below:

<figure><img src="/files/4KXOX1HgF2Y3xVfdc4R2" alt=""><figcaption></figcaption></figure>

### Sources

The detailed longitudinal data file contains codes from the following registries:

<table data-header-hidden><thead><tr><th width="142.33333333333331"></th><th width="302"></th><th></th></tr></thead><tbody><tr><td><strong>Source</strong></td><td><strong>Description</strong></td><td><strong>Types of codes</strong></td></tr><tr><td>PURCH</td><td>Kela drug purchase register</td><td>Medication codes</td></tr><tr><td>REIMB</td><td>Kela drug reimbursement register</td><td>Medication codes</td></tr><tr><td>INPAT</td><td>Inpatient Hilmo register</td><td>Diagnosis codes</td></tr><tr><td>OPER_IN</td><td>Inpatient Hilmo register - operations</td><td>Operation codes</td></tr><tr><td>OUTPAT</td><td>Specialist outpatient Hilmo register</td><td>Diagnosis codes</td></tr><tr><td>OPER_OUT</td><td>Specialist outpatient Hilmo register - operations</td><td>Operation codes</td></tr><tr><td>PRIM_OUT</td><td>Primary health care outpatient visits</td><td>Diagnosis and operation codes</td></tr><tr><td>CANC</td><td>Cancer register</td><td>Cancer codes</td></tr><tr><td>DEATH</td><td>Cause of death register</td><td>Cause of death codes</td></tr></tbody></table>

The image below shows how variables in the national health registries map to the columns in the detailed longitudinal data.

<figure><img src="/files/yeRrc9QZGw6TYLHb92Ru" alt=""><figcaption><p>On the left: source registers, middle: variables in the source register data, on the right: column names in the detailed longitudinal data.</p></figcaption></figure>

### Codes

The information stored in the CODE1 - CODE4 columns depends on the register source:

<table data-header-hidden><thead><tr><th width="144.33333333333331"></th><th width="306.64559585492236"></th><th></th></tr></thead><tbody><tr><td><strong>Source</strong></td><td><strong>Description</strong></td><td><strong>Codes</strong></td></tr><tr><td>PURCH</td><td>Kela drug purchase register</td><td><p>CODE1: ATC code</p><p>CODE2: Kela reimbursement code</p><p>CODE3: Product number</p><p>CODE4: Number of packages</p></td></tr><tr><td>REIMB</td><td>Kela drug reimbursement register</td><td><p>CODE1: Kela reimbursement code</p><p>CODE2: ICD code</p></td></tr><tr><td><p>INPAT</p><p>OUTPAT</p></td><td><p>Inpatient Hilmo register</p><p>Specialist outpatient Hilmo register</p></td><td><p>CODE1: symptom code</p><p>CODE2: cause code (e.g. CODE1 could be <em>dementia associated with Alzheimer's disease</em> with CODE2 as <em>Alzheimer's disease</em>)</p><p>CODE3: ATC code for drug's adverse effect</p><p>CODE4: duration of stay</p></td></tr><tr><td><p>OPER_IN</p><p>OPER_OUT</p></td><td><p>Inpatient Hilmo register - operations</p><p>Specialist outpatient Hilmo register - operations</p></td><td>CODE1: operation code</td></tr><tr><td>PRIM_OUT</td><td>Primary health care outpatient visits</td><td><p>CODE1: diagnosis or operation code</p><p>CODE2: symptom code</p><p>CODE3: ATC code for drug's adverse effect</p></td></tr><tr><td>CANC</td><td>Cancer register</td><td><p>CODE1: ICD-O-3 topography</p><p>CODE2: ICD-O-3 morphology</p><p>CODE3: ICD-O-3 behaviour</p></td></tr><tr><td>DEATH</td><td>Cause of death register</td><td>CODE1: cause of death</td></tr></tbody></table>
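Because the meaning of CODE1 - CODE4 changes with SOURCE, it can help to encode the table above as a lookup and label each row's codes before analysis. The sketch below is one possible way to do that; the names `CODE_MEANING` and `describe_codes` are hypothetical, and treating empty strings and `"NA"` as missing values is an assumption about how the file marks unused columns.

```python
# Meaning of each CODE column per register source, transcribed from the table.
_HILMO = {
    "CODE1": "symptom code",
    "CODE2": "cause code",
    "CODE3": "ATC code for drug's adverse effect",
    "CODE4": "duration of stay",
}
_OPER = {"CODE1": "operation code"}

CODE_MEANING = {
    "PURCH": {
        "CODE1": "ATC code",
        "CODE2": "Kela reimbursement code",
        "CODE3": "Product number",
        "CODE4": "Number of packages",
    },
    "REIMB": {"CODE1": "Kela reimbursement code", "CODE2": "ICD code"},
    "INPAT": _HILMO,
    "OUTPAT": _HILMO,
    "OPER_IN": _OPER,
    "OPER_OUT": _OPER,
    "PRIM_OUT": {
        "CODE1": "diagnosis or operation code",
        "CODE2": "symptom code",
        "CODE3": "ATC code for drug's adverse effect",
    },
    "CANC": {
        "CODE1": "ICD-O-3 topography",
        "CODE2": "ICD-O-3 morphology",
        "CODE3": "ICD-O-3 behaviour",
    },
    "DEATH": {"CODE1": "cause of death"},
}

# Assumption: unused code columns are empty or contain the string "NA".
MISSING = {None, "", "NA"}


def describe_codes(row):
    """Return {meaning: value} for the code columns used by the row's SOURCE."""
    meanings = CODE_MEANING.get(row["SOURCE"], {})
    return {
        meanings[col]: row[col]
        for col in ("CODE1", "CODE2", "CODE3", "CODE4")
        if col in meanings and row.get(col) not in MISSING
    }
```

For a PURCH row, `describe_codes` returns the ATC code, reimbursement code, product number and number of packages under readable keys, and skips any column that is missing.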

### Categories

The register code sets (vocabularies) are stored in the CATEGORY column:

<figure><img src="/files/CtStQCr14pib05AN8gPT" alt=""><figcaption></figcaption></figure>

Detailed information about register code sets is available from:

* [International and Finnish Health Code Sets](/finngen-data-specifics/finnish-health-registers-and-medical-coding/international-and-finnish-health-code-sets.md)
* [More information on health code sets](/finngen-data-specifics/finnish-health-registers-and-medical-coding/more-information-on-health-code-sets.md)

For diagnosis codes:

* 0 at the end of the CATEGORY variable means the main diagnosis code (e.g. 0 in ICD0 and NOM0)
* 1:N at the end of the CATEGORY variable refers to side diagnoses (e.g. 1:N in ICD1:N, NOM1:N)
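The suffix rule above can be sketched as a small helper that separates main from side diagnoses. The function name `diagnosis_rank` is hypothetical, and the sketch assumes that side diagnoses appear in the file as concrete values such as `ICD1` or `ICD2`, with N written as a positive integer. It is only meaningful for rows that hold diagnosis codes; CATEGORY values without a trailing number yield `None`.

```python
import re

# Splits a CATEGORY value into a prefix and a trailing number, e.g. "ICD" + "0".
_SUFFIX = re.compile(r"^(?P<prefix>.*?)(?P<n>\d+)$")


def diagnosis_rank(category):
    """Classify a diagnosis CATEGORY value as 'main' or 'side'.

    Returns None when the value has no trailing number.
    """
    match = _SUFFIX.match(category)
    if match is None:
        return None
    return "main" if int(match.group("n")) == 0 else "side"
```

Under these assumptions, `diagnosis_rank("ICD0")` gives `"main"` and `diagnosis_rank("NOM2")` gives `"side"`.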

### Register data availability dates

Data is available from each register's start date until its register-specific follow-up end date. The follow-up dates are available [here](/finngen-data-specifics/endpoints/complete-follow-up-time-of-the-finngen-registries-primary-endpoint-data.md) and in the following file:

`finngen_R[RELEASE]_detailed_longitudinal_readme_1.0.txt`

The start and follow-up dates differ between registries and between FinnGen data releases, so take this into account in your analyses. Register start and follow-up dates are shown below for Data Freeze 6:

<figure><img src="/files/ofGclY3vzD02lBU5hAFQ" alt=""><figcaption></figcaption></figure>

### Further information

* [Splitting combination codes in detailed longitudinal data](/finngen-data-specifics/red-library-data-individual-level-data/what-phenotype-files-are-available-in-sandbox-1/detailed-longitudinal-data/what-are-combination-codes-and-how-they-are-separated-in-detailed-longitudinal-data.md)
* [Registers in detailed longitudinal data](/finngen-data-specifics/red-library-data-individual-level-data/what-phenotype-files-are-available-in-sandbox-1/registers-in-the-detailed-longitudinal-data.md)

