# Interpretation of Endpoint Definition file

This document explains how to attribute an endpoint to events in the detailed longitudinal data using the rules from the endpoint definition file (latest version at [FinnGen: Clinical Endpoints](https://www.finngen.fi/en/researchers/clinical-endpoints)). Have a look at the [list of gotchas](#gotchas) at the end of this document for some specificities that are easy to miss at first.

Each endpoint is defined by a set of rules, given as one line in the endpoint definition file. The detailed longitudinal file contains health events (*rows in that file*) that are looked up against these rules. Each rule adds events to, or removes events from, the list of candidate events. Once all rules have been applied, the remaining candidate events are attributed to the endpoint.

#### When explaining the rules, the following terms are used:

* **Endpoint**: occurrence of a health event defined by rules that match on the health register data.
* **Candidate events**: list of events that could be attributed to the endpoint. This list grows and shrinks as the endpoint rules are applied.
* **Consider**: add event to the list of candidate events.
* **Discard**: remove event from the list of candidate events.
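
As a minimal sketch of this vocabulary, assume each event is a plain Python dict whose keys follow the column names used in this document. The helper names `event_key`, `consider` and `discard` are hypothetical, and the sample values are made up for illustration.

```python
# Illustrative only: candidate events are tracked in a dict keyed by event identity.

def event_key(event):
    """Identify an event by (FINNGENID, SOURCE, INDEX)."""
    return (event["FINNGENID"], event["SOURCE"], event["INDEX"])


def consider(candidates, event):
    """Add an event to the candidate events."""
    candidates[event_key(event)] = event


def discard(candidates, event):
    """Remove an event from the candidate events, if it is there."""
    candidates.pop(event_key(event), None)


# Made-up events for one individual.
events = [
    {"FINNGENID": "FG1", "SOURCE": "INPAT", "INDEX": "1", "CODE1": "I210"},
    {"FINNGENID": "FG1", "SOURCE": "INPAT", "INDEX": "2", "CODE1": "J450"},
]

candidates = {}
for ev in events:
    consider(candidates, ev)      # e.g. an inclusion rule matched
discard(candidates, events[1])    # e.g. an exclusion rule matched

remaining = sorted(candidates)
print(remaining)
```

Whatever is left in `candidates` once every rule has been applied is what gets attributed to the endpoint.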

## Overview of the Endpoint Definition File

The endpoint definition file version 1.3 has the following metadata columns:

<table><thead><tr><th width="276">Column name</th><th>Explanation</th></tr></thead><tbody><tr><td><code>NAME</code></td><td>naming: Reference name in the FinnGen endpoint data</td></tr><tr><td><code>LONGNAME</code></td><td>naming: Descriptive name</td></tr><tr><td><code>Latin</code></td><td>naming: Latin name</td></tr><tr><td><code>TAGS</code></td><td>categorisation: List of categories the endpoint belongs to</td></tr><tr><td><code>LEVEL</code></td><td>categorisation: Level in the ICD-10 hierarchy</td></tr><tr><td><code>OMIT</code></td><td>categorisation: Is a core GWAS? (NA: yes, 1 or 2: no)</td></tr><tr><td><code>PARENT</code></td><td>categorisation: Parent in the ICD-10 hierarchy</td></tr><tr><td><code>version</code></td><td>changelog: introduced in data freeze</td></tr><tr><td><code>Modification_date</code></td><td>changelog: date of last modification</td></tr><tr><td><code>Modified_by</code></td><td>changelog: author of last modification</td></tr><tr><td><code>Modification_reason</code></td><td>changelog: purpose of modification</td></tr><tr><td><code>Special</code></td><td>free text notes</td></tr></tbody></table>

The rules are defined by the following columns in the endpoint definition file:\
*(Click on a value in "**Column name**" or "**Extra rules**", where available, to be directed to further details that follow the table)*

| Column name                                   | Purpose                            | [Coding system](#appendix-coding-systems-and-translations) | [Lookup `SOURCE` registry](#appendix-list-of-registries) | Extra rules                                                                                                                         |
| --------------------------------------------- | ---------------------------------- | ---------------------------------------------------------- | -------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------- |
| `SEX`                                         | Filter at the FINNGENID level      | –                                                          | –                                                        | –                                                                                                                                   |
| `INCLUDE`                                     | Use other endpoints to find events | –                                                          | –                                                        | –                                                                                                                                   |
| `PRE_CONDITIONS`                              | Filter at the event level          | –                                                          | –                                                        | –                                                                                                                                   |
| `CONDITIONS`                                  | Filter at the FINNGENID level      | –                                                          | –                                                        | –                                                                                                                                   |
| [OUTPAT\_ICD](#outpat_icd)                    | Inclusion lookup                   | ICD-10                                                     | `PRIM_OUT`                                               | [any-code](#any-code), [match-prefix](#match-prefix)                                                                                |
| [OUTPAT\_OPER](#outpat_oper)                  | Inclusion lookup                   | NOMESCO                                                    | `PRIM_OUT`                                               | [any-code](#any-code)                                                                                                               |
| [HD\_MAINONLY](#hd_mainonly)                  | Diagnosis selection hint           | –                                                          | `INPAT`, `OUTPAT`                                        | –                                                                                                                                   |
| [HD\_ICD\_10\_ATC](#hd_icd_-10-_atc)          | Inclusion lookup                   | ATC                                                        | `INPAT`, `OUTPAT`                                        | [any-code](#any-code), [match-prefix](#match-prefix)                                                                                |
| [HD\_ICD\_10](#hd_icd_10)                     | Inclusion lookup                   | ICD-10                                                     | `INPAT`, `OUTPAT`                                        | [any-code](#any-code), [match-prefix](#match-prefix), [cause-symptom](#cause-symptom), [mode](#mode), [mark-no-code](#mark-no-code) |
| [HD\_ICD\_9](#hd_icd_9)                       | Inclusion lookup                   | ICD-9                                                      | `INPAT`, `OUTPAT`                                        | [any-code](#any-code), [match-prefix](#match-prefix), [mode](#mode), [mark-no-code](#mark-no-code)                                  |
| [HD\_ICD\_8](#hd_icd_8)                       | Inclusion lookup                   | ICD-8                                                      | `INPAT`, `OUTPAT`                                        | [any-code](#any-code), [match-prefix](#match-prefix), [mode](#mode), [mark-no-code](#mark-no-code)                                  |
| [HD\_ICD\_10\_EXCL](#hd_icd_-10-_excl)        | Exclusion lookup                   | ICD-10                                                     | `INPAT`, `OUTPAT`                                        | [any-code](#any-code), [match-prefix](#match-prefix), [cause-symptom](#cause-symptom)                                               |
| [HD\_ICD\_9\_EXCL](#hd_icd_-9-_excl)          | Exclusion lookup                   | ICD-9                                                      | `INPAT`, `OUTPAT`                                        | [any-code](#any-code), [match-prefix](#match-prefix)                                                                                |
| [HD\_ICD\_8\_EXCL](#hd_icd_-8-_excl)          | Exclusion lookup                   | ICD-8                                                      | `INPAT`, `OUTPAT`                                        | [any-code](#any-code), [match-prefix](#match-prefix)                                                                                |
| [COD\_MAINONLY](#cod_mainonly)                | Diagnosis selection hint           | –                                                          | `DEATH`                                                  | –                                                                                                                                   |
| [COD\_ICD\_10](#cod_icd_10)                   | Inclusion lookup                   | ICD-10                                                     | `DEATH`                                                  | [any-code](#any-code), [match-prefix](#match-prefix), [mark-no-code](#mark-no-code)                                                 |
| [COD\_ICD\_9](#cod_icd_9)                     | Inclusion lookup                   | ICD-9                                                      | `DEATH`                                                  | [any-code](#any-code), [match-prefix](#match-prefix), [mark-no-code](#mark-no-code)                                                 |
| [COD\_ICD\_8](#cod_icd_8)                     | Inclusion lookup                   | ICD-8                                                      | `DEATH`                                                  | [any-code](#any-code), [match-prefix](#match-prefix), [mark-no-code](#mark-no-code)                                                 |
| [COD\_ICD\_10\_EXCL](#cod_icd_-10-_excl)      | Exclusion lookup                   | ICD-10                                                     | `DEATH`                                                  | [any-code](#any-code), [match-prefix](#match-prefix), [mark-no-code](#mark-no-code)                                                 |
| [COD\_ICD\_9\_EXCL](#cod_icd_-9-_excl)        | Exclusion lookup                   | ICD-9                                                      | `DEATH`                                                  | [any-code](#any-code), [match-prefix](#match-prefix)                                                                                |
| [COD\_ICD\_8\_EXCL](#cod_icd_-8-_excl)        | Exclusion lookup                   | ICD-8                                                      | `DEATH`                                                  | [any-code](#any-code), [match-prefix](#match-prefix)                                                                                |
| [OPER\_NOM](#oper_nom)                        | Inclusion lookup                   | NOMESCO                                                    | `OPER_IN`, `OPER_OUT`                                    | [any-code](#any-code)                                                                                                               |
| [OPER\_HL](#oper_hl)                          | Inclusion lookup                   | Finnish hospital league                                    | `OPER_IN`, `OPER_OUT`                                    | [any-code](#any-code)                                                                                                               |
| [OPER\_HP1](#oper_hp1)                        | Inclusion lookup                   | Demanding heart patient, old codes                         | `OPER_IN`, `OPER_OUT`                                    | [any-code](#any-code)                                                                                                               |
| [OPER\_HP2](#oper_hp2)                        | Inclusion lookup                   | Demanding heart patient, new codes                         | `OPER_IN`, `OPER_OUT`                                    | [any-code](#any-code)                                                                                                               |
| [KELA\_REIMB](#kela_reimb)                    | Inclusion lookup                   | KELA reimbursement code                                    | `REIMB`                                                  | [any-code](#any-code)                                                                                                               |
| [KELA\_REIMB\_ICD](#kela_reimb_icd)           | Inclusion lookup                   | ICD-10, ICD-9                                              | `REIMB`                                                  | [any-code](#any-code), [match-prefix](#match-prefix)                                                                                |
| [KELA\_ATC\_NEEDOTHER](#kela_atc_needother)   | Additional requirement hint        | –                                                          | `PURCH`                                                  | –                                                                                                                                   |
| [KELA\_ATC](#kela_atc)                        | Inclusion lookup                   | ATC                                                        | `PURCH`                                                  | [any-code](#any-code), [match-prefix](#match-prefix)                                                                                |
| [KELA\_VNRO\_NEEDOTHER](#kela_vnro_needother) | Additional requirement hint        | –                                                          | `PURCH`                                                  | –                                                                                                                                   |
| [KELA\_VNRO](#kela_vnro)                      | Inclusion lookup                   | VNRO                                                       | `PURCH`                                                  | –                                                                                                                                   |
| [CANC\_TOPO](#canc_topo)                      | Inclusion lookup                   | ICD-O-3 topography                                         | `CANC`                                                   | [any-code](#any-code), [match-prefix](#match-prefix), [canc-all](#canc-all), [mark-no-code](#mark-no-code)                          |
| [CANC\_TOPO\_EXCL](#canc_topo_excl)           | Exclusion lookup                   | ICD-O-3 topography                                         | `CANC`                                                   | [any-code](#any-code), [match-prefix](#match-prefix), [canc-all](#canc-all)                                                         |
| [CANC\_MORPH](#canc_morph)                    | Inclusion lookup                   | ICD-O-3 morphology                                         | `CANC`                                                   | [any-code](#any-code), [match-prefix](#match-prefix), [canc-all](#canc-all)                                                         |
| [CANC\_MORPH\_EXCL](#canc_morph_excl)         | Exclusion lookup                   | ICD-O-3 morphology                                         | `CANC`                                                   | [any-code](#any-code), [match-prefix](#match-prefix), [canc-all](#canc-all)                                                         |
| [CANC\_BEHAV](#canc_behav)                    | Inclusion lookup                   | ICD-O-3 behavior                                           | `CANC`                                                   | [any-code](#any-code), [match-prefix](#match-prefix), [canc-all](#canc-all)                                                         |

## Event Rules

### **OUTPAT\_ICD**

Consider events where:

* `SOURCE`: is `PRIM_OUT`
* and `CATEGORY`: contains `ICD`
* and `CODE1`: matches the `OUTPAT_ICD` regex

### **OUTPAT\_OPER**

Consider events where:

* `SOURCE`: is `PRIM_OUT`
* and `CATEGORY`: starts with `OP`
* and the `OUTPAT_OPER` regex matches `CODE1`

### **HD\_MAINONLY**

Values

* `YES`: only look at events with `CATEGORY`: `0` for the rules of `HD_ICD_10`, `HD_ICD_9`, `HD_ICD_8`, `HD_ICD_10_EXCL`, `HD_ICD_9_EXCL` and `HD_ICD_8_EXCL`
* `NA`: (nothing to filter)

This rule restricts the hospital discharge lookups to the main diagnosis (`CATEGORY` is `0`), as opposed to side diagnoses (`CATEGORY` is not `0`).
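
A minimal sketch of this filter is given here, assuming rows are Python dicts and that `CATEGORY` is stored as a string. The function name `hd_rows_to_check` is hypothetical.

```python
def hd_rows_to_check(rows, hd_mainonly):
    """Return the hospital discharge rows that the HD_ICD_* rules should look at."""
    hd_rows = [r for r in rows if r["SOURCE"] in ("INPAT", "OUTPAT")]
    if hd_mainonly == "YES":
        # Main diagnosis only.
        return [r for r in hd_rows if r["CATEGORY"] == "0"]
    # NA: nothing to filter, side diagnoses are kept as well.
    return hd_rows


# Made-up rows for illustration.
rows = [
    {"SOURCE": "INPAT", "CATEGORY": "0", "CODE1": "I210"},
    {"SOURCE": "INPAT", "CATEGORY": "1", "CODE1": "E119"},
    {"SOURCE": "DEATH", "CATEGORY": "U", "CODE1": "I219"},
]

main_only = hd_rows_to_check(rows, "YES")
everything = hd_rows_to_check(rows, "NA")
```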

### **HD\_ICD\_10\_ATC**

Consider events where:

* `SOURCE`: is `INPAT` or `OUTPAT`
* and the `HD_ICD_10_ATC` regex matches `CODE3`

This rule must be applied by looking for events that match both this rule and the `HD_ICD_10` rule at the same time.

For example, an endpoint definition with `HD_ICD_10` = `E610` and `HD_ICD_10_ATC` = `ANY` will match an event that has:

* `SOURCE`: `INPAT` or `OUTPAT`
* and `ICDVER`: 10
* and `HD_ICD_10` regex matches `CODE1` or `CODE2`
* and any code in `CODE3` (but there must be a code there, it cannot be empty)
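
The combined check above can be sketched as follows. This is an illustration under assumptions, not a reference implementation: the event is a Python dict, `ICDVER` is stored as a string, the function name is hypothetical, and the ATC code in the sample event is made up.

```python
import re


def matches_hd_icd_10_with_atc(event, hd_icd_10, hd_icd_10_atc):
    """Check HD_ICD_10 and HD_ICD_10_ATC against the same event."""
    if event["SOURCE"] not in ("INPAT", "OUTPAT") or event["ICDVER"] != "10":
        return False

    # match-prefix: the ICD-10 rule must match from the start of the code.
    icd_regex = re.compile("^(?:" + hd_icd_10 + ")")
    icd_ok = any(
        code and icd_regex.search(code)
        for code in (event.get("CODE1"), event.get("CODE2"))
    )

    atc_code = event.get("CODE3")
    if hd_icd_10_atc == "ANY":
        # any-code: a code must be present, its value does not matter.
        atc_ok = bool(atc_code)
    else:
        atc_ok = bool(atc_code) and bool(
            re.search("^(?:" + hd_icd_10_atc + ")", atc_code)
        )

    return icd_ok and atc_ok


event = {"SOURCE": "INPAT", "ICDVER": "10",
         "CODE1": "E610", "CODE2": None, "CODE3": "N02BE01"}

with_atc = matches_hd_icd_10_with_atc(event, "E610", "ANY")
without_atc = matches_hd_icd_10_with_atc({**event, "CODE3": None}, "E610", "ANY")
```

Here `with_atc` is `True` and `without_atc` is `False`, because an empty `CODE3` does not satisfy `ANY`.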

### **HD\_ICD\_10**

Consider events where:

* `SOURCE`: is `INPAT` or `OUTPAT`
* and the `HD_ICD_10` regex matches `CODE1` or `CODE2`
* and `ICDVER`: is 10

### **HD\_ICD\_9**

Consider events where:

* `SOURCE`: is `INPAT` or `OUTPAT`
* and the `HD_ICD_9` regex matches `CODE1` or `CODE2`
* and `ICDVER`: is 9

### **HD\_ICD\_8**

Consider events where:

* `SOURCE`: is `INPAT` or `OUTPAT`
* and the `HD_ICD_8` regex matches `CODE1` or `CODE2`
* and `ICDVER`: is 8

### **HD\_ICD\_10\_EXCL**

Discard events where:

* `SOURCE`: is `INPAT` or `OUTPAT`
* and the `HD_ICD_10_EXCL` regex matches `CODE1` or `CODE2`
* and `ICDVER`: is 10

### **HD\_ICD\_9\_EXCL**

Discard events where:

* `SOURCE`: is `INPAT` or `OUTPAT`
* and the `HD_ICD_9_EXCL` regex matches `CODE1` or `CODE2`
* and `ICDVER`: is 9

### **HD\_ICD\_8\_EXCL**

Discard events where:

* `SOURCE`: is `INPAT` or `OUTPAT`
* and the `HD_ICD_8_EXCL` regex matches `CODE1` or `CODE2`
* and `ICDVER`: is 8

### **COD\_MAINONLY**

Values

* `YES`: only look at events with `CATEGORY`: `U` or `I` for the rules of `COD_ICD_10`, `COD_ICD_9`, `COD_ICD_8`, `COD_ICD_10_EXCL`, `COD_ICD_9_EXCL`, and `COD_ICD_8_EXCL`
* `NA`: (nothing to filter)

This rule restricts the cause of death lookups to the main causes (`CATEGORY`: `U` for the underlying and `I` for the immediate cause of death), as opposed to contributing causes of death (`CATEGORY`: starts with `c`).

### **COD\_ICD\_10**

Consider events where:

* `SOURCE`: is `DEATH`
* and the `COD_ICD_10` regex matches `CODE1` or `CODE2`
* and `ICDVER`: is 10

### **COD\_ICD\_9**

Consider events where:

* `SOURCE`: is `DEATH`
* and the `COD_ICD_9` regex matches `CODE1` or `CODE2`
* and `ICDVER`: is 9

### **COD\_ICD\_8**

Consider events where:

* `SOURCE`: is `DEATH`
* and the `COD_ICD_8` regex matches `CODE1` or `CODE2`
* and `ICDVER`: is 8

### **COD\_ICD\_10\_EXCL**

Discard events where:

* `SOURCE`: is `DEATH`
* and the `COD_ICD_10_EXCL` regex matches `CODE1` or `CODE2`
* and `ICDVER`: is 10

### **COD\_ICD\_9\_EXCL**

Discard events where:

* `SOURCE`: is `DEATH`
* and the `COD_ICD_9_EXCL` regex matches `CODE1` or `CODE2`
* and `ICDVER`: is 9

### **COD\_ICD\_8\_EXCL**

Discard events where:

* `SOURCE`: is `DEATH`
* and the `COD_ICD_8_EXCL` regex matches `CODE1` or `CODE2`
* and `ICDVER`: is 8

### **OPER\_NOM**

Consider events where:

* `SOURCE`: is `OPER_IN` or `OPER_OUT`
* and the `OPER_NOM` regex matches `CODE1`
* and `CATEGORY`: contains `NOM`

### **OPER\_HL**

Consider events where:

* `SOURCE`: is `OPER_IN` or `OPER_OUT`
* and the `OPER_HL` regex matches `CODE1`
* and `CATEGORY`: contains `FHL`

### **OPER\_HP1**

Consider events where:

* `SOURCE`: is `OPER_IN` or `OPER_OUT`
* and the `OPER_HP1` regex matches `CODE1`
* and `CATEGORY`: contains `HPO`

### **OPER\_HP2**

Consider events where:

* `SOURCE`: is `OPER_IN` or `OPER_OUT`
* and the `OPER_HP2` regex matches `CODE1`
* and `CATEGORY`: contains `HPN`
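
The four `OPER_*` rules share the same shape and differ only in the marker that `CATEGORY` must contain, so they can be sketched with one table-driven function. The names `OPER_CATEGORY_MARKER` and `matches_oper_rule` are hypothetical, events are assumed to be Python dicts, and the sample code and category values are made up.

```python
import re

# Marker that CATEGORY must contain for each operation rule, as listed above.
OPER_CATEGORY_MARKER = {
    "OPER_NOM": "NOM",
    "OPER_HL": "FHL",
    "OPER_HP1": "HPO",
    "OPER_HP2": "HPN",
}


def matches_oper_rule(event, rule_name, rule_value):
    """Return True if the event satisfies the given OPER_* rule."""
    if event["SOURCE"] not in ("OPER_IN", "OPER_OUT"):
        return False
    if OPER_CATEGORY_MARKER[rule_name] not in event["CATEGORY"]:
        return False
    code = event.get("CODE1")
    if not code:
        return False
    if rule_value == "ANY":
        # any-code: the presence of a code is enough.
        return True
    return re.search(rule_value, code) is not None


event = {"SOURCE": "OPER_IN", "CATEGORY": "NOM1", "CODE1": "FNG05"}
nom_hit = matches_oper_rule(event, "OPER_NOM", "FNG")
hl_hit = matches_oper_rule(event, "OPER_HL", "FNG")
```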

### **KELA\_REIMB**

Consider events where:

* `SOURCE`: is `REIMB`
* and the `KELA_REIMB` regex matches `CODE1`

### **KELA\_REIMB\_ICD**

Consider events where:

* `SOURCE`: is `REIMB`
* and the `KELA_REIMB_ICD` regex matches `CODE2`

This rule must be applied by looking for events that match both this rule and the `KELA_REIMB` rule at the same time.
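
A sketch of checking both reimbursement rules on the same event is shown here. It assumes events are Python dicts, leaves out the `ANY` value for brevity, and uses a hypothetical function name and made-up sample codes.

```python
import re


def matches_kela_reimb(event, kela_reimb, kela_reimb_icd=None):
    """Check KELA_REIMB, and KELA_REIMB_ICD when it is defined, on one event."""
    if event["SOURCE"] != "REIMB":
        return False
    code1 = event.get("CODE1") or ""
    if re.search(kela_reimb, code1) is None:
        return False
    if kela_reimb_icd is None:
        # Only KELA_REIMB is defined for this endpoint.
        return True
    # match-prefix: the ICD rule must match from the start of CODE2.
    code2 = event.get("CODE2") or ""
    return re.search("^(?:" + kela_reimb_icd + ")", code2) is not None


event = {"SOURCE": "REIMB", "CODE1": "103", "CODE2": "E10"}
both_match = matches_kela_reimb(event, "103", "E10")
icd_differs = matches_kela_reimb(event, "103", "E11")
```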

### **KELA\_ATC\_NEEDOTHER**

Values

* `NA`: 3 or more events matching the `KELA_ATC` rule are needed to attribute the endpoint
* `SINGLE_OK`: 1 or more events matching the `KELA_ATC` rule are needed to attribute the endpoint
* `YES`: the `KELA_ATC` rule is not sufficient by itself; another rule must also match to attribute the endpoint

This rule sets additional requirements on the `KELA_ATC` rule.
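
One possible reading of these three values is sketched below. The function name and its parameters are hypothetical, `NA` is assumed to be read as the string `"NA"`, and the handling of `YES` (at least one purchase plus a match from another rule) is an interpretation rather than something this document states.

```python
def kela_atc_is_sufficient(needother, n_kela_atc_events, other_rule_matched):
    """Decide whether KELA_ATC events allow attributing the endpoint."""
    if needother == "SINGLE_OK":
        return n_kela_atc_events >= 1
    if needother == "YES":
        # Purchases alone are not enough: another rule must match too.
        return n_kela_atc_events >= 1 and other_rule_matched
    # NA: at least 3 purchase events are needed.
    return n_kela_atc_events >= 3


two_purchases = kela_atc_is_sufficient("NA", 2, False)
three_purchases = kela_atc_is_sufficient("NA", 3, False)
```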

### **KELA\_ATC**

Consider events where:

* `SOURCE`: is `PURCH`
* and the `KELA_ATC` regex matches `CODE1`

### **KELA\_VNRO**

This rule is not used.

### **KELA\_VNRO\_NEEDOTHER**

This rule is not used.

### **CANC\_TOPO**

Consider events where:

* `SOURCE`: is `CANC`
* and the `CANC_TOPO` regex matches `CODE1`

### **CANC\_TOPO\_EXCL**

Discard events where:

* `SOURCE`: is `CANC`
* and the `CANC_TOPO_EXCL` regex matches `CODE1`

### **CANC\_MORPH**

Consider events where:

* `SOURCE`: is `CANC`
* and the `CANC_MORPH` regex matches `CODE2`

### **CANC\_MORPH\_EXCL**

Discard events where:

* `SOURCE`: is `CANC`
* and the `CANC_MORPH_EXCL` regex matches `CODE2`

### **CANC\_BEHAV**

Consider events where:

* `SOURCE`: is `CANC`
* and the `CANC_BEHAV` regex matches `CODE3`

### **INCLUDE**

Value

* other endpoint names, separated by `|`

Attribute the current endpoint to an individual if they have at least one of the endpoints listed in `INCLUDE`.
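
As a small sketch, assuming the endpoints already attributed to an individual are available as a set of names (the function name and the endpoint names below are hypothetical):

```python
def attributed_via_include(include_value, endpoints_of_individual):
    """Return True if the individual has at least one endpoint listed in INCLUDE."""
    if include_value in (None, "", "NA"):
        return False
    included = include_value.split("|")
    return any(name in endpoints_of_individual for name in included)


has_one = attributed_via_include("ENDPOINT_A|ENDPOINT_B", {"ENDPOINT_B", "ENDPOINT_X"})
has_none = attributed_via_include("ENDPOINT_A|ENDPOINT_B", {"ENDPOINT_X"})
```

Since an included endpoint can itself have an `INCLUDE` value, the included endpoints would need to be resolved before the endpoint that refers to them.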

### **PRE\_CONDITIONS**

Value

* condition on `EVENT_AGE` or `EVENT_YEAR`
* `EMERG`: (unused, nothing to do)
* `NA`: (nothing to do)

Discard events **not** matching `PRE_CONDITIONS` from the list of candidate events.

This rule usually applies a filter on age or year at the event. It filters out some events from the existing list of candidate events.
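
The exact syntax of the condition is not described here, so the sketch below does not parse it. It assumes the condition has already been turned into a Python predicate, which is an assumption made for illustration, as are the function name and the age threshold.

```python
def apply_pre_conditions(candidates, condition):
    """Keep only the candidate events for which the condition holds."""
    return [event for event in candidates if condition(event)]


# Made-up candidate events.
candidates = [
    {"INDEX": "1", "EVENT_AGE": 12.5},
    {"INDEX": "2", "EVENT_AGE": 43.0},
]

# Hypothetical condition: event age below 18.
kept = apply_pre_conditions(candidates, lambda e: e["EVENT_AGE"] < 18)
```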

### **CONDITIONS**

An individual must fit the `CONDITIONS` rule to be attributed the endpoint.

### **SEX**

Values

* `1`: only keep males
* `2`: only keep females
* `NA`: (nothing to filter, the endpoint is not sex-specific)

This filter should be applied as the last filter.

## Extra rules

### **any-code**

When the rule is written as `ANY`, then the event must have a code for the given rule, but the actual code has no importance.

This rule is useful when matching an event against multiple rules, for example:

* `HD_ICD_10`: `K250`
* and `HD_ICD_10_ATC`: `ANY`

This example requires that an event has any ATC code and at the same time has the ICD-10 code `K250`. The endpoint will match drug-induced events since it requires there is an ATC code, but the actual ATC code doesn't matter.

### **match-prefix**

The rule must match starting from the beginning of the code: in regex terms, the rule value is prepended with a `^`, and this modified value is then used as the regex.

For example, a match-prefix rule with a value of `I21` matches `I2100` but doesn't match `AEI21`.
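
In Python this could look like the sketch below. Wrapping the value in a non-capturing group is an assumption made here so that the `^` applies to every alternative of a value such as `I21|I22`; the function name `prefix_regex` is hypothetical.

```python
import re


def prefix_regex(rule_value):
    """Compile a rule value so that it only matches from the start of a code."""
    # Without the group, "^I21|I22" would anchor only the first alternative.
    return re.compile("^(?:" + rule_value + ")")


rule = prefix_regex("I21")
print(rule.search("I2100") is not None)  # True
print(rule.search("AEI21") is not None)  # False
```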

### **cause-symptom**

An ampersand `&` between two codes indicates a cause-symptom pair (specific to Finnish ICD-10). In that case, both the cause code and the symptom code must be found in the same event.

For example, `HD_ICD_10` = `M07&L405` will match an event that has both `M07` (in `CODE1` or `CODE2`) and `L405` (in `CODE1` or `CODE2`).
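
A sketch of this check for a single `&` pair is given here, assuming the event is a Python dict and that each half of the pair is matched as a prefix. The function name is hypothetical, and values that combine `&` with `|` are not handled.

```python
import re


def matches_cause_symptom(event, rule_value):
    """Return True if every code of an `&` pair is found in the same event."""
    event_codes = [c for c in (event.get("CODE1"), event.get("CODE2")) if c]
    parts = rule_value.split("&")
    return all(
        any(re.search("^(?:" + part + ")", code) for code in event_codes)
        for part in parts
    )


pair_event = {"CODE1": "M073", "CODE2": "L405"}
cause_only = {"CODE1": "M073", "CODE2": None}

pair_hit = matches_cause_symptom(pair_event, "M07&L405")
cause_only_hit = matches_cause_symptom(cause_only, "M07&L405")
```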

### **mode**

A rule value starting with a percent sign `%` indicates a mode rule. The event will be considered only if the code is the most common amongst its sibling ICD codes for an individual.

For example `%J450` would match events of an individual only if `J450` is the most common code among the codes starting with `J45`.
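
The sketch below illustrates the idea under two assumptions that are not stated in this document: sibling codes are those sharing every character of the target code except the last one, and a tie for the most common code counts as a match. The function name is hypothetical.

```python
from collections import Counter


def mode_rule_matches(rule_value, individual_codes):
    """Return True if the code after % is the most common among its siblings."""
    target = rule_value.lstrip("%")
    family = target[:-1]  # assumption: siblings share all but the last character
    counts = Counter(c for c in individual_codes if c.startswith(family))
    if counts[target] == 0:
        return False
    return counts[target] == max(counts.values())


# Made-up ICD codes of one individual, across all of their events.
codes = ["J450", "J450", "J451", "I10"]
j450_is_mode = mode_rule_matches("%J450", codes)
j451_is_mode = mode_rule_matches("%J451", codes)
```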

### **canc-all**

When an endpoint has multiple cancer rules (from `CANC_TOPO`, `CANC_TOPO_EXCL`, `CANC_MORPH`, `CANC_MORPH_EXCL`, `CANC_BEHAV`) then it is not enough to match only one of them: all cancer rules that are defined must be satisfied by the event.
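
A sketch of the "all defined rules must hold" logic follows. It assumes events are Python dicts, that `rules` only contains the cancer rules defined for the endpoint, and that an exclusion rule counts as satisfied when its regex does not match; these are assumptions for illustration, as are the names and sample codes.

```python
import re

# Column that each cancer rule is matched against, as listed in the event rules.
CANCER_RULE_COLUMN = {
    "CANC_TOPO": "CODE1",
    "CANC_TOPO_EXCL": "CODE1",
    "CANC_MORPH": "CODE2",
    "CANC_MORPH_EXCL": "CODE2",
    "CANC_BEHAV": "CODE3",
}


def cancer_event_matches(event, rules):
    """Return True only if the event satisfies every cancer rule that is defined."""
    if event["SOURCE"] != "CANC":
        return False
    for name, value in rules.items():
        code = event.get(CANCER_RULE_COLUMN[name]) or ""
        if value == "ANY":
            hit = bool(code)
        else:
            hit = re.search("^(?:" + value + ")", code) is not None
        # Assumption: an exclusion rule is satisfied when it does NOT match.
        satisfied = (not hit) if name.endswith("_EXCL") else hit
        if not satisfied:
            return False
    return True


event = {"SOURCE": "CANC", "CODE1": "C509", "CODE2": "8500", "CODE3": "3"}
all_hold = cancer_event_matches(event, {"CANC_TOPO": "C50", "CANC_BEHAV": "3"})
one_fails = cancer_event_matches(event, {"CANC_TOPO": "C50", "CANC_BEHAV": "2"})
```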

### **mark-no-code**

The mark `$!$` is used to state that someone has checked and there is no suitable code for this endpoint in a given registry.

For example, if an endpoint has `HD_ICD_9` with a value of `$!$`, it means someone has gone through the whole Finnish ICD-9 and reported that no code in it is suitable for the endpoint.

## Gotchas

* One single event can span multiple rows in the detailed longitudinal data files: events are unique by (`FINNGENID`, `SOURCE`, `INDEX`), but not by row. Rows with the same values for `FINNGENID`, `SOURCE`, `INDEX` must be looked at as one single event when performing look-ups.
* The ICD-10, ICD-9 and ICD-8 used by FinnGen are specific Finnish versions which differ slightly from the international ones. This means for example that the ICD-10 found in FinnGen data are a bit different from the WHO ICD-10 or the US ICD-10-CM.
* In the FinnGen data, the ICD-O-3 is used for cancer codes.
* The dot `.` and the comma `,` are not present in the codes in the FinnGen files, e.g. `J45.1` would be `J451` in the endpoint definition file and the detailed longitudinal file.
* For rules that are regexes: a dot `.` means "any character" and not an actual dot.
* Endpoints with specific control rules are not documented here (yet!)
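
For the first gotcha, grouping rows into events before any lookup can be sketched as follows, assuming rows are Python dicts keyed by column name. The function name and the sample rows are made up for illustration.

```python
from collections import defaultdict


def group_rows_into_events(rows):
    """Group rows sharing (FINNGENID, SOURCE, INDEX) into single events."""
    events = defaultdict(list)
    for row in rows:
        key = (row["FINNGENID"], row["SOURCE"], row["INDEX"])
        events[key].append(row)
    return dict(events)


rows = [
    {"FINNGENID": "FG1", "SOURCE": "INPAT", "INDEX": "7", "CATEGORY": "0", "CODE1": "I210"},
    {"FINNGENID": "FG1", "SOURCE": "INPAT", "INDEX": "7", "CATEGORY": "1", "CODE1": "E119"},
    {"FINNGENID": "FG1", "SOURCE": "PURCH", "INDEX": "8", "CATEGORY": "", "CODE1": "C10AA05"},
]

events = group_rows_into_events(rows)
```

Here the two `INPAT` rows end up in one event, so a rule can be checked against the main and the side diagnosis of the same hospital visit together.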

## Appendix: list of registries

| Name in FinnGen data (`SOURCE`) | Registry description                     |
| ------------------------------- | ---------------------------------------- |
| `CANC`                          | Cancer                                   |
| `DEATH`                         | Cause of death                           |
| `INPAT`                         | HILMO inpatient                          |
| `OPER_IN`                       | HILMO inpatient (operations)             |
| `OUTPAT`                        | HILMO specialist outpatient              |
| `OPER_OUT`                      | HILMO specialist outpatient (operations) |
| `PRIM_OUT`                      | AvoHILMO: primary care outpatient        |
| `PURCH`                         | Kela drug purchase                       |
| `REIMB`                         | Kela drug reimbursement                  |

## Appendix: coding systems and translations

* [Where to find the translation file for phenotype data](/finngen-data-specifics/finnish-health-registers-and-medical-coding/where-to-find-the-translation-file-for-phenotype-data.md), documentation from the FinnGen Handbook
* [Finnish ICD-10 book](http://urn.fi/URN:NBN:fi-fe201205085423)
* [Finnish ICD-9 book](http://urn.fi/URN:NBN:fi-fe201701261356)
* [Finnish ICD-8 book](http://urn.fi/URN:NBN:fi-fe201710058910)
* [ICD-O-3 book](https://apps.who.int/iris/bitstream/handle/10665/96612/9789241548496_eng.pdf)
* [NOMESCO book](http://norden.diva-portal.org/smash/get/diva2:968721/FULLTEXT01.pdf)
* [ATC codes](https://www.whocc.no/atc_ddd_index/)
* [ICPC2 codes](https://www.who.int/standards/classifications/other-classifications/international-classification-of-primary-care)

#### Glossary <a href="#orga79a9a9" id="orga79a9a9"></a>

* Kela: the Social Insurance Institution of Finland
* HILMO: Finnish care registers for health care

