# Data

## Data location

### Sandbox

* `/finngen/library-red/finngen_R13/kanta_lab_2.0/data/finngen_R13_kanta_lab_2.0.txt.gz`\
  in textual TSV-gzipped format (for use with `awk`, `grep`, UNIX piping)
* `/finngen/library-red/finngen_R13/kanta_lab_2.0/data/finngen_R13_kanta_lab_2.0.parquet`\
  in binary Parquet format (for use with Python pandas, R data.frame)
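
As a minimal sketch of working with the TSV file from Python without loading it all into memory, the snippet below streams the gzipped file and keeps only the rows for one OMOP Concept ID. It assumes the file is tab-separated with a header row that uses the column names listed under Data columns; the function name `iter_rows_for_concept` is purely illustrative.

```python
import csv
import gzip


def iter_rows_for_concept(path, concept_id):
    """Yield rows (as dicts) whose OMOP_CONCEPT_ID equals concept_id.

    Assumption: a gzipped, tab-separated file with a header row.
    """
    with gzip.open(path, mode="rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            # Values are read as strings, so compare as strings.
            if row["OMOP_CONCEPT_ID"] == str(concept_id):
                yield row
```

In the sandbox, `path` would be the `.txt.gz` file listed above. For column-wise analysis the Parquet file is likely the more convenient starting point.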

### BigQuery

Available in this table:\
`finngen-production-library.sandbox_tools_r12.kanta_r12_v1`

## Data columns

N.B. The raw data contains a `MEASUREMENT_FREE_TEXT` column that unfortunately cannot be directly released as it contains data that is potentially sensitive. It contains a mix of numerical measurement values, positive/negative outcomes, outcomes linked to thresholds (e.g. <3ml) and general notes. Our approach has been to extract such data from the original column through a process of cleaning and whitelisting of the field.

### Overview

This table shows the ordered list of columns in the Kanta lab data, with a brief description of each column and whether it is present in the sandbox (SB) and/or ETL versions of the data.

<table><thead><tr><th>Column</th><th>Description</th><th width="57" align="center">SB</th><th width="66" align="center">ETL</th></tr></thead><tbody><tr><td><code>ROW_ID</code></td><td>Identifying number of entry</td><td align="center">✓</td><td align="center"></td></tr><tr><td><code>FINNGENID</code></td><td>Study ID (Pseudonymised ID given to the FinnGen participant)</td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>SEX</code></td><td>Sex of the individual, <code>female</code> or <code>male</code></td><td align="center">✓</td><td align="center"></td></tr><tr><td><code>EVENT_AGE</code></td><td>Age (in years) at time of event, e.g. <code>12.012</code></td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>APPROX_EVENT_DATETIME</code></td><td>Date (randomized) and time (not randomized) of event, e.g. <code>2020-01-02T07:30</code> (<a href="#q-how-reliable-are-the-time-measurements-in-the-data">see details</a>)</td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>OMOP_CONCEPT_ID</code></td><td>OMOP Concept ID mapped from the <code>TEST_ID</code> and <code>MEASUREMENT_UNIT</code></td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>TEST_NAME</code></td><td>Short name of the lab test, e.g. <code>p-alat, s-tsh</code></td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>MEASUREMENT_VALUE_HARMONIZED</code></td><td>Value of the test measurement, after harmonization across the OMOP Concept ID. This column is the basic column of measurement values to be used.</td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>MEASUREMENT_UNIT_HARMONIZED</code></td><td>Corresponding unit for the harmonized measurement value</td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>MEASUREMENT_VALUE_EXTRACTED</code></td><td>Value of the test measurement extracted from the <code>MEASUREMENT_FREE_TEXT</code> column. 
Some labs report values only in the <code>MEASUREMENT_FREE_TEXT</code> column instead of the basic test value column. We extracted these values when the <code>MEASUREMENT_FREE_TEXT</code> column contained nothing but a number. No unit is reported to us for these measurements; the value is assumed to be in the most common harmonized unit. This was verified for the majority of the values, but care should be taken (look at distributions and outliers) when using them.</td><td align="center"></td><td align="center"></td></tr><tr><td><code>MEASUREMENT_VALUE_MERGED</code></td><td>Harmonized and extracted values merged together. This column simply combines the columns <code>MEASUREMENT_VALUE_HARMONIZED</code> and <code>MEASUREMENT_VALUE_EXTRACTED</code>.</td><td align="center"></td><td align="center"></td></tr><tr><td><code>TEST_OUTCOME</code></td><td>Label given for the outcome of the test to indicate how it falls against the reference range (<a href="#test_outcome">see value table</a>)</td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>TEST_OUTCOME_IMPUTED</code></td><td>Imputed test outcome (<a href="#test_outcome_imputed">see value table</a>)</td><td align="center">✓</td><td align="center"></td></tr><tr><td><code>TEST_OUTCOME_TEXT_EXTRACTED</code></td><td><code>[&#x3C;|>]|[VALUE]|[UNIT?]</code> extracted from the <code>MEASUREMENT_FREE_TEXT</code> column</td><td align="center"></td><td align="center"></td></tr><tr><td><code>OUTCOME_POS_EXTRACTED</code></td><td>1 (pos) or 0 (neg) outcome extracted from the <code>MEASUREMENT_FREE_TEXT</code> column</td><td align="center"></td><td align="center"></td></tr><tr><td><code>TEST_ID_IS_NATIONAL</code></td><td>Whether or not the <code>TEST_ID</code> is using the national lab test code system</td><td align="center">✓</td><td align="center"></td></tr><tr><td><code>MEASUREMENT_VALUE</code></td><td>Value of the test measurement</td><td 
align="center">✓</td><td align="center">✓</td></tr><tr><td><code>MEASUREMENT_UNIT</code></td><td>Corresponding unit for the test measurement</td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>MEASUREMENT_STATUS</code></td><td>Code indicating the status of the lab test measurement (<a href="#measurement_status">see value table</a>)</td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>REFERENCE_RANGE_GROUP</code></td><td>Reference range for this event, as text</td><td align="center">✓</td><td align="center"></td></tr><tr><td><code>REFERENCE_RANGE_LOW_VALUE</code></td><td>Value for the low bound of the reference range</td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>REFERENCE_RANGE_LOW_UNIT</code></td><td>Corresponding unit for the low bound of the reference range</td><td align="center">✓</td><td align="center"></td></tr><tr><td><code>REFERENCE_RANGE_HIGH_VALUE</code></td><td>Value for the high bound of the reference range</td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>REFERENCE_RANGE_HIGH_UNIT</code></td><td>Corresponding unit for the high bound of the reference range</td><td align="center">✓</td><td align="center"></td></tr><tr><td><code>CODING_SYSTEM_ORG</code></td><td>Derived from <code>CODING_SYSTEM_OID</code></td><td align="center">✓</td><td align="center"></td></tr><tr><td><code>CODING_SYSTEM_OID</code></td><td>Original name: <code>tutkimuskoodistonjarjestelmaid</code></td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>TEST_ID_SOURCE</code></td><td>Code of the lab test, as it appeared before preprocessing of the data</td><td align="center"></td><td align="center">✓</td></tr><tr><td><code>TEST_NAME_SOURCE</code></td><td>Short name of the lab test, as it appeared before preprocessing of the data</td><td align="center"></td><td align="center">✓</td></tr><tr><td><code>MEASUREMENT_VALUE_SOURCE</code></td><td>Value of the test measurement, as it appeared 
before data cleaning</td><td align="center"></td><td align="center">✓</td></tr><tr><td><code>MEASUREMENT_UNIT_SOURCE</code></td><td>Unit of the test measurement, as it appeared before data cleaning</td><td align="center"></td><td align="center">✓</td></tr></tbody></table>

### Extended columns

In addition to the core file, there's another file containing metadata columns that are either mostly empty or contain information that we haven't quite been able to decipher yet. It also contains some source data (e.g. test name, source value, source unit) that can be used to identify possible bugs in our pipeline. The two files can be merged via the `ROW_ID` column.

<table><thead><tr><th>Column</th><th>Description</th><th width="57" align="center">SB</th><th width="66" align="center">ETL</th></tr></thead><tbody><tr><td><code>ROW_ID</code></td><td>Identifying number of entry</td><td align="center">✓</td><td align="center"></td></tr><tr><td><code>FINNGENID</code></td><td>Study ID (Pseudonymised ID given to the FinnGen participant)</td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>SEX</code></td><td>Sex of the individual, <code>female</code> or <code>male</code></td><td align="center">✓</td><td align="center"></td></tr><tr><td><code>EVENT_AGE</code></td><td>Age (in years) at time of event, e.g. <code>12.012</code></td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>APPROX_EVENT_DATETIME</code></td><td>Date (randomized) and time (not randomized) of event, e.g. <code>2020-01-02T07:30</code> (<a href="#q-how-reliable-are-the-time-measurements-in-the-data">see details</a>)</td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>OMOP_CONCEPT_ID</code></td><td>OMOP Concept ID mapped from the <code>TEST_ID</code> and <code>MEASUREMENT_UNIT</code></td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>TEST_ID</code></td><td>Code of the lab test, as it appeared before preprocessing of the data</td><td align="center"></td><td align="center">✓</td></tr><tr><td><code>TEST_ID_IS_NATIONAL</code></td><td>Whether or not the <code>TEST_ID</code> is using the national lab test code system</td><td align="center">✓</td><td align="center"></td></tr><tr><td><code>TEST_NAME_SOURCE</code></td><td>Short name of the lab test, as it appeared before preprocessing of the data</td><td align="center"></td><td align="center">✓</td></tr><tr><td><code>MEASUREMENT_VALUE_SOURCE</code></td><td>Value of the test measurement, as it appeared before data cleaning</td><td align="center"></td><td align="center">✓</td></tr><tr><td><code>MEASUREMENT_UNIT_SOURCE</code></td><td>Unit of the test 
measurement, as it appeared before data cleaning</td><td align="center"></td><td align="center">✓</td></tr><tr><td><code>MEASUREMENT_STATUS</code></td><td>Code indicating the status of the lab test measurement (<a href="#measurement_status">see value table</a>)</td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>REFERENCE_RANGE_GROUP</code></td><td>Reference range for this event, as text</td><td align="center">✓</td><td align="center"></td></tr><tr><td><code>REFERENCE_RANGE_LOW_VALUE</code></td><td>Value for the low bound of the reference range</td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>REFERENCE_RANGE_LOW_UNIT</code></td><td>Corresponding unit for the low bound of the reference range</td><td align="center">✓</td><td align="center"></td></tr><tr><td><code>REFERENCE_RANGE_HIGH_VALUE</code></td><td>Value for the high bound of the reference range</td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>REFERENCE_RANGE_HIGH_UNIT</code></td><td>Corresponding unit for the high bound of the reference range</td><td align="center">✓</td><td align="center"></td></tr><tr><td><code>CODING_SYSTEM_ORG</code></td><td>Derived from <code>CODING_SYSTEM_OID</code></td><td align="center">✓</td><td align="center"></td></tr><tr><td><code>CODING_SYSTEM_OID</code></td><td>Original name: <code>tutkimuskoodistonjarjestelmaid</code></td><td align="center">✓</td><td align="center">✓</td></tr><tr><td><code>SERVICE_PROIVDER_OID</code></td><td>Probably the id of the place where the lab was taken/processed. Original name <code>antaja_organisaatioid</code></td><td align="center"></td><td align="center"></td></tr></tbody></table>
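
As a rough illustration of merging the two files via `ROW_ID`, the sketch below performs a left join of extended rows onto core rows. It assumes both files have already been read into lists of dicts (for example with `csv.DictReader`); the function name `merge_on_row_id` is hypothetical and not part of any FinnGen tooling.

```python
def merge_on_row_id(core_rows, extended_rows):
    """Left-join extended rows onto core rows using the ROW_ID column."""
    extended_by_id = {row["ROW_ID"]: row for row in extended_rows}
    merged = []
    for row in core_rows:
        extra = extended_by_id.get(row["ROW_ID"], {})
        # Core columns win when a column exists in both files.
        merged.append({**extra, **row})
    return merged
```

This keeps the whole extended file in memory, so it is only practical for a subset of rows. For the full files, a dataframe or database join on `ROW_ID` is probably the better choice.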

### `TEST_OUTCOME`

This column provides a label comparing the measured value against a reference range.

<table><thead><tr><th width="106">Value</th><th>Description</th></tr></thead><tbody><tr><td><code>N</code></td><td>Normal</td></tr><tr><td><code>A</code></td><td>Abnormal</td></tr><tr><td><code>AA</code></td><td>Very abnormal</td></tr><tr><td><code>L</code></td><td>Low</td></tr><tr><td><code>LL</code></td><td>Very low</td></tr><tr><td><code>H</code></td><td>High</td></tr><tr><td><code>HH</code></td><td>Very high</td></tr></tbody></table>

### `TEST_OUTCOME_IMPUTED`

Some rows are missing `TEST_OUTCOME`, so an imputed value is provided. `TEST_OUTCOME_IMPUTED` is derived from the data for the same OMOP Concept ID, provided that at least 100 entries have both a `MEASUREMENT_VALUE` and a `TEST_OUTCOME`. The thresholds are determined as follows.

Entries with both a (harmonized) measurement value and an outcome are sorted by value, and the outcome labels follow the same order. For example:

| Value | OUTCOME |
| ----- | ------- |
| 1     | L       |
| 1     | L       |
| 1.3   | N       |
| ...   | ...     |
| 7     | N       |
| 14    | H       |
| 15    | H       |

Starting from the lower end, we expect to find mostly low (`L`) values and then, gradually, normal (`N`) ones. To find the turnover point where `N`s become the majority, we define a relative measure: the number of `L` entries relative to all entries seen so far. In the ideal scenario this measure starts at 100% and gradually declines as more `N`s (or other labels, such as `A` and `H`) appear. The threshold is set at the point where the measure drops below 95% for the last time. The same is done from the opposite side with `H`. A summary of the thresholds can be found in the [repo](https://github.com/FINNGEN/kanta_lab_preprocessing/blob/master/finngen_qc/data/abnormality_estimation.table.tsv).

In the process we found two kinds of anomalies, mainly due to an asymmetric distribution of labels:

* `+- inf` thresholds. In these cases not enough labels are present at the tails of the distribution. The algorithm initializes the thresholds to these values, and they are never updated because the ratio of labels never climbs above the 95% threshold in the first place. This usually occurs for lab values where there is no such thing as `L`/`H` (e.g. triglycerides) or where the labels used are `A` instead of `H|L`.
* `PROBLEM` column. This boolean column flags the opposite issue: we traverse the whole list of values up to the median with the ratio still above the 95% threshold, so the median value is used as the threshold. This indicates a heavy bias in the distribution of outcome labels, so one should proceed with caution. Values imputed with these thresholds are labelled with a `*`, e.g. `L*` or `H*`.
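
The threshold search described above can be sketched roughly as follows. This is a simplified reading of the text, not the pipeline's actual code (see the repo for that): it assumes the relative measure is the cumulative share of `L` labels among the entries seen so far, that the scan stops at the median, and that the threshold is the last value at which the share is still at or above the cutoff. The function name and signature are illustrative.

```python
import math


def low_threshold(values, outcomes, label="L", cutoff=0.95):
    """Return (threshold, problem) for the low tail of one lab test.

    threshold stays at -inf if the share of `label` never reaches the cutoff.
    problem is True if the share is still >= cutoff at the median.
    """
    pairs = sorted(zip(values, outcomes))
    half = len(pairs) // 2  # assumption: scan only up to the median
    threshold, problem = -math.inf, False
    hits = 0
    for seen, (value, outcome) in enumerate(pairs[:half], start=1):
        hits += outcome == label
        if hits / seen >= cutoff:
            # Last value seen so far where the share is still >= cutoff.
            threshold = value
            problem = seen == half
    return threshold, problem
```

Under the same assumptions, the high tail can be handled by negating the values, calling `low_threshold(negated, outcomes, label="H")`, and negating the returned threshold.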

The column contains the following values:

<table><thead><tr><th width="103">Value</th><th>Description</th></tr></thead><tbody><tr><td><code>N</code></td><td>Imputed Normal</td></tr><tr><td><code>L</code></td><td>Imputed Low</td></tr><tr><td><code>L*</code></td><td>Imputed Low. Less confidence in the imputation due to over-representation of <code>L</code> and <code>H</code> from <code>TEST_OUTCOME</code></td></tr><tr><td><code>H</code></td><td>Imputed High</td></tr><tr><td><code>H*</code></td><td>Imputed High. Less confidence in the imputation due to over-representation of <code>L</code> and <code>H</code> from <code>TEST_OUTCOME</code></td></tr></tbody></table>
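
To illustrate how the labels in this table relate to the thresholds, here is a small sketch that assigns a label to a single value. It is only an assumption that the comparisons are inclusive and that the `*` suffix follows the `PROBLEM` flag of the corresponding threshold; the function and its parameters are hypothetical.

```python
import math


def impute_outcome(value, low, high, low_problem=False, high_problem=False):
    """Label a value as L, N or H given per-concept thresholds.

    Infinite thresholds (-inf / +inf) never trigger for finite values.
    """
    if value <= low:
        return "L*" if low_problem else "L"
    if value >= high:
        return "H*" if high_problem else "H"
    return "N"
```

For example, with thresholds `low=1.0` and `high=14.0`, a value of `0.8` would be labelled `L` and a value of `5` would be labelled `N`.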

### `MEASUREMENT_STATUS`

<table><thead><tr><th width="105">Value</th><th>Description</th></tr></thead><tbody><tr><td><code>C</code></td><td>Corrected result</td></tr><tr><td><code>F</code></td><td>Final result</td></tr><tr><td><code>R</code></td><td>Unverified result</td></tr><tr><td><code>S</code></td><td>Partial result</td></tr></tbody></table>

## Pipeline

The pipeline is available on GitHub (<https://github.com/FINNGEN/kanta_lab_preprocessing/>), along with technical information on how the raw data was processed.

A quick summary:

* duplicate entries are removed (based on ID, date, lab test name/code, measurement status and value)
* text is processed to remove spaces and strange characters
* national test codes are mapped to names based on known mappings
* units are cleaned/uniformized and mapped to OMOP based on lab test
* units are harmonized based on OMOP IDs
* a second deduplication step takes place after harmonization to catch duplicate entries from different systems (based on ID, date, harmonized test name, value and status)
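
The deduplication steps can be pictured with a sketch like the one below, which keeps the first row for each distinct combination of key columns. It is not the pipeline's implementation; the function name is illustrative, and which columns form the key at each stage is an assumption based on the summary above.

```python
def drop_duplicates(rows, key_columns):
    """Keep the first row for each distinct combination of key_columns."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

For the post-harmonization step, the key might look like `["FINNGENID", "APPROX_EVENT_DATETIME", "TEST_NAME", "MEASUREMENT_VALUE_HARMONIZED", "MEASUREMENT_STATUS"]`, using the released column names as stand-ins for the raw ones.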

## Values extraction analysis

A key aspect of the Kanta lab v2 data is the extraction of information from the `MEASUREMENT_FREE_TEXT` column. This section explains how that was done.

### Summary

A quick summary:

* the `MEASUREMENT_FREE_TEXT` column is manipulated to extract shareable information
  * Where the original measurement value is missing and the free text is available, we attempt to extract numerical values from the text if they match certain patterns. If, after some string manipulation, we are left with a pure number, it is cast from string to float and used to populate the `MEASUREMENT_VALUE_EXTRACTED` column
  * The text is scanned for pos/neg substrings and, through a manual mapping, values are mapped to 1 (pos) or 0 (neg) in a new `OUTCOME_POS_EXTRACTED` column
  * The text is scanned for entries that express the outcome as a comparison, structured as follows:

    * comparison (Yli/alle/\</>)
    * numerical value
    * unit (potentially missing)

    These entries are standardized to the format `[<|>]|[VALUE]|[UNIT?]` so they can be shared safely.

* A QC step removes extracted values that are formatted as dates
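
The three extraction steps can be sketched along these lines. The regular expressions, the `pos`/`neg` mapping and the function names are illustrative assumptions, not the pipeline's actual patterns (those live in the repo). The pipe-separated output such as `<|3|ml` is one reading of the `[<|>]|[VALUE]|[UNIT?]` format, and the separate date QC step is not reproduced here.

```python
import re

PURE_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")
COMPARISON = re.compile(r"(yli|alle|<|>)\s*(\d+(?:\.\d+)?)\s*(\S*)", re.IGNORECASE)
SIGNS = {"yli": ">", "alle": "<", "<": "<", ">": ">"}
POS_NEG = {"pos": 1, "neg": 0}  # illustrative subset of a manual mapping


def clean(text):
    """Trim whitespace and use a dot as the decimal separator."""
    return text.strip().replace(",", ".")


def extract_value(text):
    """Return a float if the cleaned text is a pure number, else None."""
    cleaned = clean(text)
    return float(cleaned) if PURE_NUMBER.fullmatch(cleaned) else None


def extract_comparison(text):
    """Return '<|VALUE|UNIT' or '>|VALUE|UNIT', or None if no match."""
    match = COMPARISON.fullmatch(clean(text))
    if match is None:
        return None
    word, value, unit = match.groups()
    return "|".join([SIGNS[word.lower()], value, unit])


def extract_pos_neg(text):
    """Return 1 (pos), 0 (neg) or None based on substring matching."""
    lowered = text.lower()
    for needle, flag in POS_NEG.items():
        if needle in lowered:
            return flag
    return None
```

With these definitions, `extract_comparison("Alle 3 ml")` and `extract_comparison("<3ml")` both give `<|3|ml`, while a date-like string such as `12.01.2020` is not a pure number and so yields `None` from `extract_value`.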

### Extraction Summary

The following table summarizes the free text extraction process.

| OMOP    | N\_EXTRACTED                    | %\_EXTRACTED                             | %\_NA\_MEASUREMENT                                                 | N\_POSNEG                     | %\_EXTRACTED                           | %\_NA\_OUTCOME                                                 | conceptName                       |
| ------- | ------------------------------- | ---------------------------------------- | ------------------------------------------------------------------ | ----------------------------- | -------------------------------------- | -------------------------------------------------------------- | --------------------------------- |
| OMOP ID | N of extracted numerical values | Percentage of numerical values extracted | Percentage of extracted values that had NA in raw data measurement | N of extracted POS/NEG values | Percentage of POS/NEG extracted values | Percentage of extracted values that had NA in raw data outcome | Concept Name                      |
| 3026361 | 2095662                         | 22.6799                                  | 100.0000                                                           | 2                             | 0.0000                                 | 100.0000                                                       | Erythrocytes \[#/volume] in Blood |
| 3018095 | 118284                          | 22.3749                                  | 100.0000                                                           | 67950                         | 12.8536                                | 6.2384                                                         | Leukocytes \[#/volume] in Urine   |

{% file src="<https://3072695768-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MhYL0UTLjqsuIdK0SSO%2Fuploads%2F2j7B3t9VBTKJHR23ZNcv%2Fextraction_summary_names.txt?alt=media&token=0a59f059-78d6-4b20-a86e-4199977c68aa>" %}

## Other reference tables

### Test name abbreviations

Test name abbreviations come from different laboratory testing centers around Finland. Some are standardized nationally and some are used only locally in different hospitals and test centers.

We have put a lot of effort into standardizing these to international OHDSI OMOP Concept IDs (primarily from the [LOINC database](https://loinc.org/downloads/)), so we hope that you will not need to interpret them very often! However, in case you have reason to use them, we provide the meaning of most abbreviations here.

**Prefixes for lab test name abbreviations**

<table data-header-hidden><thead><tr><th width="137">Prefix</th><th>Description</th></tr></thead><tbody><tr><td>aB</td><td>Arterial blood</td></tr><tr><td>Af</td><td>Puncture fluid</td></tr><tr><td>aG</td><td>Alveolar gas</td></tr><tr><td>Am</td><td>Amniotic fluid</td></tr><tr><td>As</td><td>Ascitic fluid </td></tr><tr><td>B</td><td>Blood</td></tr><tr><td>Bf</td><td>Bronchus fluid</td></tr><tr><td>Bi</td><td>Bile</td></tr><tr><td>Bl</td><td>Bronchoalveolar lavation</td></tr><tr><td>Bm</td><td>Bone Marrow</td></tr><tr><td>Bo</td><td>Bone</td></tr><tr><td>Br</td><td>Breast</td></tr><tr><td>Bu</td><td>Bursa</td></tr><tr><td>Ca</td><td>Cannula/IV port</td></tr><tr><td>cB</td><td>Capillary blood</td></tr><tr><td>Cf</td><td>Cervix fluid</td></tr><tr><td>Cn</td><td>Central nervous system</td></tr><tr><td>cU</td><td>Collected urine</td></tr><tr><td>Cv</td><td>Choroid villus</td></tr><tr><td>Di</td><td>Dialysis fluid</td></tr><tr><td>Dj</td><td>Duodenal juice</td></tr><tr><td>dU</td><td>Diurnal urine</td></tr><tr><td>E</td><td>Erythrocyte</td></tr><tr><td>Ex</td><td>Sputum</td></tr><tr><td>F</td><td>Fecal</td></tr><tr><td>fB</td><td>Fasting blood</td></tr><tr><td>Fl</td><td>Vaginal fluor</td></tr><tr><td>fP</td><td>Fasting plasma</td></tr><tr><td>fS</td><td>Fasting serum</td></tr><tr><td>Gi</td><td>Gastrointestinal</td></tr><tr><td>Gj</td><td>Gastric juice</td></tr><tr><td>Hb</td><td>Hemoglobin</td></tr><tr><td>He</td><td>Heart</td></tr><tr><td>Ki</td><td>Kidney</td></tr><tr><td>L</td><td>Leukocytes</td></tr><tr><td>Lf</td><td>Lacrimal fluid</td></tr><tr><td>Li</td><td>Likvor/CSF</td></tr><tr><td>Ln</td><td>Lymph Node</td></tr><tr><td>Lr</td><td>Liver</td></tr><tr><td>Lu</td><td>Lung</td></tr><tr><td>Ly</td><td>Lymphocytes</td></tr><tr><td>M</td><td>Muscle</td></tr><tr><td>mB</td><td>Machine blood</td></tr><tr><td>Me</td><td>Meconium</td></tr><tr><td>Mf</td><td>Mammary fluid</td></tr><tr><td>Mm</td><td>Maternal 
milk</td></tr><tr><td>Mu</td><td>Mucosa</td></tr><tr><td>Ne</td><td>Nerve</td></tr><tr><td>Ns</td><td>Nasal secretion</td></tr><tr><td>nU</td><td>Nocturnal urine</td></tr><tr><td>P</td><td>plasma</td></tr><tr><td>Pd</td><td>Peritoneal dialysis</td></tr><tr><td>Pf</td><td>Pleura</td></tr><tr><td>Pi</td><td>Pituitary gland</td></tr><tr><td>Pl</td><td>Placenta</td></tr><tr><td>Pp</td><td>Periodontal pocket</td></tr><tr><td>Ps</td><td>Pharyngeal secretion</td></tr><tr><td>Pt</td><td>Patient</td></tr><tr><td>Pu</td><td>Pus</td></tr><tr><td>S</td><td>Serum</td></tr><tr><td>Sa</td><td>Saliva</td></tr><tr><td>Se</td><td>Secretion</td></tr><tr><td>Sk</td><td>Skin</td></tr><tr><td>Sp</td><td>Semen</td></tr><tr><td>Sw</td><td>Sweat</td></tr><tr><td>Sy</td><td>Syncytial fluid</td></tr><tr><td>T</td><td>Thrombocyte</td></tr><tr><td>Ts</td><td>Tissue</td></tr><tr><td>Tu</td><td>Tumor</td></tr><tr><td>U</td><td>Urine</td></tr><tr><td>uA</td><td>Umbilical arterial blood</td></tr><tr><td>Ug</td><td>Urogenital</td></tr><tr><td>uS</td><td>Umbilical serum</td></tr><tr><td>uV</td><td>Umbilical venous blood</td></tr><tr><td>vB</td><td>Venous blood</td></tr><tr><td>W</td><td>Water</td></tr></tbody></table>

**Suffixes for lab test name abbreviations**

<table data-header-hidden><thead><tr><th width="138"></th><th></th></tr></thead><tbody><tr><td>-Ab</td><td>Antibody</td></tr><tr><td>-AbA</td><td>IgA antibody</td></tr><tr><td>-AbE</td><td>IgE antibody</td></tr><tr><td>-AbG</td><td>IgG antibody</td></tr><tr><td>-AbM</td><td>IgM antibody</td></tr><tr><td>-Ag</td><td>Antigen</td></tr><tr><td>-Akt</td><td>Activity</td></tr><tr><td>-Aktt</td><td>Activation products</td></tr><tr><td>-Cl</td><td>Clearance</td></tr><tr><td>-Ct</td><td>Control</td></tr><tr><td>-D</td><td>DNA</td></tr><tr><td>-Di</td><td>Dialysis</td></tr><tr><td>-EVi</td><td>Special culture</td></tr><tr><td>-EM</td><td>Electron Microscopic</td></tr><tr><td>-F</td><td>Fetal</td></tr><tr><td>-Fc</td><td>Flow cytometry</td></tr><tr><td>-Fr</td><td>Fraction</td></tr><tr><td>-Gr</td><td>Gestational</td></tr><tr><td>-IF</td><td>Immunofluorescence</td></tr><tr><td>-IH</td><td>Immunohistochemistry</td></tr><tr><td>-Ind</td><td>Index</td></tr><tr><td>-Ion</td><td>Ionized</td></tr><tr><td>-Is</td><td>Iso enzymes</td></tr><tr><td>-ISH</td><td>in situ -hybridisation</td></tr><tr><td>-Jtk</td><td>Follow-up study</td></tr><tr><td>-Jvi</td><td><p>Follow-up culture</p><p>  (jatkoviljely)</p></td></tr><tr><td>-Kj</td><td>Conjugate</td></tr><tr><td>-Lm</td><td>Species specificity</td></tr><tr><td>-MS</td><td>Mass spectrometry</td></tr><tr><td>-Nh</td><td>Nucleic acid</td></tr><tr><td>-O</td><td>Qualitative</td></tr><tr><td>-Oc</td><td>Oligoclonal</td></tr><tr><td>-Pa</td><td>Long term</td></tr><tr><td>-Pse</td><td>Screening and categorization</td></tr><tr><td>-PT</td><td>Rapid test</td></tr><tr><td>-R</td><td>Exercise stress test</td></tr><tr><td>-S</td><td>Stimulation</td></tr><tr><td>-Sc</td><td>Sub classes</td></tr><tr><td>-Ty</td><td>Typing</td></tr><tr><td>-V</td><td>Free or unconjugated</td></tr><tr><td>-Vi</td><td>Microbiology culture (e.g. 
u-Baktvi = bacterial culture from urine, ps-stravi = Strep A culture in pharyngeal secretion, F-sienVi = fungal culture in stool)</td></tr><tr><td>-Vit</td><td>Vitamin</td></tr><tr><td>-Vr</td><td>Staining</td></tr><tr><td>-Vt</td><td>Point of care (vieritesti), often a rapid test</td></tr></tbody></table>
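
As a small illustration of how the prefix table can be applied, the sketch below splits a test name such as `p-alat` at the first hyphen and looks up the specimen prefix. The `PREFIXES` dictionary holds only a handful of entries from the table, the lookup is case-insensitive because names such as `p-alat` appear in lower case, and the function name is hypothetical. Suffixes are not handled.

```python
# Small, illustrative subset of the prefix table, keyed in lower case.
PREFIXES = {
    "b": "Blood",
    "p": "Plasma",
    "s": "Serum",
    "u": "Urine",
    "fp": "Fasting plasma",
    "fs": "Fasting serum",
}


def split_test_name(name):
    """Split 'p-alat' into (specimen description or None, analyte)."""
    prefix, sep, analyte = name.partition("-")
    if not sep:
        # No hyphen: treat the whole name as the analyte.
        return None, name
    return PREFIXES.get(prefix.lower()), analyte
```

Note that the full table distinguishes some prefixes partly by letter case, so a case-insensitive lookup should be checked for collisions before being extended to all prefixes.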

### Reference range terms

Test reference ranges are free text strings that often contain Finnish. Below is a list of translations of the most common words seen in reference ranges:

**General terms:**

* AIKUISET: Adults
* ALLE: Under/Below
* ALK: Abbreviation for "alkaen", meaning "starting from" or "beginning at"
* ALTISTUMATTOMAT: Unexposed (individuals)
* AAMUNÄYTE: Morning sample
* EDELLEEN: Still, continuing
* FERTIILI-IKÄ: Fertile age
* HOITOALUE: Treatment range
* JA: And
* JÄÄNNÖSPIT: Residual concentration
* KAIKKI: All, everyone
* KATSO: See, look at
* KK: Abbreviation for "kuukausi", meaning month
* KS: Abbreviation for "katso", meaning "see" or "look at"
* KTS: Another abbreviation for "katso"
* KYMENLAAKSONLAB: Kymenlaakso Laboratory (a specific lab in Finland)
* LAPSET: Children
* LEUK: Leukocytes (white blood cells)
* LIER: Likely referring to "lieriöt", meaning casts (in urine analysis)
* MIEHET: Men
* NAISET: Women
* NEGAT: Negative
* NORMAALI: Normal
* OHJEKIRJA: Manual, guidebook
* PAASTO: Fasting
* POJAT: Boys
* POSTMENOPAUSSI: Postmenopausal
* PREMENOPAUSAALISET: Premenopausal
* PUBERT: Puberty
* RASKAUS: Pregnancy
* SUOSITELTAVA: Recommended
* TAVOITE: Target, goal
* TAVOITEARVO: Target value
* TERAP: Therapeutic
* TOKSINEN: Toxic
* TULKINTA: Interpretation
* TUPAKOIMATTOMAT: Non-smokers
* TYTÖT: Girls
* V: Abbreviation for "vuosi", meaning year
* VASTASYNT: Newborn
* VIITEARVO: Reference value
* VKO: Abbreviation for "viikko", meaning week
* VRK: Abbreviation for "vuorokausi", meaning day (24-hour period)
* YLI: Over, above

**Age-related terms:**

* 0-6PV: 0-6 days
* 1KK-1V: 1 month to 1 year
* 1V-: 1 year and older
* 2-4V: 2-4 years
* 5-10V: 5-10 years
* 11-15V: 11-15 years
* 16V-: 16 years and older
* 18V-: 18 years and older
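
Age terms of this shape can be turned into structured bounds with a sketch like the following. The pattern only covers the forms listed above (`PV`, `VKO`, `KK` and `V` as units, with an optional upper bound), and the function name and return format are illustrative assumptions.

```python
import re

UNITS = {"PV": "days", "VKO": "weeks", "KK": "months", "V": "years"}
AGE = r"(\d+)(PV|VKO|KK|V)?"
AGE_RANGE = re.compile(rf"{AGE}-(?:{AGE})?")


def parse_age_range(term):
    """Parse e.g. '1KK-1V' into ((1, 'months'), (1, 'years')).

    An open-ended range such as '16V-' has None as its upper bound.
    Returns None if the term does not match the expected shape.
    """
    match = AGE_RANGE.fullmatch(term.strip().upper())
    if match is None:
        return None
    low, low_unit, high, high_unit = match.groups()
    # A missing lower unit is inherited from the upper bound, e.g. '2-4V'.
    low_unit = low_unit or high_unit
    if low_unit is None or (high and high_unit is None):
        return None
    lower = (int(low), UNITS[low_unit])
    upper = (int(high), UNITS[high_unit]) if high else None
    return lower, upper
```

Real reference range strings are free text, so anything that does not match simply returns `None` and would need separate handling.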

**Medical terms:**

* ERYT: Erythrocytes (red blood cells)
* EPIT.SOLUT: Epithelial cells
* FOLLIKK.VAIHE: Follicular phase (of menstrual cycle)
* MAKUU: Lying down (usually referring to blood pressure measurement)
* MENARKEA: Menarche (first menstrual period)
* PYSTY: Standing (usually referring to blood pressure measurement)
