# Data

## Data location

### Sandbox

* `/finngen/library-red/finngen_R14/kanta_lab_1.0/data/finngen_R14_kanta_lab_1.0.txt.gz`\
  in textual TSV-gzipped format (for use with `awk`, `grep`, UNIX piping)
* `/finngen/library-red/finngen_R14/kanta_lab_1.0/data/finngen_R14_kanta_lab_1.0.parquet`\
  in binary Parquet format (for use with Python pandas, R data.frame)
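
As a minimal sketch of working with the gzipped TSV from Python without loading the whole file into memory, the standard library's `gzip` and `csv` modules can stream it row by row. The helper name `iter_rows` and the choice of `OMOP_CONCEPT_ID` as the filter column are illustrative assumptions; the file is assumed to start with a header line and to use tab-separated, unquoted fields.

```python
import csv
import gzip

def iter_rows(path, omop_concept_id=None):
    """Yield rows of a gzipped TSV as dicts, optionally filtered by OMOP_CONCEPT_ID."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        # QUOTE_NONE: treat quote characters as ordinary text (assumption about the file).
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            # Values are read as strings, so compare against the string form of the ID.
            if omop_concept_id is None or row["OMOP_CONCEPT_ID"] == str(omop_concept_id):
                yield row
```

Calling `iter_rows(path)` with the sandbox path listed above yields one dictionary per measurement. For column-oriented analysis the Parquet file is usually the more convenient starting point.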

### BigQuery

Available in this table:\
`` `TO BE ADDED` ``

## Data columns

N.B. The raw data contains a `MEASUREMENT_FREE_TEXT` column that cannot be released directly because it contains potentially sensitive data. It holds a mix of numerical measurement values, positive/negative outcomes, outcomes linked to thresholds (e.g. <3ml) and general notes. Our approach has been to extract such data from the original column through a process of cleaning and whitelisting. There are two files in the data folder:\
\- the regular `kanta_lab_[version].txt.gz|parquet`\
\- the `kanta_lab_[version]_extended_columns.txt.gz|parquet` file\
\
The former contains a streamlined version of the data for analysis, with all essential columns. The extended columns file additionally includes columns with information about the source data (for debugging/backtracking purposes) and columns that are mostly empty.\
\
These are the columns present only in the extended columns file. `ROW_ID` can be used to connect the two files as it's a shared index.

```
TEST_ID_IS_NATIONAL
TEST_NAME_SOURCE
MEASUREMENT_VALUE_SOURCE
MEASUREMENT_UNIT_SOURCE
MEASUREMENT_STATUS
REFERENCE_RANGE_GROUP
REFERENCE_RANGE_[LOWER|UPPER]_[VALUE|UNIT]
CODING_SYSTEM_ORG
CODING_SYSTEM_OID
```
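
As a sketch of how the shared `ROW_ID` index could be used, the function below attaches selected extended columns to rows from the regular file. The function name `join_on_row_id` is hypothetical, rows are assumed to be dictionaries keyed by column name, and the extended rows are held in memory, so this only suits subsets that have already been filtered down.

```python
def join_on_row_id(main_rows, extended_rows, extra_columns):
    """Attach selected extended-file columns to main-file rows via ROW_ID."""
    # Index the extended rows once so each lookup is a dictionary access.
    extended_by_id = {row["ROW_ID"]: row for row in extended_rows}
    for row in main_rows:
        extra = extended_by_id.get(row["ROW_ID"], {})
        merged = dict(row)
        for column in extra_columns:
            # Missing rows or columns are filled with None rather than raising.
            merged[column] = extra.get(column)
        yield merged
```

For the full files, a dataframe merge on `ROW_ID` over the Parquet versions is likely more practical than this row-wise approach.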

### Overview

This table shows the ordered list of columns in the Kanta lab data, with brief descriptions of their meaning and whether each column is present in the sandbox (SB) and/or ETL Kanta lab data (ETL status to be added at a later stage).

<table><thead><tr><th width="307.9111328125">Column</th><th width="301.433349609375">Description</th><th width="57" align="center">SB</th><th width="66" align="center">ETL</th></tr></thead><tbody><tr><td><code>ROW_ID</code></td><td>Identifying number of entry</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>FINNGENID</code></td><td>Study ID (Pseudonymised ID given to the FinnGen participant)</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>SEX</code></td><td>Sex of the individual, <code>female</code> or <code>male</code></td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>EVENT_AGE</code></td><td>Age (in years) at time of event, e.g. <code>12.012</code></td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>APPROX_EVENT_DATETIME</code></td><td>Date (randomized) and time (not randomized) of event, e.g. <code>2020-01-02T07:30</code> (<a href="#q-how-reliable-are-the-time-measurements-in-the-data">see details</a>)</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>OMOP_CONCEPT_ID</code></td><td>OMOP Concept ID mapped from the <code>TEST_ID</code> and <code>MEASUREMENT_UNIT</code></td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>TEST_NAME</code></td><td>Short name of the lab test, e.g. <code>p-alat, s-tsh</code></td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>MEASUREMENT_VALUE_HARMONIZED</code></td><td>Value of the test measurement, after harmonization across the OMOP Concept ID. 
This is the primary column of measurement values to use.</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>MEASUREMENT_UNIT_HARMONIZED</code></td><td>Corresponding unit for the harmonized measurement value</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>MEASUREMENT_VALUE_EXTRACTED</code></td><td>Value of the test measurement extracted from the <code>MEASUREMENT_FREE_TEXT</code> column. Some labs were observed to report values only in the <code>MEASUREMENT_FREE_TEXT</code> column instead of the basic test value column. We extracted these values when the <code>MEASUREMENT_FREE_TEXT</code> column contained nothing but a numerical value. No unit is reported to us for these measurements; they are assumed to be in the most common harmonized unit. This was verified for the majority of the values, but care should be taken (look at distributions and outliers) when using them.</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>MEASUREMENT_VALUE_MERGED</code></td><td>Harmonized and extracted values merged together. 
This column simply combines columns <code>MEASUREMENT_VALUE_HARMONIZED</code> and <code>MEASUREMENT_VALUE_EXTRACTED</code><br></td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>TEST_OUTCOME</code></td><td>Label given for the outcome of the test to indicate how it falls against the reference range (<a href="#test_outcome">see value table</a>)</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>TEST_OUTCOME_IMPUTED</code></td><td>Imputed test outcome (<a href="#test_outcome_imputed">see value table</a>)</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>TEST_OUTCOME_TEXT_EXTRACTED</code></td><td><code>[&#x3C;|>]|[VALUE]|[UNIT?]</code> extracted from the <code>MEASUREMENT_FREE_TEXT</code> column</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>OUTCOME_POS_EXTRACTED</code></td><td>1(pos) or 0 (neg) outcome extracted from the <code>MEASUREMENT_FREE_TEXT</code> column</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>TEST_ID_IS_NATIONAL</code></td><td>Whether or not the <code>TEST_ID</code> is using the national lab test code system</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>MEASUREMENT_VALUE</code></td><td>Value of the test measurement</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>MEASUREMENT_UNIT</code></td><td>Corresponding unit for the test measurement</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>MEASUREMENT_STATUS</code></td><td>Code indicating the status of the lab test measurement (<a href="#measurement_status">see value table</a>)</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>REFERENCE_RANGE_GROUP</code></td><td>Reference range for this event, as text</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>REFERENCE_RANGE_LOW_VALUE</code></td><td>Value for the low bound of the reference range</td><td 
align="center">✓</td><td align="center">NA</td></tr><tr><td><code>REFERENCE_RANGE_LOW_UNIT</code></td><td>Corresponding unit for the low bound of the reference range</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>REFERENCE_RANGE_HIGH_VALUE</code></td><td>Value for the high bound of the reference range</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>REFERENCE_RANGE_HIGH_UNIT</code></td><td>Corresponding unit for the high bound of the reference range</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>CODING_SYSTEM_ORG</code></td><td>Derived from <code>CODING_SYSTEM_OID</code></td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>CODING_SYSTEM_OID</code></td><td>Original name: <code>tutkimuskoodistonjarjestelmaid</code></td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>TEST_ID_SOURCE</code></td><td>Code of the lab test, as it appeared before preprocessing of the data</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>TEST_NAME_SOURCE</code></td><td>Short name of the lab test, as it appeared before preprocessing of the data</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>MEASUREMENT_VALUE_SOURCE</code></td><td>Value of the test measurement, as it appeared before data cleaning</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>MEASUREMENT_UNIT_SOURCE</code></td><td>Unit of the test measurement, as it appeared before data cleaning</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>QC_PASS</code></td><td>Information about QC (2 unchecked, 1 pass,0 fail)</td><td align="center">✓</td><td align="center">NA</td></tr><tr><td><code>QC_NOTES</code></td><td>Explanation of QC choices and/or manipulation of data</td><td align="center">✓</td><td align="center">NA</td></tr></tbody></table>

### `QC`

* 1 -> PASS
* 0 -> FAIL
* 2 -> UNCHECKED

For all lab tests with at least 1,000 individuals with numerical measurement and GWAS covariates available (n = 433), manual outlier detection was performed. The aim is not to flag values that are high or low relative to reference values, but only clear errors in the data: wrongly reported source units, biologically impossible values, etc. No measurements have been removed from the data; instead, we have added a QC\_PASS column to indicate QC status, accompanied by QC\_NOTES giving further information about the reason for a QC failure (IMPOSSIBLE\_VALUE, WRONG\_UNIT, etc.). This filter has been used in the core lab GWAS. In total, 93 of the 433 labs checked have a QC threshold. The list of all QC thresholds can be found in this [table](https://github.com/FINNGEN/kanta_lab_preprocessing/blob/master/core/data/omop_qc.tsv).

In addition, when the positivity (OUTCOME\_POS\_EXTRACTED) and TEST\_OUTCOME columns do not agree (TEST\_OUTCOME is N (='normal') but positivity is 1 (='positive'), or TEST\_OUTCOME is A (='abnormal') but positivity is 0 (='negative')), the QC\_PASS column is set to 0 and QC\_NOTES is set to `OUTCOME_EXTRACT_CONFLICT`. These values have not been removed from the core lab GWAS.
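
The two rules above can be sketched as small predicates. The function names are hypothetical, rows are assumed to be dictionaries keyed by column name, and `QC_PASS` is assumed to be stored as an integer or an integer-like string; whether to keep unchecked rows (2) is left as a choice for the analyst.

```python
def has_outcome_conflict(test_outcome, outcome_pos_extracted):
    """True when TEST_OUTCOME and OUTCOME_POS_EXTRACTED disagree as described above."""
    normal_but_positive = test_outcome == "N" and outcome_pos_extracted == 1
    abnormal_but_negative = test_outcome == "A" and outcome_pos_extracted == 0
    return normal_but_positive or abnormal_but_negative

def keep_for_analysis(row, allow_unchecked=True):
    """Keep rows that passed QC (1) and, optionally, rows that were never checked (2)."""
    accepted = {"1", "2"} if allow_unchecked else {"1"}
    # str() lets the same check work for integer and string representations.
    return str(row["QC_PASS"]) in accepted
```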

### `TEST_OUTCOME`

This column provides a label comparing the measured value against a reference range.

<table><thead><tr><th width="106">Value</th><th>Description</th></tr></thead><tbody><tr><td><code>N</code></td><td>Normal</td></tr><tr><td><code>A</code></td><td>Abnormal</td></tr><tr><td><code>AA</code></td><td>Very abnormal</td></tr><tr><td><code>L</code></td><td>Low</td></tr><tr><td><code>LL</code></td><td>Very low</td></tr><tr><td><code>H</code></td><td>High</td></tr><tr><td><code>HH</code></td><td>Very high</td></tr></tbody></table>

### `TEST_OUTCOME_IMPUTED`

Some rows are missing the `TEST_OUTCOME`, so an imputed one is provided. `TEST_OUTCOME_IMPUTED` is derived from the data for the same OMOP Concept ID, provided that at least 100 entries have both a `MEASUREMENT_VALUE` and a `TEST_OUTCOME`. The process for determining the thresholds is as follows.\
\
Values with both measurement (harmonized) and outcome are sorted by value, with the outcome labels sorted following the same order. E.g.

| Value | OUTCOME |
| ----- | ------- |
| 1     | L       |
| 1     | L       |
| 1.3   | N       |
| ...   | ...     |
| 7     | N       |
| 14    | H       |
| 15    | H       |

Starting from the lower end, we expect to find mostly low (L) values and then gradually find normal (N) ones. To find the turnover point where Ns become the majority, we define a relative measure: the number of L entries divided by the number of all entries seen so far. In ideal scenarios, this value starts at 100% and gradually declines as more Ns (or other entries, like A and H) start to appear. When the relative measure drops under 95% for the last time, we define the threshold there. The same is done from the opposite side with H. The summary of the thresholds used can be found in the [repo](https://github.com/FINNGEN/kanta_lab_preprocessing/blob/master/finngen_qc/data/abnormality_estimation.table.tsv).\
\
In the process we found two kinds of anomalies, mainly due to an asymmetric distribution of labels:

* `+- inf` thresholds. In these cases not enough labels are present at the tails of the distribution. In the algorithm the starting thresholds are defined as such, but they never get updated as the ratio of labels never climbs above the 95% threshold to begin with. This is usually associated with lab values where there is no such thing as L/H (e.g. triglycerides) or where the labels used are `A` instead of `H|L`
* `PROBLEM` column. This boolean column indicates when the opposite issue appears: we traverse the whole list of values up to the median while still being above the 95% threshold, and the median value is therefore used as the threshold. This indicates a heavy bias in the distribution of outcome labels, so one should proceed with caution. Values imputed with these thresholds are labelled with a `*`, e.g. `L*` or `H*`
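
The lower-threshold search can be sketched as follows. This is one reading of the prose above, not the pipeline's actual code: the function name `lower_threshold` is hypothetical, the relative measure is taken to be the running fraction of `L` labels among the entries seen so far, and the scan is assumed to stop at the median.

```python
import math

def lower_threshold(pairs, min_fraction=0.95, label="L"):
    """Return (threshold, problem) for an iterable of (value, outcome) pairs."""
    ordered = sorted(pairs, key=lambda pair: pair[0])
    median_index = len(ordered) // 2  # scan from the low end up to the median
    threshold = -math.inf             # stays -inf if the fraction never reaches 95%
    hits = 0
    fraction = 0.0
    for seen, (value, outcome) in enumerate(ordered[:median_index + 1], start=1):
        if outcome == label:
            hits += 1
        fraction = hits / seen
        if fraction >= min_fraction:
            # Remember the last value at which the label still dominates.
            threshold = value
    # Still above the cut-off at the median: the median became the threshold.
    problem = bool(ordered) and fraction >= min_fraction
    return threshold, problem
```

With the example table above (extended by a few `N` rows in place of the `...`), the fraction is 100% for the two `L` rows at value 1 and falls to about 67% at 1.3, so the threshold is 1 and `problem` is false. The upper threshold would mirror this by scanning from the high end with `H`.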

The content of the column is as follows:

<table><thead><tr><th width="103">Value</th><th>Description</th></tr></thead><tbody><tr><td><code>N</code></td><td>Imputed Normal</td></tr><tr><td><code>L</code></td><td>Imputed Low</td></tr><tr><td><code>L*</code></td><td>Imputed Low. Less confidence in the imputation due to over-representation of <code>L</code> and <code>H</code> from <code>TEST_OUTCOME</code></td></tr><tr><td><code>H</code></td><td>Imputed High</td></tr><tr><td><code>H*</code></td><td>Imputed High. Less confidence in the imputation due to over-representation of <code>L</code> and <code>H</code> from <code>TEST_OUTCOME</code></td></tr></tbody></table>

### `MEASUREMENT_STATUS`

<table><thead><tr><th width="105">Value</th><th>Description</th></tr></thead><tbody><tr><td><code>C</code></td><td>Corrected result</td></tr><tr><td><code>F</code></td><td>Final result</td></tr><tr><td><code>R</code></td><td>Unverified result</td></tr><tr><td><code>S</code></td><td>Partial result</td></tr></tbody></table>

## Pipeline

The pipeline is available on GitHub (<https://github.com/FINNGEN/kanta_lab_preprocessing/>), where technical information on how the raw data was processed can be found.

A quick summary:

* Duplicate entries are removed (based on ID, date, lab test name/code, and measurement status and value)
* Text is processed to remove spaces and strange characters
* National test codes are mapped to names based on known mappings
* Units are cleaned/uniformized and mapped to OMOP based on lab test
* Units are harmonized based on OMOP IDs
* Another duplicate removal step takes place after harmonization to catch duplicate entries from different systems (checking ID, date, harmonized test name, value and status)
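
The duplicate removal steps can be illustrated with a key-based filter. This is only a sketch: the function name `drop_duplicates` is hypothetical, and the key columns in the usage note use the released column names as stand-ins for whatever fields the pipeline compares on the raw data.

```python
def drop_duplicates(rows, key_columns):
    """Yield the first row seen for each combination of key column values."""
    seen = set()
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        if key not in seen:
            seen.add(key)
            yield row
```

For the post-harmonization step, a key such as `("FINNGENID", "APPROX_EVENT_DATETIME", "TEST_NAME", "MEASUREMENT_VALUE_HARMONIZED", "MEASUREMENT_STATUS")` would correspond to the fields listed in the last bullet.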

## Values extraction analysis

A key aspect of the v2 kanta data has been the extraction of information from the `MEASUREMENT_FREE_TEXT` column. Here we want to explain how this took place.

### Summary

The pipeline is available on GitHub (<https://github.com/FINNGEN/kanta_lab_preprocessing/>), where technical information on how the raw data was processed can be found.\
\
Here's a quick summary:

* Harmonization:
  * basic string manipulation to remove spaces and make everything lower case
  * national test IDs are mapped to strings via THL tables
  * `CODING_SYSTEM_OID` is updated to `CODING_SYSTEM_ORG` when a mapping is available
  * Unit manipulation/injection to correct data where needed
  * OMOP mapping and numerical harmonization
* Core
  * Extraction of data from `MEASUREMENT_FREE_TEXT`
    * Where the original measurement value is missing and the free text is available, we attempt to extract numerical values from it if they match certain patterns. If, after some string manipulation, we are left with a pure number, it is cast from string to float and used to populate the `MEASUREMENT_VALUE_EXTRACTED` column
    * The text is scanned for pos/neg substrings and through a manual mapping, values are mapped to 1 (pos) or 0 (neg) in a new `OUTCOME_POS_EXTRACTED` column
    * The text is scanned for substrings containing the symbol `+` (e.g. `3+` , `++` etc.) and through a manual mapping that includes OMOP ID, values are mapped to 1 into `OUTCOME_POS_EXTRACTED` column. The text is copied over to `TEST_OUTCOME_TEXT_EXTRACTED`
    * The text is scanned to look for entries that indicate outcome as a comparison and are structured as such. These entries are manipulated in order to be standardized following the format `[<|>]|[VALUE]|[UNIT?]` so they can be shared safely in the `TEST_OUTCOME_TEXT_EXTRACTED` column
    * QCing takes place to remove extracted values that are formatted as dates
  * QC:
    * Using a lookup table, merged values are flagged in `QC_PASS` as
      * 1 -> PASS
      * 0 -> FAIL
      * 2 -> UNCHECKED
    * Post-harmonization fix that applies a threshold-based conversion when a mix of units exists (mostly in the extracted data) and can be reliably separated by a threshold. E.g. hematocrit has both `[0,100]` and `[0,1]` measurements, and it is trivial to draw the line at 1 to harmonize the data.
    * Extraction blacklist. Sometimes the previous approach is impossible because the two unit systems overlap significantly. In this case we remove the extracted data from the merged column altogether.
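
The extraction steps listed above can be sketched with a few small functions. These are illustrations under stated assumptions, not the pipeline's implementation: all function names and regular expressions are hypothetical, the pos/neg check is a crude stand-in for the manual mapping, the comparison is returned as a tuple rather than in any particular serialized form, and the rescaling direction (fractions converted to percentages) is assumed for the hematocrit example.

```python
import re

_NUMBER = re.compile(r"[+-]?\d+(?:[.,]\d+)?")
_COMPARISON = re.compile(r"([<>])\s*(\d+(?:[.,]\d+)?)\s*(\S.*)?")

def _normalise(text):
    # Mirror the basic string manipulation step: trim and lower-case.
    return text.strip().lower()

def extract_number(text):
    """Return a float only when the whole text is a single number, else None."""
    cleaned = _normalise(text)
    if _NUMBER.fullmatch(cleaned):
        return float(cleaned.replace(",", "."))
    return None

def extract_positivity(text):
    """Return 1 for pos, 0 for neg, None otherwise (stand-in for the manual mapping)."""
    cleaned = _normalise(text)
    if "neg" in cleaned:
        return 0
    if "pos" in cleaned:
        return 1
    return None

def extract_comparison(text):
    """Split entries such as '<3ml' into (operator, value, unit); unit may be None."""
    match = _COMPARISON.fullmatch(_normalise(text))
    if match is None:
        return None
    operator, value, unit = match.groups()
    return operator, float(value.replace(",", ".")), unit

def rescale_fractions(value, threshold=1.0, factor=100.0):
    """Threshold-based conversion: values at or below the threshold are rescaled."""
    return value * factor if value <= threshold else value
```

For instance, `extract_comparison("<3ml")` gives `("<", 3.0, "ml")`, while `extract_number("2020-01-02")` gives `None` because the text is not a single number. Date-like strings that do parse as numbers would still need the separate QC step mentioned above.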

## Other reference tables

### Test name abbreviations

Test name abbreviations come from different laboratory testing centers around Finland. Some are standardized nationally and some are used only locally in different hospitals and test centers.

We have put a lot of effort into standardizing these to international OHDSI OMOP Concept IDs (primarily from the [LOINC database](https://loinc.org/downloads/)), so we hope that you do not need to interpret them very often! However, in case you have reason to use them, we provide the meaning of most abbreviations here.

**Prefixes for lab test name abbreviations**

<table data-header-hidden><thead><tr><th width="137">Prefix</th><th>Description</th></tr></thead><tbody><tr><td>aB</td><td>Arterial blood</td></tr><tr><td>Af</td><td>Puncture fluid</td></tr><tr><td>aG</td><td>Alveolar gas</td></tr><tr><td>Am</td><td>Amniotic fluid</td></tr><tr><td>As</td><td>Ascitic fluid</td></tr><tr><td>B</td><td>Blood</td></tr><tr><td>Bf</td><td>Bronchus fluid</td></tr><tr><td>Bi</td><td>Bile</td></tr><tr><td>Bl</td><td>Bronchoalveolar lavation</td></tr><tr><td>Bm</td><td>Bone Marrow</td></tr><tr><td>Bo</td><td>Bone</td></tr><tr><td>Br</td><td>Breast</td></tr><tr><td>Bu</td><td>Bursa</td></tr><tr><td>Ca</td><td>Cannula/IV port</td></tr><tr><td>cB</td><td>Capillary blood</td></tr><tr><td>Cf</td><td>Cervix fluid</td></tr><tr><td>Cn</td><td>Central nervous system</td></tr><tr><td>cU</td><td>Collected urine</td></tr><tr><td>Cv</td><td>Choroid villus</td></tr><tr><td>Di</td><td>Dialysis fluid</td></tr><tr><td>Dj</td><td>Duodenal juice</td></tr><tr><td>dU</td><td>Diurnal urine</td></tr><tr><td>E</td><td>Erythrocyte</td></tr><tr><td>Ex</td><td>Sputum</td></tr><tr><td>F</td><td>Fecal</td></tr><tr><td>fB</td><td>Fasting blood</td></tr><tr><td>Fl</td><td>Vaginal fluor</td></tr><tr><td>fP</td><td>Fasting plasma</td></tr><tr><td>fS</td><td>Fasting serum</td></tr><tr><td>Gi</td><td>Gastrointestinal</td></tr><tr><td>Gj</td><td>Gastric juice</td></tr><tr><td>Hb</td><td>Hemoglobin</td></tr><tr><td>He</td><td>Heart</td></tr><tr><td>Ki</td><td>Kidney</td></tr><tr><td>L</td><td>Leukocytes</td></tr><tr><td>Lf</td><td>Lacrimal fluid</td></tr><tr><td>Li</td><td>Likvor/CSF</td></tr><tr><td>Ln</td><td>Lymph Node</td></tr><tr><td>Lr</td><td>Liver</td></tr><tr><td>Lu</td><td>Lung</td></tr><tr><td>Ly</td><td>Lymphocytes</td></tr><tr><td>M</td><td>Muscle</td></tr><tr><td>mB</td><td>Machine blood</td></tr><tr><td>Me</td><td>Meconium</td></tr><tr><td>Mf</td><td>Mammary fluid</td></tr><tr><td>Mm</td><td>Maternal 
milk</td></tr><tr><td>Mu</td><td>Mucosa</td></tr><tr><td>Ne</td><td>Nerve</td></tr><tr><td>Ns</td><td>Nasal secretion</td></tr><tr><td>nU</td><td>Nocturnal urine</td></tr><tr><td>P</td><td>plasma</td></tr><tr><td>Pd</td><td>Peritoneal dialysis</td></tr><tr><td>Pf</td><td>Pleura</td></tr><tr><td>Pi</td><td>Pituitary gland</td></tr><tr><td>Pl</td><td>Placenta</td></tr><tr><td>Pp</td><td>Periodontal pocket</td></tr><tr><td>Ps</td><td>Pharyngeal secretion</td></tr><tr><td>Pt</td><td>Patient</td></tr><tr><td>Pu</td><td>Pus</td></tr><tr><td>S</td><td>Serum</td></tr><tr><td>Sa</td><td>Saliva</td></tr><tr><td>Se</td><td>Secretion</td></tr><tr><td>Sk</td><td>Skin</td></tr><tr><td>Sp</td><td>Semen</td></tr><tr><td>Sw</td><td>Sweat</td></tr><tr><td>Sy</td><td>Syncytial fluid</td></tr><tr><td>T</td><td>Thrombocyte</td></tr><tr><td>Ts</td><td>Tissue</td></tr><tr><td>Tu</td><td>Tumor</td></tr><tr><td>U</td><td>Urine</td></tr><tr><td>uA</td><td>Umbilical arterial blood</td></tr><tr><td>Ug</td><td>Urogenital</td></tr><tr><td>uS</td><td>Umbilical serum</td></tr><tr><td>uV</td><td>Umbilical venous blood</td></tr><tr><td>vB</td><td>Venous blood</td></tr><tr><td>W</td><td>Water</td></tr></tbody></table>

**Suffixes for lab test name abbreviations**

<table data-header-hidden><thead><tr><th width="138"></th><th></th></tr></thead><tbody><tr><td>-Ab</td><td>Antibody</td></tr><tr><td>-AbA</td><td>IgA antibody</td></tr><tr><td>-AbE</td><td>IgE antibody</td></tr><tr><td>-AbG</td><td>IgG antibody</td></tr><tr><td>-AbM</td><td>IgM antibody</td></tr><tr><td>-Ag</td><td>Antigen</td></tr><tr><td>-Akt</td><td>Activity</td></tr><tr><td>-Aktt</td><td>Activation products</td></tr><tr><td>-Cl</td><td>Clearance</td></tr><tr><td>-Ct</td><td>Control</td></tr><tr><td>-D</td><td>DNA</td></tr><tr><td>-Di</td><td>Dialysis</td></tr><tr><td>-EVi</td><td>Special culture</td></tr><tr><td>-EM</td><td>Electron Microscopic</td></tr><tr><td>-F</td><td>Fetal</td></tr><tr><td>-Fc</td><td>Flow cytometry</td></tr><tr><td>-Fr</td><td>Fraction</td></tr><tr><td>-Gr</td><td>Gestational</td></tr><tr><td>-IF</td><td>Immunofluorescence</td></tr><tr><td>-IH</td><td>Immunohistochemistry</td></tr><tr><td>-Ind</td><td>Index</td></tr><tr><td>-Ion</td><td>Ionized</td></tr><tr><td>-Is</td><td>Iso enzymes</td></tr><tr><td>-ISH</td><td>in situ -hybridisation</td></tr><tr><td>-Jtk</td><td>Follow-up study</td></tr><tr><td>-Jvi</td><td><p>Follow-up culture</p><p>(jatkoviljely)</p></td></tr><tr><td>-Kj</td><td>Conjugate</td></tr><tr><td>-Lm</td><td>Species specificity</td></tr><tr><td>-MS</td><td>Mass spectrometry</td></tr><tr><td>-Nh</td><td>Nucleic acid</td></tr><tr><td>-O</td><td>Qualitative</td></tr><tr><td>-Oc</td><td>Oligoclonal</td></tr><tr><td>-Pa</td><td>Long term</td></tr><tr><td>-Pse</td><td>Screening and categorization</td></tr><tr><td>-PT</td><td>Rapid test</td></tr><tr><td>-R</td><td>Exercise stress test</td></tr><tr><td>-S</td><td>Stimulation</td></tr><tr><td>-Sc</td><td>Sub classes</td></tr><tr><td>-Ty</td><td>Typing</td></tr><tr><td>-V</td><td>Free or unconjugated</td></tr><tr><td>-Vi</td><td>Microbiology culture (e.g. 
u-Baktvi = bacterial culture from urine, ps-stravi = Strep A culture in pharyngeal secretion, F-sienVi = fungal culture in stool)</td></tr><tr><td>-Vit</td><td>Vitamin</td></tr><tr><td>-Vr</td><td>Staining</td></tr><tr><td>-Vt</td><td>Point of care (vieritesti), often a rapid test</td></tr></tbody></table>
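
As a small illustration of how the prefix table might be used, the sketch below looks up the sample type of a test name such as `p-alat`. The dictionary holds only a handful of entries from the table, the function name `describe_prefix` is hypothetical, and lower-casing the prefix is an assumption based on the lower-case `TEST_NAME` examples shown earlier.

```python
# A few prefixes from the table above, keyed in lower case.
SAMPLE_PREFIXES = {
    "b": "Blood",
    "fp": "Fasting plasma",
    "fs": "Fasting serum",
    "p": "Plasma",
    "s": "Serum",
    "u": "Urine",
}

def describe_prefix(test_name):
    """Return the sample type for the part before the first '-', or None if unknown."""
    prefix, separator, _ = test_name.partition("-")
    if not separator:
        return None
    return SAMPLE_PREFIXES.get(prefix.lower())
```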

### Reference range terms

Test reference ranges are free-text strings that often contain Finnish. Below is a list of translations of the most common words seen in reference ranges:

**General terms:**

* AIKUISET: Adults
* ALLE: Under/Below
* ALK: Abbreviation for "alkaen", meaning "starting from" or "beginning at"
* ALTISTUMATTOMAT: Unexposed (individuals)
* AAMUNÄYTE: Morning sample
* EDELLEEN: Still, continuing
* FERTIILI-IKÄ: Fertile age
* HOITOALUE: Treatment range
* JA: And
* JÄÄNNÖSPIT: Residual concentration
* KAIKKI: All, everyone
* KATSO: See, look at
* KK: Abbreviation for "kuukausi", meaning month
* KS: Abbreviation for "katso", meaning "see" or "look at"
* KTS: Another abbreviation for "katso"
* KYMENLAAKSONLAB: Kymenlaakso Laboratory (a specific lab in Finland)
* LAPSET: Children
* LEUK: Leukocytes (white blood cells)
* LIER: Likely referring to "lieriöt", meaning casts (in urine analysis)
* MIEHET: Men
* NAISET: Women
* NEGAT: Negative
* NORMAALI: Normal
* OHJEKIRJA: Manual, guidebook
* PAASTO: Fasting
* POJAT: Boys
* POSTMENOPAUSSI: Postmenopausal
* PREMENOPAUSAALISET: Premenopausal
* PUBERT: Puberty
* RASKAUS: Pregnancy
* SUOSITELTAVA: Recommended
* TAVOITE: Target, goal
* TAVOITEARVO: Target value
* TERAP: Therapeutic
* TOKSINEN: Toxic
* TULKINTA: Interpretation
* TUPAKOIMATTOMAT: Non-smokers
* TYTÖT: Girls
* V: Abbreviation for "vuosi", meaning year
* VASTASYNT: Newborn
* VIITEARVO: Reference value
* VKO: Abbreviation for "viikko", meaning week
* VRK: Abbreviation for "vuorokausi", meaning day (24-hour period)
* YLI: Over, above

**Age-related terms:**

* 0-6PV: 0-6 days
* 1KK-1V: 1 month to 1 year
* 1V-: 1 year and older
* 2-4V: 2-4 years
* 5-10V: 5-10 years
* 11-15V: 11-15 years
* 16V-: 16 years and older
* 18V-: 18 years and older

**Medical terms:**

* ERYT: Erythrocytes (red blood cells)
* EPIT.SOLUT: Epithelial cells
* FOLLIKK.VAIHE: Follicular phase (of menstrual cycle)
* MAKUU: Lying down (usually referring to blood pressure measurement)
* MENARKEA: Menarche (first menstrual period)
* PYSTY: Standing (usually referring to blood pressure measurement)

