# Expansion Area 5 proteomics data

The EA5 pilot was done in collaboration with the Blood Service Biobank, and the samples were collected from blood donors who were heterozygous or homozygous carriers of specific genetic variants of interest. The list of those variants of interest was suggested and approved by the Steering Committee at the beginning of the study and can be found [**here**](https://tt.eduuni.fi/sites/hy-finngen/All/_layouts/15/WopiFrame.aspx?sourcedoc=%7b0192BBDE-A48A-4E31-91B1-FCE8ACD240C4%7d\&file=R7_EA5_NEA3_FINAL_SCORE_09_2021.xlsx\&action=default).

The data were generated as part of the FinnGen 2 Expansion Area 5 (EA5) pilot. For the analysis, we used the Olink Explore 3072 and SomaScan 7K platforms.

## Olink

**Olink batches 1, 2, and 3**

* Total number of samples: 1,990. This set contains the data and analysis of Olink batches 1, 2, and 3:
  * Batch 1: Rare variant carriers (healthy blood donors)
  * Batch 2.1: Rare variant carriers (healthy blood donors)
  * Batch 2.2: Twins (selected from Twingen)
  * Batch 3: Rare variant carriers (GeneRisk cohort)
* **Raw data:**
  * Batch 1: `finngen/library-red/EA5/proteomics/olink/first_batch/original_data/`
  * Batch 2.1: `finngen/library-red/EA5/proteomics/olink/second_batch/original_data/`
  * Batch 2.2: `finngen/library-red/EA5/proteomics/olink/second_batch/original_data/`
  * Batch 3: `finngen/library-red/EA5/proteomics/olink/third_batch/original_data/`
* **QCed data:** `finngen/library-red/EA5/proteomics/olink/third_batch/QCd/proteomics_QC_all.txt`
* **pQTL, finemapping, and auto-reporting results:** `gs://finngen-production-library-green/omics/proteomics/release_2023_10_11/`

**Sample timestamps:** If you need a sample timestamp (i.e., when the sample was collected), use the files below to look up your samples.

* **Healthy blood donors:**
  * `/finngen/library-red/EA5/omics_metadata/20250204_EA5_Plasma_Metadata_All.csv`
  * `/finngen/library-red/EA5/omics_metadata/20250204_EA5_Plasma_Metadata_Readme.md`
* **Twins:**
  * `/finngen/library-red/EA5/proteomics/olink/second_batch/finngen_twins_EA5_metadata_1.0.txt`
  * `/finngen/library-red/EA5/proteomics/olink/second_batch/finngen_twins_EA5_metadata_readme_1.0.txt`
* **GeneRisk:**
  * Exact timestamps are not available for this cohort, unlike for the other samples. Instead, use BL\_AGE (age at DNA sample) from the R12 minimum phenotype data, because the plasma was collected at the same time. See the handbook description of [minimum and minimum longitudinal data](https://docs.finngen.fi/finngen-data-specifics/red-library-data-individual-level-data/what-phenotype-files-are-available-in-sandbox-1/minimum-and-minimum-longitudinal-data).
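The lookup described above can be sketched in Python as follows. This is a minimal illustration, not the official workflow: the helper `screen_rows` is hypothetical, the ID column name `FINNGENID`, the comma delimiter, and the demo IDs and ages are assumptions, and only `BL_AGE` comes from the text above. Check the readme files for the actual column names and delimiters before using it.

```python
import csv
import io


def screen_rows(handle, id_column, wanted_ids, delimiter=","):
    """Yield rows (as dicts) whose id_column value is in wanted_ids.

    `handle` is any open text file; `id_column` and `delimiter` are
    assumptions that must be checked against the file's readme.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    if id_column not in (reader.fieldnames or []):
        raise KeyError(
            f"column {id_column!r} not found; available: {reader.fieldnames}"
        )
    wanted = set(wanted_ids)
    for row in reader:
        if row[id_column] in wanted:
            yield row


# Tiny in-memory stand-in for a metadata or minimum phenotype file.
# The IDs and values here are made up for illustration only.
demo = io.StringIO(
    "FINNGENID,BL_AGE\n"
    "FG0001,41.2\n"
    "FG0002,63.8\n"
    "FG0003,55.0\n"
)

matches = list(screen_rows(demo, "FINNGENID", ["FG0001", "FG0003"]))
bl_age_by_id = {row["FINNGENID"]: float(row["BL_AGE"]) for row in matches}
print(bl_age_by_id)
```

For a real file, replace the in-memory demo with `open(path, newline="")` on one of the paths listed above, and pass the delimiter that the file actually uses.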

**Olink batches 1 and 2**

* Total number of samples: 1,534. This set contains the data and analysis from batches 1 (n=813) and 2 (n=721).
* After QC and ID mapping, the number of individuals in the analysis is 1,243.
* Raw proteomics data of batch 2 and combined batch 1+2:
  * `finngen/library-red/EA5/proteomics/olink/second_batch/`
* pQTL and finemapping results:
  * `library-green/omics/proteomics/release_2023_08_08/data/Olink/Finemap/`
  * `library-green/omics/proteomics/release_2023_08_08/data/Olink/pQTL/`
* Colocalization results:
  * `gs://finngen-production-library-green/finngen_R11/finngen_R11_analysis_data/colocalization/data/fg_r11_Olink_batch1_and_2.txt.gz`
* An explanation of the result columns can be found in [Olink data](/finngen-data-specifics/green-library-data-aggregate-data/core-analysis-results-files/proteomics-results.md)
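Before working with a gzip-compressed results file such as the colocalization output, it can help to read the header and a few rows and compare them with the documented columns. The sketch below is only an illustration: the helper `peek_gz_table` is hypothetical, the tab delimiter is an assumption, and the demo file and its column names are made up. It also assumes the file is readable from a local path rather than directly from a `gs://` URL.

```python
import csv
import gzip
import itertools
import os
import tempfile


def peek_gz_table(path, n_rows=5, delimiter="\t"):
    """Return (header, first n_rows rows) of a gzip-compressed text table.

    The tab delimiter is an assumption; adjust it to the actual file.
    """
    with gzip.open(path, mode="rt", newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, [])
        rows = list(itertools.islice(reader, n_rows))
    return header, rows


# Demo on a small temporary file with made-up column names and values.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_results.txt.gz")
    with gzip.open(demo_path, mode="wt", newline="") as out:
        out.write("col_a\tcol_b\n1\t2\n3\t4\n")
    header, rows = peek_gz_table(demo_path, n_rows=1)

print(header)
print(rows)
```

Reading only the first rows keeps memory use small, which matters because results files of this kind can be large.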

## SomaScan

* A total of 1,000 individuals were analyzed with SomaScan. [Metabolomics](/finngen-data-specifics/red-library-data-individual-level-data/omics-data/metabolomics.md) data were also generated for these same individuals.
* Raw data:
  * `library-red/EA5/proteomics/soma/second_batch/QCd/SOMA_batch_1_2_QC_all_v2.txt`
* pQTL and finemapping results:
  * `library-green/omics/proteomics/release_2023_08_08/data/Somascan/Finemap/`
  * `library-green/omics/proteomics/release_2023_08_08/data/Somascan/pQTL/`
* Colocalization results:
  * `library-green/finngen_R11/finngen_R11_analysis_data/colocalization/data/fg_r11_Soma_v2.txt.gz`
* Readme:
  * `gs://finngen-production-library-green/omics/proteomics/release_2023_08_08/readme.txt`

An explanation of the result file columns can be found in [SomaScan data](/finngen-data-specifics/green-library-data-aggregate-data/core-analysis-results-files/proteomics-results.md).

