FinnGen Data Freezes and Releases

During the active sample collection phase (FinnGen 1 and 2), a Data Freeze (DF) took place twice a year, in February and August. At each freeze, FinnGen produced a Data Release (R) of updated genotype and phenotype data files to the Sandbox.

During FinnGen 3, the number of samples no longer increases, but the health register data are updated. FinnGen 3 data releases are scheduled for the end of February 2025 (R13), February 2026 (R14), and February 2027 (R15).

Note on data availability after release:

When new genotype and phenotype files are first released, please be aware of the following timeline:

  • Month 1: The analysis team generates covariate files needed for GWAS. You will not be able to run GWAS until these are ready.

  • Months 1-3: The register and phenotype teams update the remaining register data (those not included in the detailed longitudinal data or endpoints) and convert them to OMOP format.

  • Month 3: The analysis team releases a set of so-called core analysis results to the Google Cloud storage green bucket (gs://finngen-production-library-green). These analyses include, but are not limited to, GWAS summary statistics, fine-mapping, colocalization, and autoreporting results.

Bottom line: It takes approximately 2-3 months after the initial release before all data and analysis tools are fully updated.
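The timeline above can be turned into rough calendar estimates. The following is a minimal sketch only, not an official FinnGen tool: the function names and milestone labels are hypothetical, and it assumes that "Month N" means roughly N calendar months after the initial release, with actual availability always confirmed by the release announcements.

```python
from datetime import date

# Offsets in months after the initial release, read from the timeline above.
# Assumption: "Month N" is treated as "about N months after release".
MILESTONES = {
    "covariate files for GWAS ready": 1,
    "other register data updated and converted to OMOP": 3,
    "core analysis results in the green bucket": 3,
}


def add_months(start: date, months: int) -> date:
    """Return the first day of the month that lies `months` after `start`."""
    index = start.year * 12 + (start.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def expected_availability(release: date) -> dict[str, date]:
    """Map each milestone label (hypothetical names) to an estimated month."""
    return {name: add_months(release, offset) for name, offset in MILESTONES.items()}


if __name__ == "__main__":
    # R13 is scheduled for the end of February 2025.
    for name, when in expected_availability(date(2025, 2, 28)).items():
        print(f"{name}: around {when:%B %Y}")
```

The estimates are rounded to whole months on purpose, since the timeline itself is only given at month resolution.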

All data and core analysis result releases are announced via the finngen-accounts mailing list and at the FinnGen Science and Users' meetings. You can also find the releases in the Handbook Release notes.

Data Freeze/Release schedule

| Release  | Total sample size | Total samples in core GWAS analyses | Total endpoints [1] | Core endpoints [2] | Date released to partners | Results released publicly |
|----------|-------------------|-------------------------------------|---------------------|--------------------|---------------------------|---------------------------|
| DF1/R1   | 52,295            | -                                   | -                   | -                  | -                         | -                         |
| DF2/R2   | 102,739           | 96,499                              | 1,485               | 1,485              | Q4 2018                   | Q1 2020                   |
| DF3/R3   | 146,630           | 135,638                             | 2,737               | 1,801              | Q2 2019                   | Q2 2020                   |
| DF4/R4   | 183,694           | 176,899                             | 3,452               | 2,444              | Q4 2019                   | Q4 2020                   |
| DF5/R5   | 224,737           | 218,792                             | 3,858               | 2,803              | Q2 2020                   | Q2 2021                   |
| DF6/R6   | 271,341           | 260,405                             | 3,995               | 2,861              | Q3 2020                   | Q3 2021                   |
| DF7/R7   | 321,464           | 309,154                             | 4,149               | 3,095              | Q1 2021                   | Q1 2022                   |
| DF8/R8   | 356,213           | 342,499                             | 4,431               | 2,202              | Q3 2021                   | Q3 2022                   |
| DF9/R9   | 392,649           | 377,277                             | 4,526               | 2,272              | Q1 2022                   | Q2 2023                   |
| DF10/R10 | 430,897           | 412,181                             | 4,519               | 2,408              | Q3 2022                   | Q4 2023                   |
| DF11/R11 | 473,681           | 453,733                             | 4,415               | 2,444              | Q1 2023                   | Q2 2024                   |
| DF12/R12 | 520,210           | 500,348                             | 4,421               | 2,469              | Q1 2024                   | Q2 2024                   |
| DF13/R13 | 519,972           | 500,186                             | 4,662               | 2,466              | Q1 2025                   | ~Q2 2026                  |
| DF14/R14 | TBD               | TBD                                 | TBD                 | TBD                | Q1 2026                   | ~Q2 2027                  |
| DF15/R15 | TBD               | TBD                                 | TBD                 | TBD                | Q1 2027                   | ~Q2 2028                  |

[1] Total endpoint definitions.
[2] Endpoints used for core GWAS and PheWAS.

Number of individuals with genotypes and phenotypes in FinnGen Data Releases. Starting from R5, the phenotype data have been filtered to genotyped individuals. Endpoint data include individuals who have baseline data. Detailed longitudinal data include only those individuals who have register data available. Some individuals in the phenotype data have been removed in QC steps. Data files can be found in the Sandbox at /finngen/library-red/
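As a worked example of reading the schedule table, the sketch below computes what fraction of each release's samples entered the core GWAS analyses. The sample counts are copied from the table for three releases; the names `RELEASES` and `core_gwas_share` are hypothetical and exist only for this illustration.

```python
# (total sample size, total samples in core GWAS analyses), copied from the
# Data Freeze/Release schedule table. Only a few releases are shown.
RELEASES = {
    "R2": (102_739, 96_499),
    "R12": (520_210, 500_348),
    "R13": (519_972, 500_186),
}


def core_gwas_share(total: int, in_core_gwas: int) -> float:
    """Fraction of released samples that were included in core GWAS analyses."""
    return in_core_gwas / total


if __name__ == "__main__":
    for release, (total, in_core_gwas) in RELEASES.items():
        share = core_gwas_share(total, in_core_gwas)
        print(f"{release}: {share:.1%} of {total:,} samples in core GWAS")
```

The gap between the two columns reflects samples not included in the core GWAS analyses, which is in line with the note that some individuals are removed in QC steps.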

The following data files are released to the Sandbox for each Data Release

Here is the expected schedule for the next data freeze file releases.
