FinnGen Data Freezes and Releases

During the active sample collection phase (2017-2023), a Data Freeze (DF) took place twice a year, in February and August. At each freeze, FinnGen produced a Data Release (R) of updated genotype and phenotype data files to the Sandbox.

During FinnGen 3 (Aug 2023 onwards) the sample size is fixed and Data Releases focus on updating health register data. Three releases are planned: R13 (February 2025), R14 (February 2026), and R15 (February 2027).

The following data files are released to the Sandbox with each Data Freeze/Release.

All data and core analysis result releases are announced via the finngen-accounts mailing list and at the FinnGen Science and Users' meetings. You can also find the releases in the Handbook Release notes.

Note on data availability after release:

When new genotype and phenotype files are first released, please be aware of the following timeline:

  • Month 1: The analysis team generates covariate files needed for GWAS. You will not be able to run GWAS until these are ready.

  • Months 1-3: The register and phenotype teams update all other register data (those not included in the detailed longitudinal data and endpoints) and release them to the Sandbox. Most of these are also converted to OMOP format and made available to the various analysis tools in the Sandbox.

  • Month 3: The analysis team releases a set of so-called core analysis results to the Google Cloud storage green bucket (gs://finngen-production-library-green). These analyses include, but are not limited to, GWAS summary statistics, fine-mapping, colocalization, and autoreporting results.

Bottom line: It takes approximately 2-3 months after the initial release before all data and analysis tools are fully updated.
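The timeline above can be turned into rough calendar estimates. The following is a minimal sketch, not an official FinnGen tool: the function names and dictionary keys are hypothetical, and it assumes that "Month N" means N calendar months after the month of the initial release.

```python
from datetime import date


def add_months(month_start: date, months: int) -> date:
    """Return the first day of the month `months` calendar months later."""
    index = month_start.year * 12 + (month_start.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def release_milestones(release_month: date) -> dict[str, date]:
    """Approximate month by which each post-release step is expected.

    Assumption: "Month N" is read as N calendar months after the release
    month. For the "Months 1-3" step the upper bound (3) is used.
    """
    return {
        "covariates_ready_for_gwas": add_months(release_month, 1),
        "other_register_data_updated": add_months(release_month, 3),
        "core_analysis_results_in_green_bucket": add_months(release_month, 3),
    }


if __name__ == "__main__":
    # Example input: R13, planned for February 2025.
    for step, month in release_milestones(date(2025, 2, 1)).items():
        print(f"{step}: {month:%B %Y}")
```

These are estimates only. The release announcements on the finngen-accounts mailing list remain the authoritative source for when each step is complete.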

FinnGen Data Freezes/Releases

Release    Total sample size  Total samples in core GWAS analyses  Total endpoints [1]  Core endpoints [2]  Date released to partners  Results released publicly
DF1/R1     52,295             -                                    -                    -                   -                          -
DF2/R2     102,739            96,499                               1,485                1,485               Q4 2018                    Q1 2020
DF3/R3     146,630            135,638                              2,737                1,801               Q2 2019                    Q2 2020
DF4/R4     183,694            176,899                              3,452                2,444               Q4 2019                    Q4 2020
DF5/R5     224,737            218,792                              3,858                2,803               Q2 2020                    Q2 2021
DF6/R6     271,341            260,405                              3,995                2,861               Q3 2020                    Q3 2021
DF7/R7     321,464            309,154                              4,149                3,095               Q1 2021                    Q1 2022
DF8/R8     356,213            342,499                              4,431                2,202               Q3 2021                    Q3 2022
DF9/R9     392,649            377,277                              4,526                2,272               Q1 2022                    Q2 2023
DF10/R10   430,897            412,181                              4,519                2,408               Q3 2022                    Q4 2023
DF11/R11   473,681            453,733                              4,415                2,444               Q1 2023                    Q2 2024
DF12/R12   520,210            500,348                              4,421                2,469               Q1 2024                    Q2 2024
DF13/R13   519,972            500,186                              4,662                2,466               Q1 2025                    ~Q2 2026
DF14/R14   TBD                TBD                                  TBD                  TBD                 Q1 2026                    ~Q2 2027
DF15/R15   TBD                TBD                                  TBD                  TBD                 Q1 2027                    ~Q2 2028

[1] Total endpoint definitions.
[2] Endpoints used for core GWAS and PheWAS.

Number of individuals with genotypes and phenotypes in FinnGen Data Releases. Starting from R5, the phenotype data have been filtered to include only genotyped individuals.
