FinnGen Data Freezes and Releases
During the active sample collection phase in FinnGen 1 and 2, a Data Freeze (DF) took place twice a year, in February and August. At each freeze, FinnGen produced a Data Release (R) of updated genotype and phenotype data files to the Sandbox.
During FinnGen 3, the number of samples no longer increases; only the health register data is updated. FinnGen 3 data releases are scheduled for the end of February 2025 (R13), February 2026 (R14), and February 2027 (R15).
Note on data availability after release:
When new genotype and phenotype files are first released, please be aware of the following timeline:
Month 1: The analysis team generates covariate files needed for GWAS. You will not be able to run GWAS until these are ready.
Months 1-3: The register and phenotype teams update all other register data (not included in the detailed longitudinal data and endpoints) and convert these to OMOP format.
Month 3: The analysis team releases a set of so-called core analysis results to the Google Cloud storage green bucket (gs://finngen-production-library-green). These analyses include, but are not limited to, GWAS summary statistics, fine-mapping, colocalization, and autoreporting results.
Bottom line: It takes approximately 2-3 months after the initial release before all data and analysis tools are fully updated.
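The timeline above can be turned into rough calendar dates. The following is a minimal sketch only: it assumes each milestone falls a whole number of calendar months after the initial release (using the upper bound where a range is given), and the names `MILESTONES`, `add_months`, and `expected_availability` are hypothetical helpers, not part of any FinnGen tool.

```python
from datetime import date

# Assumption: months after the initial release by which each item is ready,
# read from the timeline above (upper bound where a range is given).
MILESTONES = {
    "covariate files for GWAS": 1,
    "other register data and OMOP conversion": 3,
    "core analysis results in green bucket": 3,
}


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day of month is clamped to the last day of the target month.
    """
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    # Length of the target month: first day of the next month minus first day of this one.
    if month == 12:
        next_first = date(year + 1, 1, 1)
    else:
        next_first = date(year, month + 1, 1)
    last_day = (next_first - date(year, month, 1)).days
    return date(year, month, min(start.day, last_day))


def expected_availability(release: date) -> dict[str, date]:
    """Map each milestone name to its approximate availability date."""
    return {name: add_months(release, m) for name, m in MILESTONES.items()}


if __name__ == "__main__":
    # Illustrative input: a release at the end of February 2025 (R13).
    for name, due in expected_availability(date(2025, 2, 28)).items():
        print(f"{name}: by about {due.isoformat()}")
```

Treat the resulting dates as estimates; the actual availability is confirmed through the release announcements.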
All data and core analysis result releases are announced via the finngen-accounts mailing list and at the FinnGen Science and Users' meetings. You can also find the releases in the Handbook Release notes.
Data Freeze/Release schedule
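| Data Freeze/Release | Individuals with genotypes | Individuals with phenotypes | Endpoints [1] | Endpoints [2] | Data release | Public release |
| --- | --- | --- | --- | --- | --- | --- |
| DF1/R1 | 52,295 | - | - | - | - | - |
| DF2/R2 | 102,739 | 96,499 | 1,485 | 1,485 | Q4 2018 | Q1 2020 |
| DF3/R3 | 146,630 | 135,638 | 2,737 | 1,801 | Q2 2019 | Q2 2020 |
| DF4/R4 | 183,694 | 176,899 | 3,452 | 2,444 | Q4 2019 | Q4 2020 |
| DF5/R5 | 224,737 | 218,792 | 3,858 | 2,803 | Q2 2020 | Q2 2021 |
| DF6/R6 | 271,341 | 260,405 | 3,995 | 2,861 | Q3 2020 | Q3 2021 |
| DF7/R7 | 321,464 | 309,154 | 4,149 | 3,095 | Q1 2021 | Q1 2022 |
| DF8/R8 | 356,213 | 342,499 | 4,431 | 2,202 | Q3 2021 | Q3 2022 |
| DF9/R9 | 392,649 | 377,277 | 4,526 | 2,272 | Q1 2022 | Q2 2023 |
| DF10/R10 | 430,897 | 412,181 | 4,519 | 2,408 | Q3 2022 | Q4 2023 |
| DF11/R11 | 473,681 | 453,733 | 4,415 | 2,444 | Q1 2023 | Q2 2024 |
| DF12/R12 | 520,210 | 500,348 | 4,421 | 2,469 | Q1 2024 | Q2 2024 |
| DF13/R13 | 519,972 | 500,186 | 4,662 | 2,466 | Q1 2025 | ~Q2 2026 |
| DF14/R14 | TBD | TBD | TBD | TBD | Q1 2026 | ~Q2 2027 |
| DF15/R15 | TBD | TBD | TBD | TBD | Q1 2027 | ~Q2 2028 |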
[1] Total endpoint definitions.
[2] Endpoints used for core GWAS and PheWAS.
Number of individuals with genotypes and phenotypes in FinnGen Data Releases. Starting from R5, the phenotype data has been filtered to genotyped individuals. Endpoint data includes individuals who have baseline data. Detailed longitudinal data includes only those individuals who have register data available. Some individuals in the phenotype data have been removed in QC steps. Data files can be found in the Sandbox under /finngen/library-red/
The following data files are released to the Sandbox with each Data Release
Other registry data files (periodically updated and released to the Sandbox)
Core Analysis Results files (released to FinnGen Production Library Green with each Data Release)
Here is the expected schedule for the next data freeze file releases.