FinnGen Handbook
Data Releases 2022


Last updated 8 months ago


15 December 2022

  • FinnGen EA3 AMD data (version 2.0)

  • Version 2.0 of EA3 age-related macular degeneration (AMD) baseline and longitudinal files contains updated data from Tampere and Eastern Finland biobanks.

  • The EA3 data usage is restricted to the . See .

  • EA3 data is to be used only as stated in the respective EA3 study plan and needs an accepted analysis proposal, as in all FinnGen studies.

  • The data and documentation location in the Sandbox Red Library:

    • /finngen/library-red/EA3_AMD_2.0/

7 December 2022

  • FinnGen R10 coding variant association results

  • The data location in the Green Library:

    • gs://finngen-production-library-green/finngen_R10/finngen_R10_analysis_data/coding/

7 December 2022

  • Updated EA3_AMD_1.0 data

  • These updated EA3 age-related macular degeneration (AMD) files contain baseline information, plus visual acuity, retinal thickness and anti-VEGF injection information in longitudinal format.

  • The EA3 data usage is restricted to the . See .

  • EA3 data is to be used only as stated in the respective EA3 study plan and needs an accepted analysis proposal, as in all FinnGen studies.

  • The data and documentation location in the Sandbox Red Library:

    • /finngen/library-red/EA3_AMD_1.0/EA3_AMD_baseline_readme_1.0.txt

    • /finngen/library-red/EA3_AMD_1.0/EA3_AMD_long_readme_1.0.txt

    • /finngen/library-red/EA3_AMD_1.0/data/EA3_AMD_baseline_1.0.txt

    • /finngen/library-red/EA3_AMD_1.0/data/EA3_AMD_long_1.0.txt

1 December 2022

  • FinnGen R10 Regenie output files after step 1 (so-called nulls)

  • These files allow the conditional analysis to be run in Sandbox without rerunning Regenie step 1 from scratch.

  • The data and documentation location in Sandbox red library:

    • /finngen/library-red/finngen_R10/regenie_nulls_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt
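Many R10 entries point to the catalog file for detailed paths. As a minimal sketch, assuming only that catalog.txt is a line-based text file (its actual column layout is not described here), a case-insensitive keyword search might look like this. The function name `find_catalog_lines` is hypothetical.

```python
# Path taken from the release note; only readable inside the Sandbox.
CATALOG = "/finngen/library-red/finngen_R10/catalog/catalog.txt"


def find_catalog_lines(keyword, catalog_path=CATALOG):
    """Return every line of the catalog that contains `keyword`.

    Assumption: the catalog is plain text with one record per line.
    Matching is case-insensitive and purely substring-based.
    """
    needle = keyword.lower()
    with open(catalog_path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle if needle in line.lower()]
```

For example, `find_catalog_lines("regenie_nulls")` would list the catalog lines that mention the Regenie null files, if any are recorded there.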

24 November 2022

  • FinnGen R10 annotated bgen files

  • The data and documentation location in Sandbox red library:

    • /finngen/library-red/finngen_R10/bgen/data/annotated/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

24 November 2022

  • FinnGen R10 meta-analysis results released to the Green Library

  • Data and documentation location in Google cloud:

    • FinnGen + UKBB + EstBB meta-analysis:

      • gs://finngen-production-library-green/finngen_R10/finngen_R10_analysis_data/meta_analysis/ukbb_estbb/

    • FinnGen + UKBB meta-analysis:

      • gs://finngen-production-library-green/finngen_R10/finngen_R10_analysis_data/meta_analysis/ukbb/

  • Meta-analysis browsers have been updated with the latest results.

17 November 2022

  • FinnGen R10 parental endpoint data (version 1.0)

  • These files contain parental endpoints that have been generated from cause of death register data.

  • The data and documentation location in Sandbox red library:

    • /finngen/library-red/finngen_R10/parental_endpoint_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

16 November 2022

  • FinnGen R10 service sector data (version 2.0)

  • The data and documentation location in Sandbox red library:

    • /finngen/library-red/finngen_R10/service_sector_data_2.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

15 November 2022

  • FinnGen R10 parental causes of death data (version 1.0)

  • The data and documentation location in Sandbox red library:

    • /finngen/library-red/finngen_R10/parental_causes_of_death_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

15 November 2022

  • FinnGen EA3 Fatty liver disease (FLD) data (version 3.0)

  • EA3 data is to be used only as stated in the respective EA3 study plan and needs an accepted analysis proposal, as in all FinnGen studies.

  • The data and documentation location in Sandbox red library:

    • /finngen/library-red/EA3_FLD_3.0/

14 November 2022

  • FinnGen R10 kinship data (version 2.0)

  • The data and documentation location in Sandbox red library:

    • /finngen/library-red/finngen_R10/kinship_2.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

11 November 2022

  • R10 heritabilities and genetic correlations released to the Green Library

  • Summary statistics produced by the ldsc munging step have been added. These sumstats can be passed directly to ldsc --rg to calculate genetic correlations.

  • Data and documentation location in Google cloud:

    • gs://finngen-production-library-green/finngen_R10/finngen_R10_analysis_data/ldsc/

  • Paths to the munged sumstats in the FinnGen Green Library follow the pattern:

    • data/munged/finngen_[RELEASE]_[PHENO].ldsc.sumstats.gz
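The path pattern above can be filled in programmatically. The following is a small sketch, assuming that [RELEASE] is written as "R10" and that the pattern is relative to the ldsc directory listed above. The endpoint code "T2D" in the usage note and both function names are illustrative only.

```python
# Base directory taken from the release note above.
GREEN_LDSC_BASE = (
    "gs://finngen-production-library-green/"
    "finngen_R10/finngen_R10_analysis_data/ldsc/"
)
RELEASE = "R10"  # assumption: how [RELEASE] is spelled in file names


def munged_sumstats_relpath(pheno):
    """Fill the documented pattern for one phenotype code."""
    return f"data/munged/finngen_{RELEASE}_{pheno}.ldsc.sumstats.gz"


def green_uri(pheno):
    """Full Green Library URI, assuming the pattern is relative to the ldsc directory."""
    return GREEN_LDSC_BASE + munged_sumstats_relpath(pheno)
```

Calling `munged_sumstats_relpath("T2D")` gives `data/munged/finngen_R10_T2D.ldsc.sumstats.gz`. Since ldsc --rg takes a comma-separated list of sumstats files, local copies of two such files could be joined with a comma and passed as its argument.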

8 November 2022

  • FinnGen R10 imputed HLA allele genotypes

  • The data and documentation location in Sandbox red library:

    • /finngen/library-red/finngen_R10/hla_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

4 November 2022

  • Proteomics data (Olink and SomaLogic)

    • The data contains the first 721 FinnGen individuals.

    • Two versions are provided: the original data as received, and a version with basic QC applied as specified in the readme files.

    • Readme files (readme.txt) are provided within each directory.

    • Consult the providers' (Olink/SomaLogic) documentation in the original_data/ folder for details of the assays and data processing.

  • The data and documentation location in Sandbox red library:

    • SomaLogic: /finngen/library-red/EA5/proteomics/soma/first_batch/

    • Olink: /finngen/library-red/EA5/proteomics/olink/first_batch/

2 November 2022

  • FinnGen R10 truncated endpoint file (version 1.0) released to Sandbox.

  • This file is similar to the first events endpoint file released earlier, but with small adjustments:

    • The follow-up end date used in the file is 31.12.2020, which is the same as the end of follow-up of the death register.

      • FU_END_AGEs of the individuals have been re-calculated using the updated follow-up end date.

  • The data and documentation location in Sandbox red library:

    • /finngen/library-red/finngen_R10/truncated_endpoint_file_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt
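To illustrate what re-calculating FU_END_AGE against the 31.12.2020 follow-up end date means arithmetically, here is a hedged sketch. The exact rule used to produce the file is not stated in this note, so the 365.25-day year, the rounding, and the function name `fu_end_age` are all assumptions made for illustration.

```python
from datetime import date

# Follow-up end date stated in the release note.
FOLLOW_UP_END = date(2020, 12, 31)
# Assumption: ages are expressed in years of 365.25 days.
DAYS_PER_YEAR = 365.25


def fu_end_age(birth_date, follow_up_end=FOLLOW_UP_END):
    """Age in years at the end of follow-up, rounded to two decimals.

    Illustrative only: the actual file may apply further rules
    that this sketch does not model.
    """
    return round((follow_up_end - birth_date).days / DAYS_PER_YEAR, 2)
```

Under these assumptions, a hypothetical participant born on 31.12.1960 would have `fu_end_age(date(1960, 12, 31)) == 60.0`.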

1 November 2022

  • FinnGen Release 10 GWAS, finemapping, conditional analysis, colocalization and autoreporting data released to the green library

  • This release contains new results for:

    • Sex-specific association analysis results (under summary_stats)

    • Conditional analysis results

  • GWAS data location:

    • gs://finngen-production-library-green/finngen_R10/finngen_R10_analysis_data/summary_stats

  • Fine-mapping data location:

    • gs://finngen-production-library-green/finngen_R10/finngen_R10_analysis_data/finemap

  • Conditional analysis data location:

    • gs://finngen-production-library-green/finngen_R10/finngen_R10_analysis_data/conditional_analysis

  • Autoreporting data location:

    • gs://finngen-production-library-green/finngen_R10/finngen_R10_analysis_data/autoreporting

  • Colocalization data location:

    • gs://finngen-production-library-green/finngen_R10/finngen_R10_analysis_data/colocalization

  • Documentation location:

    • gs://finngen-production-library-green/finngen_R10/finngen_R10_analysis_documentation/
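The R10 analysis data locations listed above share one base directory, so they can be collected in a small lookup. This is only a convenience sketch: the dictionary keys and the function name `analysis_uri` are hypothetical, while the directory names are copied from the list above.

```python
# Common base of the R10 analysis data locations listed above.
BASE = (
    "gs://finngen-production-library-green/"
    "finngen_R10/finngen_R10_analysis_data"
)

# Keys are illustrative labels; values are the directory names from the note.
ANALYSIS_DIRS = {
    "gwas": "summary_stats",
    "finemapping": "finemap",
    "conditional": "conditional_analysis",
    "autoreporting": "autoreporting",
    "colocalization": "colocalization",
}


def analysis_uri(kind):
    """Return the Green Library URI for one analysis type."""
    if kind not in ANALYSIS_DIRS:
        raise ValueError(f"unknown analysis type: {kind!r}")
    return f"{BASE}/{ANALYSIS_DIRS[kind]}"
```

A URI such as `analysis_uri("finemapping")` could then be handed to a tool like `gsutil ls`, provided the account in use has Green Library access.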

1 November 2022

  • FinnGen R10 HLA allele analysis results released to the green library.

  • Data location in the green library:

    • gs://finngen-production-library-green/finngen_R10/finngen_R10_analysis_data/hla/

  • Documentation available at:

    • gs://finngen-production-library-green/finngen_R10/finngen_R10_analysis_documentation/finngen_R10_hla_analysis.pdf

27 October 2022

  • FinnGen R10 visual impairment register data (version 1.0) released to Sandbox.

  • Data and documentation location in Sandbox red library:

    • /finngen/library-red/finngen_R10/visual_impairment_register_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

20 October 2022

  • FinnGen EA3 FLD data (version 2.0)

  • The data contain laboratory and physiological measurements from the EA3 FLD cohort.

  • Data was delivered from Helsinki, Auria and Eastern Finland biobanks.

  • Number of unique FINNGENIDs: 55387

  • Data and documentation location in Sandbox red library:

    • /finngen/library-red/EA3_FLD_2.0/

12 October 2022

  • FinnGen R10 cluster plot data

  • Cluster plots location in the GREEN library in Google cloud:

    • gs://finngen-production-library-green/finngen_R10/cluster_plots/

  • The cluster plot TSV files, which contain per-variant genotype intensities for the subset of samples used in the cluster plots, are released to the Sandbox at:

    • /finngen/library-red/finngen_R10/cluster_plot_1.0/data/

  • Detailed paths to this data are listed in the manifest file:

    • /finngen/library-red/finngen_R10/cluster_plot_1.0/manifest.txt

11 October 2022

  • FinnGen R10 PRS data (version 1.0).

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R10/prs_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

3 October 2022

  • FinnGen DF10 socioeconomic data released to the SES Sandbox.

  • Path to the data in the SES-Sandbox:

    • /finngen/pipelines/finngen_R10/socio_register_1.0/

3 October 2022

  • FinnGen R10 detailed version of Finnish cancer registry data (version 1.0)

  • The data contains 90303 FinnGen participants.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R10/cancer_detailed_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

3 October 2022

  • FinnGen R10 cancer screening data (version 1.0)

  • The files contain data from Finnish nationwide breast and cervical cancer screening.

    • The breast cancer screening file contains data of 143 233 FinnGen participants.

    • The cervical cancer screening file contains data of 206 253 FinnGen participants.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R10/cancer_screening_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

22 September 2022

  • FinnGen DF10 genotype data converted to PLINK format released to the Sandbox

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R10/genotype_plink_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

19 September 2022

  • FinnGen R10 infectious disease register corona data (version 1.0)

  • The data contains 72350 (cumulative number) corona virus positive FinnGen participants.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R10/infectious_disease_register_corona_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

19 September 2022

  • FinnGen R10 vaccination register data (version 1.0)

  • The data contains vaccination data of 385 747 FinnGen participants.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R10/vaccination_register_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

9 September 2022

  • FinnGen DF10 reproductive history data

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R10/birth_and_dvv_register_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

5 September 2022

  • FinnGen DF10 datasets released to Sandbox

  • Relatedness analysis results

    • /finngen/library-red/finngen_R10/kinship_1.0/

  • PCA

    • /finngen/library-red/finngen_R10/pca_1.0/

  • Endpoints and covariates used in core DF10 analyses

    • /finngen/library-red/finngen_R10/analysis_covariates/

  • Bgen chunks for running GWAS in 40k variant chunks:

    • /finngen/library-red/finngen_R10/bgen/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

29 August 2022

  • FinnGen EA3 AMD data (version 1.0)

  • These files contain diagnoses, procedures and structural form answers from AMD patients. Data was delivered to us from Helsinki biobank.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/EA3_AMD_bcb_1.0/

29 August 2022

  • FinnGen DF10 chip data (version 1.0)

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R10/chipd_1.0/data/

    • Markdown of all genotype QC for both chip and imputation data: /finngen/library-red/finngen_R10/R10_genotype_qc.md

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

24 August 2022

  • FinnGen DF10 service sector data (version 1.0)

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R10/service_sector_data_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

23 August 2022

  • FinnGen DF10 Register of Congenital Malformations data (version 1.0)

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R10/malformation_register_1.0

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R10/catalog/catalog.txt

15 August 2022

  • FinnGen release 10 genotype and phenotype data

  • FinnGen R10 data was imputed using the SISuV4.2 reference panel, which contains 8,554 high-coverage [25x] WGS Finnish individuals.

  • The current data statistics are:

    • Number of individuals with genotypes = 430 897

    • Number of individuals with endpoints = 429 209

    • Number of imputed variants = 21 311 942

    • Number of endpoints = 4 519

  • Data and documentation location in Sandbox:

    • Genotypes in VCF format & Documentation:

      • /finngen/library-red/finngen_R10/genotype_1.0/

    • Phenotypes & Documentation:

      • /finngen/library-red/finngen_R10/phenotype_1.0/

    • Detailed paths to library red data from catalog:

      • /finngen/library-red/finngen_R10/catalog/catalog.txt

30 June 2022

  • Data and documentation location in Sandbox:

    • /finngen/library-green/finngen_R6/finngen_R6_medical_codes/fgVNR.tsv

    • /finngen/library-green/finngen_R6/finngen_R6_medical_codes/fgVNR_readme.txt

    • /finngen/library-green/finngen_R6/finngen_R6_medical_codes/fgVNR_table_definition.tsv

  • Data and documentation location in Google cloud:

    • gs://finngen-production-library-green/finngen_R6/finngen_R6_medical_codes

28 June 2022

  • FinnGen R9 service_sector_data_all_registers (version 1.0)

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R9/service_sector_data_all_registers_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R9/catalog/catalog.txt

21 June 2022

  • FinnGen R9 HLA analysis results.

  • Analysis results location in Google cloud:

    • gs://finngen-production-library-green/finngen_R9/finngen_R9_analysis_data/hla/

  • Analysis results location in Sandbox:

    • /finngen/library-green/finngen_R9/finngen_R9_analysis_data/hla/

  • The analysis readme location in Google cloud:

    • gs://finngen-production-library-green/finngen_R9/finngen_R9_analysis_documentation/finngen_R9_hla_analysis.md

    • gs://finngen-production-library-green/finngen_R9/finngen_R9_analysis_documentation/finngen_R9_hla_analysis.pdf

  • The analysis readme location in Sandbox:

    • /finngen/library-green/finngen_R9/finngen_R9_analysis_documentation/finngen_R9_hla_analysis.md

    • /finngen/library-green/finngen_R9/finngen_R9_analysis_documentation/finngen_R9_hla_analysis.pdf

10 June 2022

  • FinnGen R9 infectious disease register corona data (version 2.0)

  • This data contains 57333 (cumulative number) corona virus positive FinnGen participants. After some 50 000 new cases over the last 6 months, almost 15 % of FinnGen participants have now tested positive for corona.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R9/infectious_disease_register_corona_2.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R9/catalog/catalog.txt

31 May 2022

  • FinnGen R9 PRS data (version 1.0) released to Sandbox.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R9/prs_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R9/catalog/catalog.txt

16 May 2022

  • FinnGen R9 parental causes of death (version 1.0) released to Sandbox.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R9/parental_causes_of_death_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R9/catalog/catalog.txt

13 May 2022

  • Results from FinnGen release 9 meta-analyses with UKBB and EstBB

  • Data and documentation location in Google cloud:

    • gs://finngen-production-library-green/finngen_R9/finngen_R9_analysis_data/ukbb_meta/

    • gs://finngen-production-library-green/finngen_R9/finngen_R9_analysis_data/ukbb_estbb_meta/

  • Data and documentation location in Sandbox:

    • /finngen/library-green/finngen_R9/finngen_R9_analysis_data/ukbb_meta/

    • /finngen/library-green/finngen_R9/finngen_R9_analysis_data/ukbb_estbb_meta/

2 May 2022

  • FinnGen R9 visual impairment register data (version 1.0)

  • Data and documentation location in Sandbox: /finngen/library-red/finngen_R9/visual_impairment_register_1.0/

  • Detailed paths to library red data from catalog: /finngen/library-red/finngen_R9/catalog/catalog.txt

26 April 2022

  • FinnGen R9 kidney disease register data (version 1.0)

  • Data and documentation location in Sandbox: /finngen/library-red/finngen_R9/kidney_disease_register_1.0/

  • Detailed paths to library red data from catalog: /finngen/library-red/finngen_R9/catalog/catalog.txt

14 April 2022

  • R9 coding variant analysis results have been released to the green library.

  • Data location in Google cloud: gs://finngen-production-library-green/finngen_R9/finngen_R9_analysis_data/coding/full

  • Documentation location in Google cloud: gs://finngen-production-library-green/finngen_R9/finngen_R9_analysis_data/coding/R9_coding_variant_results_full.README

  • Data location in Sandbox: /finngen/library-green/finngen_R9/finngen_R9_analysis_data/coding/full

  • Documentation location in Sandbox: /finngen/library-green/finngen_R9/finngen_R9_analysis_data/coding/R9_coding_variant_results_full.README

13 April 2022

  • R9 GWAS, finemapping, colocalization and autoreporting data in green library.

  • Notable changes:

    • GWAS analysis performed for 2272 endpoints, including 3 quantitative

    • Maximum region changed from 6 MB to 10 MB in finemapping

  • Data location in Google cloud:

    • GWAS: gs://finngen-production-library-green/finngen_R9/finngen_R9_analysis_data/summary_stats

    • fine-mapping: gs://finngen-production-library-green/finngen_R9/finngen_R9_analysis_data/finemap

    • autoreporting: gs://finngen-production-library-green/finngen_R9/finngen_R9_analysis_data/autoreporting

    • colocalization: gs://finngen-production-library-green/finngen_R9/finngen_R9_analysis_data/colocalization

    • The documentation is available at: gs://finngen-production-library-green/finngen_R9/finngen_R9_analysis_documentation/

  • Data location in Sandbox:

    • GWAS: /finngen/library-green/finngen_R9/finngen_R9_analysis_data/summary_stats

    • fine-mapping: /finngen/library-green/finngen_R9/finngen_R9_analysis_data/finemap

    • autoreporting: /finngen/library-green/finngen_R9/finngen_R9_analysis_data/autoreporting

    • colocalization: /finngen/library-green/finngen_R9/finngen_R9_analysis_data/colocalization

    • The documentation is available at: /finngen/library-green/finngen_R9/finngen_R9_analysis_documentation/

13 April 2022

  • R8 susie finemapping version 2.0 data in green library.

  • Changes:

    • Susie version 0.11.92 used

    • individual Bayes factors for each variant added to data

  • This data does not include FINEMAP finemapping, since that has not changed since 1.0.

  • The new susie data is available at: gs://finngen-production-library-green/finngen_R8/finngen_R8_analysis_data/finemap_2.0

  • Documentation, including file descriptions, is available at:

    • Location in Google cloud: gs://finngen-production-library-green/finngen_R8/finngen_R8_analysis_documentation/readme_r8_finemap_2.0.md

    • Location in Sandbox: /finngen/library-green/finngen_R8/finngen_R8_analysis_documentation/readme_r8_finemap_2.0.md

11 April 2022

  • FinnGen R9 service sector data (version 1.0)

  • Data includes information from 392 374 FinnGen participants about

    • Service sectors and specialties of the Hilmo inpatient and outpatient register data

    • Service sector, contact type and profession of the primary care data

    • Reimbursement cost information of the Kela drug register purchase data

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R9/service_sector_data_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R9/catalog/catalog.txt

1 April 2022

  • FinnGen R9 truncated endpoint file (version 1.0)

    • Truncated endpoint file is similar to the first events endpoint file released earlier, but with small adjustments.

      • All registers have a common follow-up date 31.12.2019 in this file.

      • The FU_END_AGEs of the individuals have been re-calculated using the common follow-up end date.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R9/truncated_endpoint_file_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R9/catalog/catalog.txt

31 March 2022

  • FinnGen R9 cancer datasets

    • cancer_screening_1.0

      • This includes Finnish nationwide breast and cervical cancer screening data.

    • cancer_detailed_1.0

      • This is the detailed version of Finnish cancer registry data.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R9/cancer_screening_1.0/

    • /finngen/library-red/finngen_R9/cancer_detailed_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R9/catalog/catalog.txt

17 March 2022

  • FinnGen R9 infectious disease register corona data (version 1.0)

  • The data consists of 7329 (cumulative number) corona virus positive FinnGen participants until 31st December 2021.

  • This release has been processed from the same raw data as the previous R8_3.0, but filtered with R9 genotype list.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R9/infectious_disease_register_corona_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R9/catalog/catalog.txt

16 March 2022

  • FinnGen R9 vaccination register data (version 1.0)

  • Vaccination data of 348 982 FinnGen participants

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R9/vaccination_register_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R9/catalog/catalog.txt

16 March 2022

  • FinnGen R9 birth and dvv register data (version 1.0)

  • The data combines information from the population register (DVV) (since 1953) and the medical birth register (since 1987).

  • Data from 167 920 FinnGen mother participants, and 63 742 child participants.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R9/birth_and_dvv_register_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R9/catalog/catalog.txt

10 March 2022

  • FinnGen R9 cluster plot data files

  • The files contain genotype intensities per variant for a subset of samples that are used in cluster plots (released in the green library).

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R9/cluster_plot_1.0/

  • Detailed paths to library red data from manifest:

    • /finngen/library-red/finngen_R9/cluster_plot_1.0/manifest.txt

3 March 2022 (edit: Aug 2024 - meta-analysis results with Estonian biobank are no longer available)

  • FinnGen release 8, UKBB, and Estonia Biobank meta-analysis results have been updated with 48 new meta-analysed endpoints (total 98 meta-analysed endpoints)

  • The results include summary statistics from meta-analysis (including leave-one-out results) and autoreporting results.

  • Data in green library:

    • Data location in Google cloud: gs://finngen-production-library-green/finngen_R8/finngen_R8_analysis_data/ukbb_estbb_meta/

    • Data location in Sandbox: /finngen/library-green/finngen_R8/finngen_R8_analysis_data/ukbb_estbb_meta/

  • Documentation in green library:

    • Data location in Google cloud: gs://finngen-production-library-green/finngen_R8/finngen_R8_analysis_data/ukbb_estbb_meta/

    • Data location in Sandbox: /finngen/library-green/finngen_R8/finngen_R8_analysis_data/ukbb_estbb_meta/

18 February 2022

  • FinnGen R9 malformation register data (version 1.0)

  • The Register of Congenital Malformations contains national-level data on congenital chromosomal and structural anomalies, as well as a few other congenital anomalies like congenital hypothyroidism and teratomas, detected or suspected in stillborn and live born infants and foetuses.

  • The FinnGen data consist of malformation register data from 7527 FinnGen mother participants and 1992 FinnGen child participants.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R9/malformation_register_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R9/catalog/catalog.txt

15 February 2022

  • FinnGen R9 chip data (version 1.0)

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R9/chipd_1.0/

2 February 2022

  • FinnGen release 9 genotype and phenotype data

  • FinnGen R9 data was imputed using the SISuV4 reference panel, which contains 8,554 high-coverage [25x] WGS Finnish individuals.

  • The current data statistics are:

    • Number of individuals with genotypes = 392 649

    • Number of individuals with endpoints = 392 423

    • Number of imputed variants = 20 175 454

    • Number of endpoints = 4 526

  • Data and documentation location in Sandbox:

    • Genotypes in VCF format & Documentation:

      • /finngen/library-red/finngen_R9/genotype_1.0/

    • Genotypes in PLINK format and Documentation:

      • /finngen/library-red/finngen_R9/genotype_plink_1.0

    • Phenotypes & Documentation:

      • /finngen/library-red/finngen_R9/phenotype_1.0/

    • PCA/kinship results and analysis covariates:

      • /finngen/library-red/finngen_R9/kinship_1.0/

      • /finngen/library-red/finngen_R9/pca_1.0/

      • /finngen/library-red/finngen_R9/prune_1.0/

      • /finngen/library-red/finngen_R9/analysis_covariates/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R9/catalog/catalog.txt

Data is described in .

The EA3 data usage is restricted to the . See .

See

Data is described on page.

The socioeconomic data of FinnGen study subjects is available in a separate Sandbox environment (SES-Sandbox) and can be accessed only by researchers in Finland. The registry data and the access process are described in the section How to apply SES sandbox access.

Description of the

See the description of .

is described in the FinnGen Analyst Handbook.

VNR code mappings released to the Sandbox. For example, dosage and strength of drugs are mapped and available in the Sandbox and FinnGen library green.

The data is described in .

The data is described in section .

The data is described in

Data is described in

More information about the

Reproductive history data is described in FinnGen Analyst Handbook section

The Pheweb instance running with these results is updated and can be found at .

Data is described in section .

EA3 projects
AMD project documentation from the Members Area
EA3 projects
AMD project documentation from the Members Area
https://metaresults-est-ukbb.finngen.fi/
https://metaresults-ukbb.finngen.fi/
Service Sector Data
EA3 projects
FLD project documentation from Members Area
description of truncated endpoint data
The Finnish Register of Visual Impairment
How to apply SES sandbox access
reproductive history data
service sector data
The Register of Congenital Malformations
VNR code
Other registry data files in Sandbox
Other registry data files in Sandbox
Other registry data files in Sandbox
Other registry data files in Sandbox
truncated endpoint file.
Other registry data files in Sandbox
https://metaresults-ukbb.finngen.fi/
Other registry data files