FinnGen Handbook
Data Releases 2021

27 December 2021

  • FinnGen R8 service sector detailed data

  • These data files contain:

    • Information about service sector, specialty, and contact type from the Hilmo inpatient and outpatient registers and the primary health care register.

    • Information about drug reimbursement costs from the Kela drug purchase register.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R8/service_sector_detailed_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R8/catalog/catalog.txt
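
Most red library entries on this page point to the same catalog file for detailed paths. The following is a minimal sketch of how that file could be searched from inside the Sandbox. It assumes `catalog.txt` is a plain-text file with one entry per line, which this page does not confirm; the function name `find_catalog_entries` is hypothetical.

```python
from pathlib import Path


def find_catalog_entries(catalog_path, keyword):
    """Return catalog lines that mention `keyword` (case-insensitive).

    Assumption: the catalog is plain text with one entry per line.
    """
    needle = keyword.lower()
    matches = []
    with open(catalog_path, encoding="utf-8") as handle:
        for line in handle:
            if needle in line.lower():
                matches.append(line.rstrip("\n"))
    return matches


if __name__ == "__main__":
    # Path taken from the release note above; only present inside the Sandbox.
    catalog = Path("/finngen/library-red/finngen_R8/catalog/catalog.txt")
    if catalog.exists():
        for entry in find_catalog_entries(catalog, "service_sector"):
            print(entry)
```

If the catalog turns out to be tabular, the same substring filter still works line by line, but parsing the columns would give more reliable matches.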

20 December 2021

  • FinnGen R8 vaccination register data (version 2.0)

    • The data contains vaccination records for 316 551 FinnGen participants.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R8/vaccination_register_2.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R8/catalog/catalog.txt

20 December 2021

  • Infectious disease register data (corona, version 3.0)

    • The data contains 7058 (cumulative number) coronavirus-positive FinnGen participants.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R8/infectious_disease_register_corona_3.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R8/catalog/catalog.txt

20 December 2021

  • Lists of significant coding variant associations for all analyzed endpoints in FinnGen R8

  • Data in the green library:

    • Data location in Google cloud: gs://finngen-production-library-green/finngen_R8/finngen_R8_analysis_data/summary_stats/coding_variants/

    • Data location in Sandbox: /finngen/library-green/finngen_R8/finngen_R8_analysis_data/summary_stats/coding_variants/

  • Documentation in the green library:

    • Documentation location in Google cloud: gs://finngen-production-library-green/finngen_R8/finngen_R8_analysis_data/summary_stats/coding_variants/README

    • Documentation location in Sandbox: /finngen/library-green/finngen_R8/finngen_R8_analysis_data/summary_stats/coding_variants/README
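
Every green library location on this page is listed twice: once as a Google Cloud Storage URI and once as a Sandbox path. In all of these pairs the bucket prefix `gs://finngen-production-library-green/` corresponds to `/finngen/library-green/`. The sketch below converts between the two forms under the assumption that this prefix mapping holds for any green library path, not only the ones listed here. The function names are hypothetical.

```python
GREEN_BUCKET = "gs://finngen-production-library-green/"
GREEN_MOUNT = "/finngen/library-green/"


def gcs_to_sandbox(path):
    """Convert a green library gs:// URI to its Sandbox path."""
    if not path.startswith(GREEN_BUCKET):
        raise ValueError(f"not a green library bucket URI: {path}")
    return GREEN_MOUNT + path[len(GREEN_BUCKET):]


def sandbox_to_gcs(path):
    """Convert a Sandbox green library path to its gs:// URI."""
    if not path.startswith(GREEN_MOUNT):
        raise ValueError(f"not a green library Sandbox path: {path}")
    return GREEN_BUCKET + path[len(GREEN_MOUNT):]


if __name__ == "__main__":
    uri = (
        "gs://finngen-production-library-green/finngen_R8/"
        "finngen_R8_analysis_data/summary_stats/coding_variants/README"
    )
    print(gcs_to_sandbox(uri))
```

Raising `ValueError` on an unexpected prefix keeps a red library path or an unrelated bucket from being silently rewritten.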

17 December 2021

  • FinnGen R8 chip data (version 1.0)

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R8/chipd_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R8/catalog/catalog.txt

29 November 2021

  • FinnGen release 8 and UKBB meta-analysis results

  • Data in green library:

    • Data location in Google cloud: gs://finngen-production-library-green/finngen_R8/finngen_R8_analysis_data/ukbb_meta/

    • Data location in Sandbox: /finngen/library-green/finngen_R8/finngen_R8_analysis_data/ukbb_meta/

  • Documentation in green library:

    • Documentation location in Google cloud: gs://finngen-production-library-green/finngen_R8/finngen_R8_analysis_data/ukbb_meta/readme

    • Documentation location in Sandbox: /finngen/library-green/finngen_R8/finngen_R8_analysis_data/ukbb_meta/readme

26 November 2021

  • SISuv4 LDstore correlation (BCOR) files

  • Data in green library:

    • Data location in Google cloud: gs://finngen-production-library-green/imputation_panel/v4/LD/

    • Data location in Sandbox: /finngen/library-green/imputation_panel/v4/LD/

  • Documentation in green library:

    • Documentation location in Google cloud: gs://finngen-production-library-green/imputation_panel/v4/LD/README.md

    • Documentation location in Sandbox: /finngen/library-green/imputation_panel/v4/LD/README.md

24 November 2021

  • FinnGen R8 parental causes of death data (version 1.0)

  • Number of unique FINNGENIDs: 262 248

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R8/parental_causes_of_death_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R8/catalog/catalog.txt

23 November 2021

  • FinnGen R8 visual impairment data (version 1.0)

  • Number of unique FINNGENIDs: 2401

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R8/visual_impairment_register_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R8/catalog/catalog.txt

15 November 2021

  • R8 core analyses:

  • FinnGen R8 GWAS analysis data

    • Updated version of regenie

    • These files contain the summary statistics for R8 core endpoints as well as Manhattan plot images

    • Data in green library:

      • /finngen/library-green/finngen_R8/finngen_R8_analysis_data/summary_stats

  • FinnGen R8 Fine-mapping analysis data

    • These files contain the finemapping results for R8 core endpoints, performed using SUSIE and FINEMAP

    • Data in green library:

      • /finngen/library-green/finngen_R8/finngen_R8_analysis_data/finemap

    • Documentation in green library:

      • /finngen/library-green/finngen_R8/finngen_R8_analysis_documentation/

  • FinnGen R8 Autoreporting data

    • These files contain the autoreporting summaries created from fine-mapping results and GWAS summary statistics

    • Data in green library:

      • /finngen/library-green/finngen_R8/finngen_R8_analysis_data/autoreporting

    • Documentation in green library:

      • /finngen/library-green/finngen_R8/finngen_R8_analysis_documentation/

  • FinnGen R8 Colocalization data

    • Colocalization between FinnGen SUSIE fine-mapping and multiple other resources

    • Data and documentation in green library:

      • /finngen/library-green/finngen_R8/finngen_R8_analysis_data/colocalization

  • This data is also available for browsing in PheWeb: https://results.finngen.fi

15 November 2021

  • The covariates and endpoints used in R8 core analyses.

  • finngen_R8_COV_PHENO_V4_1.txt.gz and finngen_R8_COV_PHENO_V4_1.FID.txt.gz contain the same data with different IDs

  • Data location in Sandbox:

    • /finngen/library-red/finngen_R8/analysis_covariates/finngen_R8_cov_1.0.txt.gz

    • /finngen/library-red/finngen_R8/analysis_covariates/finngen_R8_COV_PHENO_V4_1.txt.gz

    • /finngen/library-red/finngen_R8/analysis_covariates/finngen_R8_COV_PHENO_V4_1.FID.txt.gz
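
Since the two COV_PHENO files are described as holding the same data with different IDs, a quick way to see where they differ is to compare their header rows without loading the full files. The sketch below does that with the standard library only. It assumes both files are gzip-compressed, tab-delimited text with a header row, which this page does not state; the function names are hypothetical.

```python
import csv
import gzip


def read_header(path, delimiter="\t"):
    """Return the column names from the first row of a gzipped text file."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        return next(csv.reader(handle, delimiter=delimiter))


def column_differences(path_a, path_b, delimiter="\t"):
    """Return (columns only in a, columns only in b), each sorted."""
    cols_a = set(read_header(path_a, delimiter))
    cols_b = set(read_header(path_b, delimiter))
    return sorted(cols_a - cols_b), sorted(cols_b - cols_a)
```

Inside the Sandbox, `column_differences` could be called with the two COV_PHENO paths listed above. Reading only the header keeps memory use small, which matters for files with thousands of endpoint columns.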

15 November 2021

  • FinnGen R8 cluster plot data

  • These data files contain genotype intensities per variant for a subset of samples that are used in cluster plots (released in the green library).

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R8/cluster_plot_1.0/

  • Detailed paths to this data are listed in the manifest:

    • /finngen/library-red/finngen_R8/cluster_plot_1.0/manifest.txt

4 November 2021

  • FinnGen R8 PRS data

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R8/prs_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R8/catalog/catalog.txt

25 October 2021

  • FinnGen R8 kidney disease register data

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R8/kidney_disease_register_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R8/catalog/catalog.txt

20 October 2021

  • GRM, bgen chunks, and analysis covariates for custom analyses

  • Data and documentation locations in Sandbox:

    • library-red/finngen_R8/grm_1.0/

    • library-red/finngen_R8/bgen_1.0_20k_chunks

    • library-red/finngen_R8/analysis_covariates/

13 October 2021

  • FinnGen R8 phenotype data (version 4.0)

  • This data has been created with an updated version of Endpointter and new definition files. These changes corrected the controls of some endpoints.

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R8/phenotype_4.0

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R8/catalog/catalog.txt

7 October 2021

  • TSV file mapping alleles between the SISuv3 and SISuv4 panels, released to the green library

  • The file is available in both v3 and v4 panel subfolders (the file is identical in both locations)

  • Location in v3 panel subfolders

    • in Google cloud: gs://finngen-production-library-green/imputation_panel/v3/annotation/map_v3_v4_alleles_v2.tsv

    • in Sandbox: /finngen/library-green/imputation_panel/v3/annotation/map_v3_v4_alleles_v2.tsv

  • Location in v4 panel subfolders

    • in Google cloud: gs://finngen-production-library-green/imputation_panel/v4/annotation/map_v3_v4_alleles_v2.tsv

    • in Sandbox: /finngen/library-green/imputation_panel/v4/annotation/map_v3_v4_alleles_v2.tsv

7 October 2021 (edit: Aug 2024 - meta-analysis results with Estonian biobank are no longer available)

  • The meta-analysis results between FinnGen (release 7), UKBB and the Estonian Biobank for 50 phenotypes have been released in the green library:

    • Data location in Google cloud: gs://finngen-production-library-green/finngen_R7/finngen_R7_analysis_data/ukbb_estbb_meta/

    • Data location in Sandbox: /finngen/library-green/finngen_R7/finngen_R7_analysis_data/ukbb_estbb_meta/

  • The results include summary statistics from meta-analysis (including leave-one-out results) and autoreporting results. Please see the readme for more information:

    • Readme location in Google cloud: gs://finngen-production-library-green/finngen_R7/finngen_R7_analysis_data/ukbb_estbb_meta/README.md

    • Readme location in Sandbox: /finngen/library-green/finngen_R7/finngen_R7_analysis_data/ukbb_estbb_meta/README.md

6 October 2021

  • Finnish nationwide breast and cervical cancer screening data (cancer_screening_1.0)

  • Detailed version of the Finnish cancer register data (cancer_detailed_1.0)

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R8/cancer_screening_1.0

    • /finngen/library-red/finngen_R8/cancer_detailed_1.0

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R8/catalog/catalog.txt

4 October 2021

  • FinnGen R8 imputed STR data (version 1.0)

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R8/imputed_str_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R8/catalog/catalog.txt

30 September 2021

  • FinnGen R8 infectious disease register corona data (version 2.0)

  • Data contains 5372 (cumulative number) corona virus positive FinnGen participants

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R8/infectious_disease_register_corona_2.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R8/catalog/catalog.txt

29 September 2021

  • FinnGen R8 vaccination register data (version 1.0)

  • Vaccination data of 249 066 FinnGen participants

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R8/vaccination_register_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R8/catalog/catalog.txt

23 September 2021

  • FinnGen R8 reproductive history data (version 1.0)

  • Reproductive history of 151 109 FinnGen mother participants

  • The data combines information from population register (DVV) (since 1953) and medical birth register (since 1987).

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R8/birth_and_dvv_register_1.0/

  • Detailed paths to library red data from catalog:

    • /finngen/library-red/finngen_R8/catalog/catalog.txt

22 September 2021

  • FinnGen R8 phenotype 3.0 data

  • Created using an updated version of Endpointter.

  • Years 2020 and 2021 added to Hilmo registers (preliminary information).

  • Data and documentation location in Sandbox:

    • /finngen/library-red/finngen_R8/phenotype_3.0/

6 September 2021

  • FinnGen R8 genotype plink conversion data (version 1.0) and

  • PCA-kinship data (version 1.0)

  • Data and documentation location in Sandbox:

    • R8 genotype plink data and documentation: /finngen/library-red/finngen_R8/genotype_plink_1.0/

    • R8 PCA data and documentation: /finngen/library-red/finngen_R8/pca_1.0/

    • R8 prune data and documentation: /finngen/library-red/finngen_R8/prune_1.0/

    • R8 kinship data and documentation: /finngen/library-red/finngen_R8/kinship_1.0

6 September 2021

  • FinnGen R8 detailed longitudinal phenotype data

  • Data location in Sandbox: /finngen/library-red/finngen_R8/phenotype_2.0

  • New: codes with <5 cases have not been removed from this data

30 August 2021

  • FinnGen R8 infectious disease register corona (version 1.0) data

  • This data contains 4808 (cumulative number) corona virus positive FinnGen participants

  • Data location in Sandbox: /finngen/library-red/finngen_R8/infectious_disease_register_corona_1.0/

27 August 2021

  • FinnGen release 8 genotype and phenotype data

  • FinnGen R8 data was imputed using the SISuv4 reference panel, which contains 8,554 Finnish individuals with high-coverage (25x) WGS.

  • The current data statistics are:

    • Number of individuals with genotypes = 356 213

    • Number of individuals with endpoints = 356 077

    • Number of imputed variants = 20 175 454

    • Number of endpoints = 4228

  • Genotypes Data & Documentation location in Sandbox: /finngen/library-red/finngen_R8/genotype_1.0/

  • Phenotypes Data & Documentation location in Sandbox: /finngen/library-red/finngen_R8/phenotype_2.0/ (note that phenotype_1.0 was an internal release only)

  • See the detailed paths to library red data from catalog: /finngen/library-red/finngen_R8/catalog/catalog.txt

29 June 2021

  • FinnGen R6 mosaic chromosomal alteration (version 1.0)

  • Data location in Sandbox: /finngen/library-red/finngen_R6/mca_1.0/

22 June 2021

  • FinnGen R7 Colocalization

  • Data location in Google cloud: gs://finngen-production-library-green/finngen_R7/finngen_R7_analysis_data/colocalization/

  • Data location in Sandbox: /finngen/library-green/finngen_R7/finngen_R7_analysis_data/colocalization/

22 June 2021

  • FinnGen R7 Autoreporting

  • Data location in Google cloud: gs://finngen-production-library-green/finngen_R7/finngen_R7_analysis_data/autoreporting/

  • Data location in Sandbox: /finngen/library-green/finngen_R7/finngen_R7_analysis_data/autoreporting/

22 June 2021

  • FinnGen R7 finemapping results

  • Data location in Google cloud: gs://finngen-production-library-green/finngen_R7/finngen_R7_analysis_data/finemap/

  • Data location in Sandbox: /finngen/library-green/finngen_R7/finngen_R7_analysis_data/finemap/

3 June 2021

  • FinnGen R7 infectious disease register corona (version 4.0)

  • This data contains 3496 (cumulative number) corona virus positive FinnGen participants

  • Data location in Sandbox: /finngen/library-red/finngen_R7/infectious_disease_register_corona_4.0/

10 May 2021

  • FinnGen R7 infectious disease register corona (version 3.0)

  • This data contains 3356 (cumulative number) corona virus positive FinnGen participants.

  • Data location in Sandbox: /finngen/library-red/finngen_R7/infectious_disease_register_corona_3.0/

3 May 2021

  • FinnGen R7 kidney disease register data (version 1.0)

  • Data location in Sandbox: /finngen/library-red/finngen_R7/kidney_disease_register_1.0

30 April 2021

  • FinnGen R7 detailed cancer data (version 1.0)

  • Data location in Sandbox: /finngen/library-red/finngen_R7/cancer_detailed_1.0/

29 April 2021

  • FinnGen R7 parental endpoint data (version 2.0)

  • Data location in Sandbox: /finngen/library-red/finngen_R7/parental_endpoint_2.0

27 April 2021

  • FinnGen R7 prs data (version 1.0)

  • Data location in Sandbox: /finngen/library-red/finngen_R7/prs_1.0/

27 April 2021

  • FinnGen R7 parental endpoint data (version 1.0)

  • Data location in Sandbox: /finngen/library-red/finngen_R7/parental_endpoint_1.0

26 April 2021

  • FinnGen R7 GRM data (version 1.0)

  • Data location in Sandbox: /finngen/library-red/finngen_R7/grm_1.0

20 April 2021

  • FinnGen R7 cov_pheno and cov data (version 1.0)

  • Data location in Sandbox: /finngen/library-red/finngen_R7/phenotype_4.0/

20 April 2021

  • FinnGen R7 chip data

  • In addition to the normal release dataset, it also includes high-quality filtered merged data.

  • Data location in Sandbox: /finngen/library-red/finngen_R7/chipd_1.0/

9 April 2021

  • FinnGen R7 corona data (version 2.0)

  • This data contains 2983 (cumulative number) corona virus positive FinnGen participants.

  • Data location in Sandbox: /finngen/library-red/finngen_R7/infectious_disease_register_corona_2.0/

30 March 2021

  • FinnGen R7 vaccination register data (version 1.0)

  • The data contains vaccination data of 201 707 FinnGen participants

  • Data location in Sandbox: /finngen/library-red/finngen_R7/vaccination_register_1.0/

21 March 2021

  • FinnGen R7 phenotype data (version 4.0) to Sandbox

  • Number of endpoints: 4137. Number of FinnGen IDs: 321 302. Updated the control definition file and made bug fixes to Endpointter for the & (AND) rule and NEVT.

  • Data location in Sandbox: /finngen/library-red/finngen_R7/phenotype_4.0

12 March 2021

  • FinnGen R7 reproductive history data (version 1.0)

  • The data contains reproductive history of 137 713 FinnGen mother participants. The data combines information from population register (DVV) (since 1953) and Medical birth register (since 1987).

  • Data location in Sandbox: /finngen/library-red/finngen_R7/birth_and_dvv_register_1.0

12 March 2021

  • FinnGen R7 phenotype data (version 3.0)

  • We have used updated endpoint and control definition files and made a few bug fixes to Endpointter. We have also filtered out some negative ages that were present in the previous first event file. Number of endpoints: 4137. Number of FinnGen IDs: 321 302.

  • Data location in Sandbox: /finngen/library-red/finngen_R7/phenotype_3.0

9 March 2021

  • FinnGen infectious disease register data (corona, version 1.0) released to Sandbox.

  • This data consists of 2560 (cumulative number) corona virus positive FinnGen participants.

  • Data location in Sandbox: /finngen/library-red/finngen_R7/infectious_disease_register_corona_1.0/

5 March 2021

  • FinnGen visual impairment register data released to Sandbox.

  • Data location in Sandbox: /finngen/library-red/finngen_R7/visual_impairment_register_1.0

2 March 2021

  • FinnGen parental causes of death data released to Sandbox

  • Data location in Sandbox: /finngen/library-red/finngen_R7/parental_causes_of_death_1.0/

2 March 2021

  • FinnGen R7 bgen data

  • Data location in Sandbox: /finngen/library-red/finngen_R7/bgen_2.0/

26 February 2021

  • Finnish nationwide breast and cervical cancer screening

  • Data location in Sandbox: /finngen/library-red/finngen_R7/cancer_screening_1.0

24 February 2021

  • FinnGen R7 bgen data

  • Data location in Sandbox: /finngen/library-red/finngen_R7/bgen_1.0/

23 February 2021

  • Plink converted DF7 genotypes

  • Data location in Sandbox: /finngen/library-red/finngen_R7/genotype_plink_2.0

19 February 2021

  • DF7 phenotypes

  • Statistics for genotype_2.0 and phenotype_2.0 are: Number of endpoints: 4 145. Number of FinnGen IDs in phenotype data: 321 302. Number of FinnGen IDs in genotype data: 321 464.

  • Data location in Sandbox: /finngen/library-red/finngen_R7/phenotype_2.0

5 February 2021

  • FinnGen R6 corona data (version 7.0)

  • This data contains 1515 (cumulative number) corona virus positive FinnGen participants.

  • Data location in Sandbox: /finngen/library-red/finngen_R6/corona_7.0/

12 January 2021

  • FinnGen R6 corona data (version 6.0)

  • Data location in Sandbox: /finngen/library-red/finngen_R6/corona_6.0/


