New to FinnGen
FinnGen is a public-private collaborative research project initiated in 2017. The study combines genomics, omics and health data from 520,210 individuals to understand human diseases and traits. In FinnGen all research must follow the and its amendments ( and). Data use is limited to research outlined in the plan, with analyses requiring a clear scientific purpose and potential for publishable results. To read more about FinnGen, please see the and the .
Genetic data:
All 520,210 FinnGen study subjects have undergone genome-wide genotyping. About 450,000 were genotyped using , while ~70,000 "" originate primarily from National Institute for Health and Welfare (THL) biobank samples that were genotyped before FinnGen using various Illumina GWAS arrays.
To enhance utility, all samples were imputed using a (≈8,700 individuals), yielding inferred genomes with ~21 million variants per individual. All genotype data is in human genome build GRCh38/hg38.
Additionally, FinnGen includes "legacy next-generation sequencing (NGS) variant" data from and ~2,000 whole-genome sequenced (WGS) study subjects, primarily from the THL biobank.
Health register data:
comprises detailed, harmonized longitudinal records from multiple Finnish registries, capturing health events, drug purchases, and hospitalizations for all participants. The majority of the health register data is available for all 520,210 individuals. provide additional information. , obtained from Finland’s national Kanta register, includes lab test results from public and private healthcare providers.
Other phenotype and biological data:
The bulk of the data in FinnGen consists of genotypes and health register data, which are available for all FinnGen study subjects. However, over the course of the project, FinnGen is expanding to generate other data types for subsets of its participants, including related to study subjects with particular diseases and , such as , and data.
Green data is aggregate-level data that does not contain any information on individual FinnGen participants. By green data we most commonly mean analysis results, including FinnGen Core Analysis results (for instance, GWAS summary statistics) delivered to the so-called "green library" by the FinnGen Analysis team. We call this data “green” to signal that it is “safe”: no individuals can be identified from it.
FinnGen holds sensitive health and genetic information about its participants. Keeping this data safe is essential both for maintaining trust and for GDPR compliance. Everyone at FinnGen is responsible for keeping it private and secure.
FinnGen Users' meetings are held once a month (usually in the last week of the month) on Tuesdays at 9:05–9:55 AM (EST) / 4:05–4:55 PM (HEL) via Zoom.
The FinnGen office hours (Q&A) are held on Zoom the day after the monthly FinnGen Users' meeting. The European session runs from 1–2 PM Helsinki time, and the US session runs from 1–2 PM Boston time.
FinnGen data is categorized into so-called “” data that are accessible to researchers from who have requested access.
Red data is individual-level genotype or phenotype data. It is located in the Sandbox cloud computing environment, where researchers can run their own analyses when individual-level genotypes and/or phenotypes are required as input. We call this data "red" to remind users that working with it always requires extra care and security. Each partner/research group that has a Sandbox must cover its own computing and storage costs. Information on costs can be found under .
To learn more about the green and red data and what you can do with them, please see the and user’s quick guides.
To get access to the green or red FinnGen data, see the . Approval takes up to 7 working days for green data and 1 to 2 months for red data. Green data is accessible to anyone with a @finngen.fi account, and the data can be downloaded directly to the user's local machine. For red data, you need access to the Sandbox in addition to a @finngen.fi account. You are also required to take a data security exam once a year to make sure that data related to the study subjects is not mishandled. Please read more in the FinnGen Handbook, section .
Downloading results from the Sandbox and proceeding to publication requires submitting an . This ensures that ongoing studies do not overlap and that they follow the FinnGen Scientific Plan. See also the requirements for in your manuscript.
FinnGen has Task Forces and Interest Groups that concentrate on studying the progression of . All FinnGen Partner researchers are welcome to join the Task Forces or the Interest Groups. If you would like to join one of them, please email .
Besides the extensive descriptions in the , all new users are automatically added to the FinnGen Slack workspace, which has multiple channels where the FinnGen community can help each other.
Finnish academics can also contact their , i.e. support persons: Jaakko Tyrmi (University of Oulu; jaakko.tyrmi@oulu.fi), Tero Sievänen (University of Jyväskylä; tero.sievanen@uef.fi), Timo Pohjonen (University of Eastern Finland; timo.pohjonen@hyvaks.fi), Vidal Fey (University of Tampere; vidal.fey@tuni.fi), and Aleksi Winstén (University of Turku; finngen-support@utu.fi).
If you would like to receive calendar invitations for the above-mentioned meetings or if you have any questions regarding FinnGen, please email .