Red data users
The FinnGen data is constructed from the Finnish health and laboratory registries combined with individual genotype information. In FinnGen, we refer to all sensitive individual-level data as "red" to emphasize the need for extra care and security when handling it. The red data is accessible within the FinnGen Sandbox, where all the individual-level genotype, phenotype and other omics data can be found. The red data is pseudonymized, which means that it contains no information that directly identifies the study subjects.
Red data is located in a cloud computing environment, which researchers can use to run their own analyses. To get access to the red FinnGen data, see the . Only FinnGen partner affiliates can get access to the red data in the Sandbox. Access approval takes 1-2 months.
In FinnGen, red data is securely protected to prevent unauthorized access, loss, or damage. Access to the Sandbox is granted only after proper paperwork and passing a security exam. Your FinnGen account must have two-factor authentication (2FA) enabled to log in. For questions or concerns about data protection, contact the FinnGen Data Protection Officer at dpo-finngen@helsinki.fi or phone: +358 2941 24317 (mobile: +358 50 4793618).
Once granted access to FinnGen red data, an interactive virtual machine (IVM) will be created for you within your organization's Sandbox. This IVM, accessible via a web browser, is a Unix machine with a graphical interface hosted in the Google Cloud.
To conduct research in the , it is essential to adhere to the and its amendments ( and). In summary, the main goal of FinnGen is to better understand how health and disease change over time, interpret genetic signals, and develop personalized medicine and new analytical methods. The FinnGen scientific director and scientific committee oversee research using red data through the . An analysis proposal is not mandatory for working within the Sandbox, but it is required for downloading results. Please also familiarize yourself with FinnGen guidelines regarding policy and guidelines.
FinnGen aims to group similar research under a single analysis proposal, granting only one analysis right for similar studies. You can check the active analysis proposals in the FinnGen appsheet, and then apply for the
If you suspect a data breach, report it immediately using the form available in the members' area or by . Examples of data breaches include unauthorized access to red data, moving red data outside the FinnGen Sandbox, and sharing red data in presentations or manuscripts. If you lose your @finngen.fi credentials or suspect they have been compromised, contact finngen-servicedesk@helsinki.fi immediately. Also contact the service desk when you no longer need your account.
By following these guidelines, you can ensure your research is compliant with FinnGen's standards and secure access to necessary resources and support.
The FinnGen Sandbox is a secure, scalable environment for accessing individual-level data. It operates in a or via, ensuring data security and compliance with privacy regulations. Each FinnGen partner has its own Sandbox, where members can use individual virtual machines (IVMs) for research.
The Sandbox remains open for 24 hours by default and supports the R and Python programming languages. Analyses can be run in IVMs or using for large tasks. Costs vary: a GWAS run costs 3-10 euros, and storage costs 0.03 euros per gigabyte per month. Information on costs can be found under .
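As a rough illustration of how the cost figures above combine, the following Python sketch estimates a monthly bill from a storage volume and a number of GWAS runs. It uses only the two figures quoted in the text; the function name `estimate_monthly_cost` and the example inputs are hypothetical, and other cost items (such as IVM running time) are not covered, so treat the result as a lower-bound approximation rather than an official price calculation.

```python
from typing import Tuple

# Figures quoted in the text above; everything else here is illustrative.
STORAGE_EUR_PER_GB_MONTH = 0.03
GWAS_EUR_MIN = 3.0   # lower end of the quoted cost of one GWAS run
GWAS_EUR_MAX = 10.0  # upper end of the quoted cost of one GWAS run


def estimate_monthly_cost(storage_gb: float, gwas_runs: int) -> Tuple[float, float]:
    """Return a (low, high) estimate in euros for one month.

    Hypothetical helper: covers storage and GWAS runs only.
    """
    if storage_gb < 0 or gwas_runs < 0:
        raise ValueError("storage_gb and gwas_runs must be non-negative")
    storage = storage_gb * STORAGE_EUR_PER_GB_MONTH
    low = storage + gwas_runs * GWAS_EUR_MIN
    high = storage + gwas_runs * GWAS_EUR_MAX
    return round(low, 2), round(high, 2)


if __name__ == "__main__":
    # Example inputs are made up: 500 GB stored and 4 GWAS runs in a month.
    low, high = estimate_monthly_cost(storage_gb=500, gwas_runs=4)
    print(f"Estimated monthly cost: {low:.2f}-{high:.2f} EUR")
```

Returning a range rather than a single number reflects that the cost of a GWAS run is itself quoted as a range.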
Sensitive individual-level data must not be screenshotted or transferred outside the Sandbox. Text can be copied into the Sandbox but not out of it, which protects data security. Data sharing within an organization is possible via the "red" bucket.
and downloaded after verification. Only aggregate-level green data can be exported.
FinnGen data includes,, and other omic data stored in specific directories in the Sandbox. Users with access to red data can find these files in the red and green libraries. Check out this short video about .
Many of the red data types are also available in the , integrated with the Sandbox. This serverless data warehouse supports, for example, efficient SQL queries. A list of additional tools available in the Sandbox can be found from.
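To give a feel for the kind of SQL query mentioned above, here is a minimal sketch that counts distinct individuals per code. It runs against Python's built-in `sqlite3` as a local stand-in, not against the Sandbox data warehouse, and the table name `events`, its columns, and all rows are hypothetical synthetic examples rather than the actual FinnGen schema or data.

```python
import sqlite3

# Synthetic rows only; the table layout is a hypothetical illustration,
# not the real FinnGen schema.
ROWS = [
    ("ID0001", "E11", 2001),
    ("ID0001", "I10", 2005),
    ("ID0002", "E11", 2010),
    ("ID0003", "I10", 2012),
    ("ID0003", "I10", 2015),
    ("ID0004", "I10", 2016),
]


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding the synthetic rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (person_id TEXT, code TEXT, year INTEGER)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", ROWS)
    conn.commit()
    return conn


def count_individuals_per_code(conn: sqlite3.Connection) -> list:
    """Count distinct individuals per code, producing an aggregate-level summary."""
    query = """
        SELECT code, COUNT(DISTINCT person_id) AS n_individuals
        FROM events
        GROUP BY code
        ORDER BY code
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for code, n_individuals in count_individuals_per_code(conn):
        print(code, n_individuals)
    conn.close()
```

The `COUNT(DISTINCT ...)` with `GROUP BY` pattern is standard SQL, so the same query shape should carry over to other SQL engines, although table names and dialect details will differ.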