# Red data users

### What is <mark style="color:red;background-color:red;">red data</mark>?

The FinnGen data is constructed from the Finnish health and laboratory registries combined with individual genotype information. In FinnGen, we refer to all sensitive individual-level data as "<mark style="color:red;background-color:red;">**red**</mark>" to emphasize the need for extra care and security when handling it. The <mark style="color:red;background-color:red;">**red data**</mark> is accessible within the [Sandbox environment](https://docs.finngen.fi/faq/about-sandbox), where all the individual-level genotype, phenotype and other omics data can be found. The <mark style="color:red;background-color:red;">**red data**</mark> is pseudonymized, which means that it contains no information that directly identifies the study subjects.

### How to access <mark style="color:red;background-color:red;">red data</mark>?

<mark style="color:red;background-color:red;">**Red data**</mark> is located in the [Sandbox](https://docs.finngen.fi/faq/about-sandbox) cloud computing environment, which researchers can use to run their own analyses. To get access to the red FinnGen data, see [FinnGen access and accounts](https://docs.finngen.fi/faq/about-finngen-access-and-accounts). Only affiliates of FinnGen partner [organizations](https://www.finngen.fi/en/partners) can get access to the <mark style="color:red;background-color:red;">**red data**</mark> in the Sandbox. Access approval takes 1-2 months.

In FinnGen, <mark style="color:red;background-color:red;">**red data**</mark> is securely protected to prevent unauthorized access, loss, or damage. Access to the Sandbox is granted only after proper paperwork and passing a security exam. Your FinnGen account must have two-factor authentication (2FA) enabled to log in. For questions or concerns about data protection, contact the FinnGen Data Protection Officer at <dpo-finngen@helsinki.fi> or phone: +358 2941 24317 (mobile: +358 50 4793618).

Once you have been granted access to FinnGen <mark style="color:red;background-color:red;">**red data**</mark>, an interactive virtual machine (IVM) will be created for you within your organization's Sandbox. This IVM, accessible via a web browser, is a Unix machine with a graphical interface hosted in Google Cloud.

### Using <mark style="color:red;background-color:red;">red data</mark>

To conduct research in the [FinnGen Sandbox](https://docs.finngen.fi/faq/about-sandbox), it is essential to adhere to the [FinnGen Scientific Plan](https://www.finngen.fi/en/members/document/218) and its amendments ([FinnGen 2 Scientific Plan](https://www.finngen.fi/en/members/document/217) and [FinnGen 3 Scientific Plan](https://www.finngen.fi/en/members/document/1354)). In summary, the main goal of FinnGen is to better understand how health and disease change over time, interpret genetic signals, and develop personalized medicine and new analytical methods. The FinnGen scientific director and scientific committee oversee research using <mark style="color:red;background-color:red;">**red data**</mark> through the [FinnGen analysis proposal](https://docs.finngen.fi/finngen-data-specifics/about-analysis-proposals). An analysis proposal is not mandatory for operating within the Sandbox, but it is required to download results. Please also familiarize yourself with the FinnGen [1-year exclusivity period](https://docs.finngen.fi/publishing-finngen-results/genemal-policy-of-1-yr-exclusivity-period) policy and [citing](https://docs.finngen.fi/faq/about-public-releases) guidelines.

FinnGen aims to group similar research under a single analysis proposal, granting only one analysis right for similar studies. You can check the active analysis proposals in the FinnGen appsheet, and then apply for an [analysis proposal](https://docs.finngen.fi/finngen-data-specifics/about-analysis-proposals).

If you suspect a data breach, report it immediately using the [online reporting form](https://elomake.helsinki.fi/lomakkeet/103627/lomake.html) available in the members’ area or by [contacting the DPO directly](https://docs.finngen.fi/data-protection-and-security). Examples of data breaches include unauthorized access to <mark style="color:red;background-color:red;">**red data**</mark>, <mark style="color:red;background-color:red;">**red data**</mark> ending up outside the FinnGen Sandbox, and sharing <mark style="color:red;background-color:red;">**red data**</mark> in presentations or manuscripts. If you lose your @finngen.fi credentials or suspect they have been compromised, contact <finngen-servicedesk@helsinki.fi> immediately. Also contact the service desk when you no longer need your account.

By following these guidelines, you ensure that your research complies with FinnGen's standards and that you have access to the necessary resources and support.

### <mark style="color:red;background-color:red;">Red data</mark> tools

The FinnGen Sandbox is a secure, scalable environment for accessing individual-level data. It operates in a [web browser](https://sandbox.finngen.fi/) or via [an application](https://finngen.gitbook.io/finngen-handbook/working-in-the-sandbox/quirks-and-features/using-sandbox-as-a-chrome-application-full-screen-mode), ensuring data security and compliance with privacy regulations. Each FinnGen partner has its own Sandbox, where members can use individual virtual machines (IVMs) for research.

The Sandbox remains open for 24 hours by default and supports the R and Python programming languages. Analyses can be run in IVMs or, for large tasks, using [FinnGen Pipelines](https://docs.finngen.fi/working-in-the-sandbox/which-tools-are-available/pipelines). Costs vary: a GWAS run costs 3-10 euros, and storage costs 0.03 € per gigabyte per month. Information on costs can be found under [Billing information and where to find more details](https://docs.finngen.fi/working-in-the-sandbox/billing-information-and-where-to-find-more-details).
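As a rough illustration of the figures above, the following Python sketch estimates a monthly cost range from a storage volume and a number of GWAS runs. The function name and the example inputs are hypothetical, the prices are simply the ones quoted in this section, and other charges (for example IVM compute time) are not included, so treat the result as an approximation and check the billing page for current details.

```python
# Prices quoted in this section; they are assumptions here and may change.
STORAGE_EUR_PER_GB_MONTH = 0.03
GWAS_RUN_EUR_RANGE = (3.0, 10.0)  # (low, high) cost of a single GWAS run


def estimate_monthly_cost(storage_gb: float, gwas_runs: int) -> tuple[float, float]:
    """Return a (low, high) monthly cost estimate in euros.

    Hypothetical helper: covers only storage and GWAS runs.
    """
    if storage_gb < 0 or gwas_runs < 0:
        raise ValueError("storage_gb and gwas_runs must be non-negative")
    storage_cost = storage_gb * STORAGE_EUR_PER_GB_MONTH
    low = storage_cost + gwas_runs * GWAS_RUN_EUR_RANGE[0]
    high = storage_cost + gwas_runs * GWAS_RUN_EUR_RANGE[1]
    return round(low, 2), round(high, 2)


# Example inputs (made up): 500 GB stored and 4 GWAS runs in one month.
low, high = estimate_monthly_cost(storage_gb=500, gwas_runs=4)
print(f"Estimated monthly cost: {low:.2f}-{high:.2f} EUR")
```

With these example inputs, storage accounts for 15 euros and the four GWAS runs for 12-40 euros.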

Sensitive individual-level data must not be screenshotted or transferred outside the Sandbox. To protect the data, text can be copied into the Sandbox but not out of it. Data sharing within organizations is possible via the "red" bucket.

[Files can be uploaded via Google Cloud](https://finngen.gitbook.io/finngen-handbook/working-in-the-sandbox/quirks-and-features/how-to-upload-to-your-own-ivm-via-finngen-green) and downloaded after verification. Only aggregate-level <mark style="color:green;background-color:green;">green data</mark> can be exported.

<figure><img src="https://3072695768-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MhYL0UTLjqsuIdK0SSO%2Fuploads%2FsP1oFY39GvKLb7rj84t1%2Fsandbox.png?alt=media&#x26;token=034721ee-830b-4aad-b542-8aa9be7ce724" alt=""><figcaption><p>FinnGen Sandbox architecture</p></figcaption></figure>

FinnGen data includes [phenotype](https://finngen.gitbook.io/finngen-handbook/finngen-data-specifics/red-library-data-individual-level-data/what-phenotype-files-are-available-in-sandbox-1/detailed-longitudinal-data), [laboratory](https://finngen.gitbook.io/finngen-handbook/finngen-data-specifics/red-library-data-individual-level-data/what-phenotype-files-are-available-in-sandbox-1/kanta-lab-values/data), [genomic](https://finngen.gitbook.io/finngen-handbook/finngen-data-specifics/red-library-data-individual-level-data/genotype-data/types-of-genotype-files-available) and other omic data stored in specific directories in the Sandbox. Users with access to <mark style="color:red;background-color:red;">**red data**</mark> can find these files in the <mark style="color:red;background-color:red;">**red**</mark> and <mark style="color:green;background-color:green;">**green**</mark> libraries. Check out this short video about [FinnGen Sandbox architecture: libraries, buckets and data](https://vimeo.com/625400442/3d42458442?share=copy).

Many of the <mark style="color:red;background-color:red;">**red data**</mark> types are also available in the [FinnGen BigQuery database](https://docs.finngen.fi/working-in-the-sandbox/which-tools-are-available/bigquery-relational-database), which is integrated with the Sandbox. This serverless data warehouse supports, for example, efficient SQL queries. A list of additional tools available in the Sandbox can be found [here](https://finngen.gitbook.io/finngen-handbook/tool-catalog).

