# Why does my IVM freeze while loading data into R/RStudio?

A common issue for Sandbox users is that their Interactive Virtual Machine (IVM) freezes when loading FinnGen data (phenotypes, summary statistics, genotypes, etc.) into R/RStudio. This happens because FinnGen data sets are large and can easily exhaust the IVM's available memory, especially on the Sandbox "Basic machine" with 1 CPU and 3.75 GB of memory. FinnGen data is usually stored in a compressed format and expands significantly when loaded into R/RStudio, so it consumes far more memory than the file size on disk suggests.

**Solution:** To avoid this issue, it is recommended to subset the FinnGen data before analysis, as users rarely need all diagnosis codes or all samples simultaneously. Here are some strategies to help manage this problem:

1. **Choose an Appropriate VM Size:** Select a VM configuration that suits the size of your data. Three different VM configurations are available for selection when you log in to Sandbox ([link](https://finngen.gitbook.io/finngen-handbook/working-in-the-sandbox/running-analyses-in-sandbox/managing-memory-in-sandbox-and-data-filtering-tips)).
2. **Subset the Data Before Loading:** Use shell scripting to subset the data before loading it into R/RStudio. This allows you to process the data one line at a time, minimizing memory consumption ([link](https://finngen.gitbook.io/finngen-handbook/working-in-the-sandbox/running-analyses-in-sandbox/managing-memory-in-sandbox-and-data-filtering-tips#filtering-in-terminal)).
3. **Use BigQuery to Load Data:** Start R/RStudio and load only a subset of data directly from the BigQuery database, rather than loading the entire file from the library ([link](https://finngen.gitbook.io/finngen-handbook/working-in-the-sandbox/which-tools-are-available/miscellaneous-helper-scripts-tools/bigquery-connection-r)).
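
The second strategy can be sketched in the terminal as follows. This is a minimal illustration, not an official FinnGen recipe: the file name `sample.txt.gz`, the column names, and the filter value `INPAT` are hypothetical stand-ins, so adjust them to the file you actually work with. The compressed file is streamed through `awk`, which handles one line at a time, so the full data set is never held in memory.

```shell
#!/bin/sh
# Hypothetical sample data standing in for a large compressed, tab-delimited file.
# Column names and values are illustrative only, not the real FinnGen schema.
printf 'ID\tSOURCE\tCODE\nA1\tINPAT\tI10\nA2\tPURCH\tC10AA\nA3\tINPAT\tE11\n' | gzip > sample.txt.gz

# Stream the decompressed file through awk: keep the header (NR == 1) and the
# rows whose second column equals "INPAT". awk processes one line at a time.
gzip -dc sample.txt.gz | awk -F'\t' 'NR == 1 || $2 == "INPAT"' > subset.txt

# Keep only the columns needed for the analysis (here: columns 1 and 3).
cut -f1,3 subset.txt > subset_small.txt

# Show the reduced file that would later be loaded into R.
cat subset_small.txt
```

The reduced file can then be read into R/RStudio, for example with base R's `read.delim("subset_small.txt")`, at a fraction of the memory cost of the full file.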

**Monitoring:** It's also important to monitor memory usage in R/RStudio during your analysis. Be aware that duplicating data frames will double memory usage ([link](https://finngen.gitbook.io/finngen-handbook/working-in-the-sandbox/running-analyses-in-sandbox/managing-memory-in-sandbox-and-data-filtering-tips#memory-managing-in-rstudio)).

