# BigQuery Connection – R

This is a short tutorial on **how to use BigQuery SQL to access the FinnGen register data in the sandbox using R.** Relational databases have many benefits over flat text files. For the end-user, the main benefit is the quick and optimized execution of complex and large data wrangling tasks using a simple language, the SQL. We can process large amounts of data without need for large amounts of memory or computing power as the server-side backend will do all the heavy-lifting. In FinnGen, reading the full detailed longitudinal data from a text file takes about half an hour, and requires the virtual machine with the largest memory capacity. In contrast, if you use BigQuery, you can filter and re-structure your data with SQL, and your R program receives swiftly just the data it needs, and the memory requirement depends on your data subset.

## Initalizing BigQuery connection

Suppose you'd like to know how many rows there are in the longitudinal data table. The following simple R program will output a single row, single column data frame containing the number of rows.

```r
library(bigrquery)

# projectid is your sandbox name (check the URL in your web browser)
# please see gargle and bigrquery documentation for the next two settings
projectid <- "fg-production-sandbox-4"
options(gargle_oauth_cache = FALSE)
bq_auth(scopes = "https://www.googleapis.com/auth/bigquery")

# SQL query as a multiline character string
sql <- paste0(
"SELECT COUNT(*) ",
    "FROM finngen-production-library.",
    "sandbox_tools_r10.finngen_r10_service_sector_detailed_longitudinal_v2"
    
# execute the query, download the result
tb <- bq_project_query(projectid, sql)
df <- bq_table_download(tb)

print(df)
```

To see which tables are available in BigQuery use the following.

```
sql<-"
SELECT table_schema, column_name, data_type
FROM finngen-production-library.sandbox_tools_r10.INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'finngen_r10_minimum_v1'
LIMIT 50
"

tb <- bq_project_query(projectid, sql)
df <- bq_table_download(tb)

df 
```

**More examples and detailed tutorial script's location in the Sandbox**

Below is a path to the R Markdown file in Sandbox. This R Markdown file can be converted to an html page using "knit" command. The R Markdown is a plain text file, so you can copy code snippets that you need.

`/finngen/library-green/scripts/code_snippets/BigQueryTemplates_release.Rmd`
