# BigQuery Python - Case Study - Comorbidity - Upset plot

This case study shows how to **plot the comorbidities** of a FinnGen endpoint.

**Location of the script**

`/finngen/library-green/scripts/code_snippets/codeSnippet_comorbidities_endpoint.py`

You can copy and paste the code from the explanation below or take it directly from the file.

For the F5\_ALZHDEMENT endpoint, comorbidities include type 2 diabetes (T2D), cardiovascular diseases (I9\_CVD), depression (F5\_DEPRESSIO), and gastrointestinal diseases (K11\_GIDISEASES).

We can extract the patients of each of these endpoints and see how the F5\_ALZHDEMENT patients overlap with the comorbid endpoints.

**NOTE: This currently works only in the Anaconda environment, so use the Docker image. Run the command below first and then start Python.**

```
docker run -v /home/ivm:/home/ivm -it eu.gcr.io/finngen-sandbox-v3-containers/anaconda_python/anaconda3:1.0 /bin/bash
```

The queries are simple because each one only extracts the patients of a single endpoint. The details are shown below.

```
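# Imports
import os
import pandas_gbq
# Set the environment variables needed for plotting inside the container.
# os.system('export ...') would only change a short-lived subshell,
# so set them on the current Python process instead.
os.environ['QT_PLUGIN_PATH'] = '/home/ivm/anaconda3/plugins'
os.environ['FONTCONFIG_PATH'] = '/home/ivm/anaconda3/etc/fonts'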
from google.cloud import bigquery
import pandas as pd
from upsetplot import from_contents
from upsetplot import UpSet

# Connect to client
client = bigquery.Client()

# Queries
query_alzhdem = """ SELECT FINNGENID
                    FROM `finngen-production-library.sandbox_tools_r10.endpoint_cohorts_r10_v1`
                    WHERE ENDPOINT = 'F5_ALZHDEMENT'
                """
query_type2d  = """ SELECT FINNGENID
                    FROM `finngen-production-library.sandbox_tools_r10.endpoint_cohorts_r10_v1`
                    WHERE ENDPOINT = 'T2D'
                """
query_cvd     = """ SELECT FINNGENID
                    FROM `finngen-production-library.sandbox_tools_r10.endpoint_cohorts_r10_v1`
                    WHERE ENDPOINT = 'I9_CVD'
                """
query_depress = """ SELECT FINNGENID
                    FROM `finngen-production-library.sandbox_tools_r10.endpoint_cohorts_r10_v1`
                    WHERE ENDPOINT = 'F5_DEPRESSIO'
                """
query_gids    = """ SELECT FINNGENID
                    FROM `finngen-production-library.sandbox_tools_r10.endpoint_cohorts_r10_v1`
                    WHERE ENDPOINT = 'K11_GIDISEASES'
                """

# Job configuration
job_config = bigquery.QueryJobConfig()

# Run the queries
query_result_alzhdem = client.query(query_alzhdem,job_config=job_config)
query_result_type2d  = client.query(query_type2d,job_config=job_config)
query_result_cvd     = client.query(query_cvd,job_config=job_config)
query_result_depress = client.query(query_depress,job_config=job_config)
query_result_gids    = client.query(query_gids,job_config=job_config)
```
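The five queries above differ only in the endpoint name, so they could also be generated from a list. The following is a minimal sketch of that idea, not part of the original script; the helper name `build_endpoint_query` and the `ENDPOINTS` list are hypothetical, while the table and endpoint names are taken from the queries above.

```python
# Table and endpoint names are copied from the queries above.
TABLE = "finngen-production-library.sandbox_tools_r10.endpoint_cohorts_r10_v1"
ENDPOINTS = ["F5_ALZHDEMENT", "T2D", "I9_CVD", "F5_DEPRESSIO", "K11_GIDISEASES"]


def build_endpoint_query(endpoint, table=TABLE):
    """Return the SQL that selects the FINNGENIDs of one endpoint.

    Hypothetical helper for illustration only.
    """
    # The endpoint is interpolated into the SQL string, so accept only
    # plain endpoint codes (letters, digits and underscores).
    if not endpoint.replace("_", "").isalnum():
        raise ValueError(f"Unexpected endpoint name: {endpoint!r}")
    return (
        "SELECT FINNGENID\n"
        f"FROM `{table}`\n"
        f"WHERE ENDPOINT = '{endpoint}'"
    )


# One query string per endpoint, keyed by endpoint name.
queries = {endpoint: build_endpoint_query(endpoint) for endpoint in ENDPOINTS}
print(queries["T2D"])
```

Each string in `queries` can then be passed to `client.query(...)` in the same way as the hand-written queries.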

Save the results to separate dataframes.

```
query_result_dataframe_alzhdem = query_result_alzhdem.to_dataframe()
query_result_dataframe_type2d  = query_result_type2d.to_dataframe()
query_result_dataframe_cvd     = query_result_cvd.to_dataframe()
query_result_dataframe_depress = query_result_depress.to_dataframe()
query_result_dataframe_gids    = query_result_gids.to_dataframe()
```

Combine the endpoint dataframes to get the overlap of FINNGENIDs.

```
comorbidEndpoints = from_contents({'AlzheimerDementia':query_result_dataframe_alzhdem['FINNGENID'].to_list(),
                                   'Type2Diabetes': query_result_dataframe_type2d['FINNGENID'].to_list(),
                                   'CardioVascularDiseases': query_result_dataframe_cvd['FINNGENID'].to_list(),
                                   'Depression': query_result_dataframe_depress['FINNGENID'].to_list(),
                                   'GastroIntestinalDisease': query_result_dataframe_gids['FINNGENID'].to_list()})
```
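To make the overlap idea concrete, the sketch below mimics in plain Python the kind of counting an UpSet plot displays: each ID is assigned to the exact combination of endpoints it belongs to, and the IDs per combination are counted. It is only an illustration under stated assumptions. The IDs are made-up toy values, not FinnGen data, and `membership_counts` is a hypothetical helper, not part of `upsetplot`.

```python
# Toy cohorts with made-up IDs (hypothetical, not real FinnGen data).
cohorts = {
    "AlzheimerDementia": {"ID1", "ID2", "ID3", "ID4"},
    "Type2Diabetes": {"ID2", "ID3", "ID5"},
    "Depression": {"ID3", "ID4", "ID6"},
}


def membership_counts(cohorts):
    """Count IDs per exact combination of cohorts they belong to."""
    counts = {}
    all_ids = set().union(*cohorts.values())
    for pid in all_ids:
        # The combination is the sorted tuple of cohort names containing pid.
        combo = tuple(sorted(name for name, ids in cohorts.items() if pid in ids))
        counts[combo] = counts.get(combo, 0) + 1
    return counts


counts = membership_counts(cohorts)
for combo, n in sorted(counts.items()):
    print(" & ".join(combo), n)
```

In this toy data every combination contains exactly one ID; with real cohorts the counts show which combinations of comorbid endpoints are most common among the patients.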

Draw an UpSet plot of the FINNGENID overlap among the endpoints.

```
from matplotlib import pyplot as plt

# You can experiment with more parameters of the UpSet plot.
# plot() returns the axes; keep them under a name that does not shadow pyplot.
axes = UpSet(comorbidEndpoints, subset_size = 'count', show_counts = True).plot()
# Save the current figure
plt.savefig('/home/ivm/AlzheimerDementia_Comorbidites_UpsetPlot.png')
```


