> For the complete documentation index, see [llms.txt](https://docs.finngen.fi/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.finngen.fi/working-in-the-sandbox/which-tools-are-available/miscellaneous-helper-scripts-tools/tool-to-annotate-variants-with-rsids-1/bigquery-python-case-study-patient-timeline-scatter-plot.md).

# BigQuery Python - Case Study - Patient Timeline - Scatter plot

In this we detail a scenario for how you can plot comorbidities of a FinnGen endpoint.

**Location of the script**

`/finngen/library-green/scripts/code_snippets/codeSnippet_patientTimeline.py`

You can copy paste from below explanation or take the code directly from the file itself.

Patient timeline in FinnGen from starting event recorded to the last/latest event.

The query is very simple as to extract whole information of a patient from the detailed longitudinal table. However, the detailed longitudinal table contains ICD, drug, reimbursement and other codes but no information what that code means. To get the proper mapping of what the overall theme of a each event means, we developed a code mapper SQL script with details [here](https://github.com/FINNGEN/FinnGenUtilsR/blob/develop/inst/sql/append_info_to_longitudinal_data.sql). The same SQL script is also exposed as JAVASCRIPT function **codeProcess** within Sandbox. Also, the selection of a FINNGENID is randomized just for this case study purpose.

```
# 
import os, sys
from google.cloud import bigquery
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

# Connect to client
client = bigquery.Client()

# Query
query = """  WITH temp AS (
                     SELECT ssdl.*,
                            # The function sandbox.codeProcess returns processed CODE1, CODE2 and CODE3 along with vocabulary_id
                            sandbox.codeProcess(TO_JSON_STRING(ssdl), 'CODE1_CODE2', 'MORPO_BEH_TOPO', 'REIMB', 'ATC', 5, 5, 5, 7, 5).*
                     FROM `finngen-production-library.sandbox_tools_r10.finngen_r10_service_sector_detailed_longitudinal_v1` AS ssdl
                     # Here we are randomly selecting a FinnGen patient FINNGENID. You can replace this with a particular FINNGENID
                     WHERE FINNGENID = (
                                        SELECT DISTINCT FINNGENID
                                        FROM `finngen-production-library.sandbox_tools_r10.finngen_r10_service_sector_detailed_longitudinal_v1` 
                                        WHERE RAND() < 10/(SELECT COUNT(*) FROM `finngen-production-library.sandbox_tools_r10.finngen_r10_service_sector_detailed_longitudinal_v1`) 
                                        LIMIT 1
                                       )
                     # The results of patient are ordered by the date
                     ORDER BY ssdl.APPROX_EVENT_DAY, ssdl.SOURCE
              )
              # The processed codes are mapped to English names
              SELECT t.*,fgc.concept_class_id,fgc.name_en,fgc.name_fi
              FROM temp AS t
              LEFT JOIN `finngen-production-library.medical_codes.fg_codes_info_v1` as fgc
              ON t.vocabulary_id = fgc.vocabulary_id AND
                 t.FG_CODE1 IS NOT DISTINCT FROM fgc.FG_CODE1 AND
                 t.FG_CODE2 IS NOT DISTINCT FROM fgc.FG_CODE2 AND
                 t.FG_CODE3 IS NOT DISTINCT FROM fgc.FG_CODE3
	 """

# Job configuration
job_config = bigquery.QueryJobConfig()

# Run the queries
query_result = client.query(query,job_config=job_config)
```

Save the query result to dataframe

```
columnNames = list(list(query_result)[0].keys())
temporaryList = []
for row in query_result:
      temporaryList.append(list(row))
# Convert the temporaryList to dataframe using pandas package
query_result_dataframe = pd. DataFrame(temporaryList, columns = columnNames)
# Change the data type of date column to datetime format
query_result_dataframe['APPROX_EVENT_DAY'] = pd.to_datetime(query_result_dataframe['APPROX_EVENT_DAY'])
```

Use the patient information stored in dataframe to plot a patient timeline

```
g1 = sns.scatterplot(x="APPROX_EVENT_DAY",y="CODE1",data=query_result_dataframe,hue="vocabulary_id")
g1.set(yticklabels=[])
g1.set(ylabel=None)
g1.tick_params(left=False)
g1.legend(loc='center left', bbox_to_anchor=(1, 0.5), ncol=1)
plt.savefig('/home/ivm/patientTimeline.png')
```


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://docs.finngen.fi/working-in-the-sandbox/which-tools-are-available/miscellaneous-helper-scripts-tools/tool-to-annotate-variants-with-rsids-1/bigquery-python-case-study-patient-timeline-scatter-plot.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
