BigQuery Python - Case Study - Sex difference - Tornado plot
This case study shows how to plot sex differences across age bins within a FinnGen endpoint.
Location of the script
/finngen/library-green/scripts/code_snippets/codeSnippet_sexDifference_endpoint.py
You can copy the code from the explanation below or take it directly from the file itself.
For the samples in the F5_ALZHDEMENT endpoint, it is interesting to compare the age distributions of male and female samples. The first step is to extract the sex and age information; the second is to plot it.
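You can get the sex of each FINNGENID from the covariates table in the dataset sandbox_tools_r10 in the project finngen-production-library. The query below reads the column sex from the table covariates_r10_v1, joined on the column fid, and takes the age at each event from EVENT_AGE in the detailed longitudinal table. You can extract the information using the query below.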
# Import packages
import os, sys
from google.cloud import bigquery
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Connect to client
client = bigquery.Client()
# Define the query
query = """
WITH endpointSubSet AS (
    SELECT FINNGENID
    FROM `finngen-production-library.sandbox_tools_r10.endpoint_cohorts_r10_v1`
    WHERE ENDPOINT = 'F5_ALZHDEMENT'
)
SELECT ESS.FINNGENID AS FINNGENID,
       DL.EVENT_AGE AS EVENT_AGE,
       COV.sex AS SEX
FROM endpointSubSet AS ESS
JOIN `finngen-production-library.sandbox_tools_r10.finngen_r10_service_sector_detailed_longitudinal_v1` AS DL
    ON ESS.FINNGENID = DL.FINNGENID
JOIN `finngen-production-library.sandbox_tools_r10.covariates_r10_v1` AS COV
    ON ESS.FINNGENID = COV.fid
WHERE DL.CODE1 LIKE '%F00%' # ICD code of Alzheimer's Dementia endpoint
"""
# Job configuration
job_config = bigquery.QueryJobConfig()
# Run the query
query_result = client.query(query, job_config=job_config)
Save the results to a data frame
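The code for this step is not shown on this page, so the following is only a minimal sketch and not necessarily what the script in the library does. It relies on the `to_dataframe()` method of the BigQuery query job. The helper name `to_clean_dataframe` is hypothetical, and coercing EVENT_AGE to a number and dropping incomplete rows are assumptions.

```python
import pandas as pd


def to_clean_dataframe(query_job):
    """Convert a BigQuery query job into a pandas data frame.

    Hypothetical helper: the numeric coercion and the dropping of
    incomplete rows are assumptions, not part of the original script.
    """
    # QueryJob.to_dataframe() waits for the job and downloads the rows
    df = query_job.to_dataframe()
    # Make sure the age is numeric; values that cannot be parsed become NaN
    df["EVENT_AGE"] = pd.to_numeric(df["EVENT_AGE"], errors="coerce")
    # Drop rows that lack the age or the sex, since they cannot be binned
    df = df.dropna(subset=["EVENT_AGE", "SEX"])
    return df
```

With the job created above, this would be called as `df = to_clean_dataframe(query_result)`, after which `df.head()` gives a quick check of the columns FINNGENID, EVENT_AGE and SEX.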
Before plotting, we need to create age bins for males and females
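One possible way to build the bins is sketched below with `pandas.cut`. Several things here are assumptions rather than facts from the original script: the helper name `make_age_bins`, the 10-year bin width, the upper limit of 100 years, the SEX values `female` and `male`, and the choice to keep only the earliest matching event per individual (the query returns one row per event, so an individual can appear many times).

```python
import pandas as pd


def make_age_bins(df, bin_width=10, max_age=100, sexes=("female", "male")):
    """Count individuals per age bin and sex (hypothetical helper)."""
    # Assumption: one row per individual, using the earliest matching event
    first = df.groupby(["FINNGENID", "SEX"], as_index=False)["EVENT_AGE"].min()

    # Left-closed bins: [0, 10), [10, 20), ... labelled "0-9", "10-19", ...
    edges = list(range(0, max_age + bin_width, bin_width))
    labels = [f"{lo}-{lo + bin_width - 1}" for lo in edges[:-1]]
    binned = pd.cut(first["EVENT_AGE"], bins=edges, labels=labels, right=False)

    # Ages outside the bin range get no label and are dropped here
    first = first.assign(AGE_BIN=binned).dropna(subset=["AGE_BIN"])
    first["AGE_BIN"] = first["AGE_BIN"].astype(str)

    # Rows are age bins, columns are sexes; empty combinations are set to 0
    counts = first.groupby(["AGE_BIN", "SEX"]).size().unstack(fill_value=0)
    return counts.reindex(index=labels, columns=list(sexes), fill_value=0)
```

The result is a table with one row per age bin and one column per sex. If the sex column in your data uses other codes, pass them through the `sexes` argument.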
Plot the sex difference in different age bins
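A tornado plot places the two groups back to back on a shared axis of age bins. The sketch below uses plain matplotlib horizontal bars, which may differ from the plotting calls in the original script. It assumes a table of counts with one row per age bin and one column per sex, named `male` and `female`; the function name `plot_tornado` and the colours are hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import FuncFormatter


def plot_tornado(counts, left="male", right="female", title=""):
    """Draw back-to-back horizontal bars per age bin (hypothetical helper).

    `counts` is assumed to have age-bin labels as its index and one
    column of counts for each of `left` and `right`.
    """
    fig, ax = plt.subplots(figsize=(8, 6))
    # Negate one group so that its bars extend to the left of zero
    ax.barh(counts.index, -counts[left], color="tab:blue", label=left)
    ax.barh(counts.index, counts[right], color="tab:red", label=right)
    ax.axvline(0, color="black", linewidth=0.8)
    # Show absolute counts on both sides of the axis
    ax.xaxis.set_major_formatter(FuncFormatter(lambda x, _: f"{abs(x):g}"))
    ax.set_xlabel("Number of individuals")
    ax.set_ylabel("Age bin (years)")
    ax.set_title(title)
    ax.legend()
    return ax
```

The function returns the axes without displaying them, so call `plt.show()` or `ax.figure.savefig(...)` afterwards, depending on where you run the code.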