# How to run survival analyses

Here we describe how to perform survival analyses (Cox models, Kaplan-Meier curves, etc.) using ready-made time-to-event phenotypes, and show an example of how to create your own time-to-event phenotype using BigQuery.

To run a Cox model, you need **2 phenotype columns:**

1. `EVENT` column, indicating the event status (0/1), and
2. `EVENT_AGE` column, indicating the survival time (must be >0 for all samples). **NOTE:** `EVENT_AGE` is needed for **non-events as well**! For them, it can be the survival time until the end of follow-up, i.e. the censoring time.
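
The two requirements above can be checked before fitting anything. The following is a minimal sketch, assuming a whitespace-separated file with a header row; the file name `pheno_toy.txt` and its contents are hypothetical and only serve as an illustration.

```shell
# Toy phenotype file (hypothetical IDs and values, whitespace-separated).
printf 'FINNGENID EVENT EVENT_AGE\nID1 1 54.2\nID2 0 71.9\nID3 0 0\n' > pheno_toy.txt

# Locate the two columns by header name, then list rows where
# EVENT is not 0/1 or EVENT_AGE is not a number greater than 0.
awk 'NR == 1 { for (i = 1; i <= NF; i++) { if ($i == "EVENT") e = i; if ($i == "EVENT_AGE") a = i }; next }
     ($e != 0 && $e != 1) || ($a + 0 <= 0) { print "line " NR ": " $0; bad++ }
     END { printf "%d invalid row(s)\n", bad }' pheno_toy.txt > pheno_check_toy.txt

cat pheno_check_toy.txt
```

In the toy file, the third sample has `EVENT_AGE` equal to 0, so it is reported as invalid. Non-numeric values such as `NA` are also flagged, because awk treats them as 0 in the numeric comparison.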

The input file for the Cox model should be formatted like this:

![](https://3072695768-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MhYL0UTLjqsuIdK0SSO%2Fuploads%2Fgit-blob-5ac4c46221b5c87d81f0c2a0c8f9487827e38bac%2Fimage%20\(758\).png?alt=media)

### Run a Cox model for ready-made time-to-event phenotypes

In the [endpoint file](https://docs.finngen.fi/finngen-data-specifics/red-library-data-individual-level-data/what-phenotype-files-are-available-in-sandbox-1/endpoint-and-endpoint-longitudinal-data#endpoint-file), core endpoints are already in the format needed to run a Cox model **from birth to the first event of the endpoint**. In this example, we fit a Cox model for survival from birth until the endpoint `I9_CORATHER`.

Since the endpoint file is very large, you can first filter it down to the columns you need using awk. The following example keeps the columns `FINNGENID`, `I9_CORATHER`, `I9_CORATHER_AGE` and `SEX` (for the gender-stratified analysis):

`zcat /finngen/library-red/finngen_R11/phenotype_1.0/data/finngen_R11_endpoint_1.0.txt.gz | awk -v col1=FINNGENID -v col2=I9_CORATHER -v col3=I9_CORATHER_AGE -v col4=SEX 'NR==1{for(i=1;i<=NF;i++){if($i==col1)c1=i;if ($i==col2)c2=i;if ($i==col3)c3=i;if ($i==col4)c4=i;}} NR>=1{print $c1 " " $c2 " " $c3 " " $c4}' >I9_CORATHER_ages.txt`
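
To see what the column filtering does without touching the full endpoint file, the same idea can be tried on a small stand-in. This is a sketch under assumptions: the toy file `endpoint_toy.txt.gz`, its values and the extra `OTHER` column are hypothetical, and the input is assumed to be tab-separated. Unlike the command above, it takes the column names as one comma-separated list and stops with an error if a requested column is not found in the header.

```shell
# Toy stand-in for the endpoint file (hypothetical values, assumed tab-separated).
printf 'FINNGENID\tSEX\tI9_CORATHER\tI9_CORATHER_AGE\tOTHER\nID1\tfemale\t1\t54.2\t0\nID2\tmale\t0\t71.9\t1\n' \
  | gzip > endpoint_toy.txt.gz

# Map header names to column positions, then print the wanted columns
# (space-separated, header included) in the order given in "cols".
# gzip -dc is used here as a portable equivalent of zcat.
gzip -dc endpoint_toy.txt.gz | awk -F'\t' -v cols="FINNGENID,I9_CORATHER,I9_CORATHER_AGE,SEX" '
  BEGIN { n = split(cols, want, ",") }
  NR == 1 {
    for (i = 1; i <= NF; i++) idx[$i] = i
    for (j = 1; j <= n; j++)
      if (!(want[j] in idx)) { print "missing column: " want[j] > "/dev/stderr"; exit 1 }
  }
  {
    line = $(idx[want[1]])
    for (j = 2; j <= n; j++) line = line " " $(idx[want[j]])
    print line
  }' > I9_CORATHER_ages_toy.txt

cat I9_CORATHER_ages_toy.txt
```

The output keeps the header line, so the resulting file can be read into R with column names intact.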

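Then in RStudio, once you have read the file into a data frame `d` (for example with `d <- read.table("I9_CORATHER_ages.txt", header = TRUE)`), you can fit the Cox model with the following command (R package [survival](https://cran.r-project.org/web/packages/survival/survival.pdf) required: `library(survival)`):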

`fit<-survfit(coxph(Surv(I9_CORATHER_AGE, I9_CORATHER)~1, data = d))`

You can plot the corresponding Kaplan-Meier-style survival curve with:

`plot(fit)`

A gender-stratified model can be fitted with:

`fit_gender_str<-survfit(coxph(Surv(I9_CORATHER_AGE, I9_CORATHER)~strata(SEX), data = d))`

And the corresponding Kaplan-Meier-style plot:

`plot(fit_gender_str)`
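
Before interpreting the stratified curves, it can help to check how many samples and events fall into each `SEX` stratum. The following is a minimal sketch on a toy file laid out like the filtered output above (columns `FINNGENID`, `I9_CORATHER`, `I9_CORATHER_AGE`, `SEX`); the file name `ages_toy.txt`, the IDs and the `female`/`male` values are hypothetical.

```shell
# Toy filtered file (hypothetical values), same column order as the filtered output.
printf 'FINNGENID I9_CORATHER I9_CORATHER_AGE SEX\nID1 1 54.2 female\nID2 0 71.9 male\nID3 1 60.0 male\nID4 0 80.1 female\nID5 0 45.3 male\n' > ages_toy.txt

# Skip the header, then count samples and sum the 0/1 event column per SEX value.
awk 'NR > 1 { n[$4]++; ev[$4] += $2 }
     END { for (s in n) printf "%s n=%d events=%d\n", s, n[s], ev[s] }' ages_toy.txt \
  | sort > sex_counts_toy.txt

cat sex_counts_toy.txt
```

A stratum with very few events gives an unstable survival curve, so a quick count like this is a cheap sanity check.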

