# How to run survival analyses

Here we describe how to perform survival analyses (Cox models, Kaplan-Meier curves, etc.) using ready-made time-to-event phenotypes, and show an example of how to create your own time-to-event phenotype using BigQuery.

To run a Cox model, you need **two phenotype columns:**

1. `EVENT` column, indicating the event status (0/1), and
2. `EVENT_AGE` column, indicating the survival time (must be >0 for all samples). **NOTE:** `EVENT_AGE` is needed for **non-events as well** (for them, it can be the survival time until the end of follow-up)!

The input file for the Cox model should be formatted like this:

![](/files/FqIF6FiykIPGCV2jeDqc)
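The two requirements above (`EVENT` is 0 or 1, `EVENT_AGE` is greater than 0 for every sample) can be checked before fitting anything. The following is a minimal sketch, not an official FinnGen tool: the file name `pheno.txt`, the whitespace-separated layout and all values are assumptions made up for illustration.

```shell
# Toy phenotype file: column names follow the text above, values are made up.
tmpdir=$(mktemp -d)
cat > "$tmpdir/pheno.txt" <<'EOF'
FINNGENID EVENT EVENT_AGE
FG0001 1 54.2
FG0002 0 71.9
FG0003 0 0
FG0004 2 63.0
EOF

# Find EVENT and EVENT_AGE by header name, then list every row that breaks
# a rule: EVENT must be 0 or 1, and EVENT_AGE must be greater than 0.
bad_rows=$(awk '
  NR == 1 {
    for (i = 1; i <= NF; i++) {
      if ($i == "EVENT") e = i
      if ($i == "EVENT_AGE") a = i
    }
    next
  }
  ($e != 0 && $e != 1) || ($a + 0 <= 0) { print NR ": " $0 }
' "$tmpdir/pheno.txt")

# Each reported line is "<line number>: <offending row>".
echo "$bad_rows"
```

On this toy file the check reports the rows for `FG0003` (survival time of 0) and `FG0004` (event status of 2). Non-numeric values such as `NA` in either column would be reported as well.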

### Run a Cox model for ready time-to-event phenotypes

In the [endpoint file](/finngen-data-specifics/red-library-data-individual-level-data/what-phenotype-files-are-available-in-sandbox-1/endpoint-and-endpoint-longitudinal-data.md#endpoint-file), core endpoints are already in the format needed to run a Cox model **from birth to the first event of the endpoint**. In this example, we fit a Cox model for survival from birth until the endpoint `I9_CORATHER`.

Since the endpoint file is very large, you can first filter it down to the columns you need using awk. The following example keeps the columns `FINNGENID`, `I9_CORATHER`, `I9_CORATHER_AGE` and `SEX` (the last one for a gender-stratified analysis):

`zcat /finngen/library-red/finngen_R11/phenotype_1.0/data/finngen_R11_endpoint_1.0.txt.gz | awk -v col1=FINNGENID -v col2=I9_CORATHER -v col3=I9_CORATHER_AGE -v col4=SEX 'NR==1{for(i=1;i<=NF;i++){if($i==col1)c1=i;if ($i==col2)c2=i;if ($i==col3)c3=i;if ($i==col4)c4=i;}} NR>=1{print $c1 " " $c2 " " $c3 " " $c4}' >I9_CORATHER_ages.txt`
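Before moving to R, it can be worth confirming that the extracted file has the expected four columns and a sensible number of events. The sketch below runs on a toy stand-in for `I9_CORATHER_ages.txt`; all values (including the `SEX` coding) are made up for illustration, and on real data you would point the commands at the file written by the awk command above.

```shell
# Toy stand-in for I9_CORATHER_ages.txt (made-up values, same four columns).
workdir=$(mktemp -d)
cat > "$workdir/I9_CORATHER_ages.txt" <<'EOF'
FINNGENID I9_CORATHER I9_CORATHER_AGE SEX
FG0001 1 54.2 male
FG0002 0 71.9 female
FG0003 1 60.5 female
FG0004 0 48.0 male
EOF

# Show the header so the column names and their order can be checked by eye.
header=$(head -n 1 "$workdir/I9_CORATHER_ages.txt")
echo "$header"

# Count events (1) and non-events (0) in the second column, skipping the header.
counts=$(awk '
  NR > 1 { n[$2]++ }
  END { print "events=" (n[1] + 0), "non_events=" (n[0] + 0) }
' "$workdir/I9_CORATHER_ages.txt")
echo "$counts"
```

If the header does not list the four requested columns, one of the names passed to awk was probably not found in the endpoint file.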

Then, in RStudio, once you have read the file into a data frame (called `d` here), you can fit the Cox model with the following command (the R package [survival](https://cran.r-project.org/web/packages/survival/survival.pdf) is required: `library(survival)`):

`fit<-survfit(coxph(Surv(I9_CORATHER_AGE, I9_CORATHER)~1, data = d))`

You can draw the corresponding Kaplan-Meier plot with:

`plot(fit)`

A gender-stratified model can be fitted with:

`fit_gender_str<-survfit(coxph(Surv(I9_CORATHER_AGE, I9_CORATHER)~strata(SEX), data = d))`

And the corresponding Kaplan-Meier plot:

`plot(fit_gender_str)`


