# Upload cohorts to CO

#### Tutorial

A tutorial video about CO is available in the [FinnGen data users meeting 25th Jan 2022](https://www.finngen.fi/en/members/recordings/finngen-data-users-meeting-25th-jan-2022) recording at 16 min 35 s and in the [FinnGen data users' meeting 28th June 2022](https://www.finngen.fi/en/members/recordings/finngen-data-users-meeting-28th-june-2022) recording at 28 min 55 s.

**Open the Cohort Operations (CO) tool** from the FinnGen Sandbox dropdown menu.

![](/files/BI0XnlOO8m3pUTHdtPAJ)

The Status page of the Cohort Operations Shiny App tool opens showing version information and the connection status to the data.

A tick mark shows that the connection has been successfully formed.

<figure><img src="/files/qRLaEd7GYilzmBy0gIFq" alt=""><figcaption></figcaption></figure>

Select `Import Cohorts` from the left side menu. Guidance for each step is available from ![](/files/PDJK4KYsyJnnBycg1FUM).

### Upload cohorts built with Atlas

#### Step 1:

To upload cohorts built in [Atlas](/working-in-the-sandbox/which-tools-are-available/atlas.md), click on `from Atlas` (1). Select the FinnGen data freeze you would like to use (2). Use the search option to find your Atlas cohorts (3). Tap the check box to select the cohorts you'd like to import (4) and click `Import Selected` to finish moving them in (5).

![](/files/cCUE37UxQM47RT3TLWnj)

#### Step 2:

After uploading, the selected cohorts should appear in the Cohorts workbench view. The cohort workbench view gives the cohort source and name, the number of case entries `n_entries`, and the number of patients `n_patients`. The sex ratio is shown with percentages given for males (in blue), gender unknown (in grey), and females (in red), respectively.

The bar plot visualizes the years in which individuals entered and exited the cohort(s).

Here the excessive earwax cases enter the cohort later than the controls. This is because the entry date for cases is the date when excessive earwax was first diagnosed, while for the control group the entry date is the first time a person has any record in the health registry data (this same design is discussed [here](/working-in-the-sandbox/which-tools-are-available/atlas/detailed-guide/cohort-characterizations-in-atlas.md) and [here](/working-in-the-sandbox/which-tools-are-available/atlas/detailed-guide/cohort-characterizations-in-atlas/interpreting-the-results-of-feature-analysis-in-atlas.md)).

However, the year at which people exit the cohort(s) should be similar between cases and controls.

![](/files/OyDA0Al6OIjq2AHhe6Oz)

The control cohort can be modified to include the latest date, so that its start dates are more similar to those of the case cohort (see [Modifying Atlas cohort with CO](/working-in-the-sandbox/which-tools-are-available/cohort-operations-tool-co/modifying-atlas-cohort-with-co.md)).

### Upload cohorts from the Genotype Browser output file

The Cohort Operations tool recognizes [Genotype Browser](/working-in-the-sandbox/which-tools-are-available/genotype-browser.md) output files and reads them in without any further input from the user. [Outputting genotype information using Genotype Browser](/working-in-the-sandbox/working-with-genotype-data/genotype-browser.md) is straightforward: select your variant of interest in the browser, then click `Download data`.

**Example:** In the following example we use variant rs3091552, a C to G mutation on chromosome 20 (position 46811367), which was found to be the most significant variant detected for excessive earwax in the previous example. We viewed it with the [Cohort Characterizations tool](/working-in-the-sandbox/which-tools-are-available/atlas/detailed-guide/cohort-characterizations-in-atlas.md).

Now back in CO, from the `Import Cohorts` page select `from File` (1), click `Browse...` and search for your Genotype Browser output file (2). The Cohort Operations tool will read in the cohort(s). Select the cohorts you would like to import (3), then click `Import Selected` (4).

![](/files/CjU8kzWiMNatoUjnUjId)

Imported Genotype Browser files appear on the Cohorts workbench in addition to any Atlas cohorts. The cohort workbench view gives the cohort source and name, the number of case entries `n_entries`, the number of patients `n_patients`, and the cohort's sex ratio.

There are no bar plots because no cohort start or end dates are available for Genotype Browser output files.

![](/files/-MkvW8Cv-8GSIGEcVWeA)

### Upload cohorts from a text file

To upload a cohort to CO from a tab-separated text file, the columns of the text file should be formatted as follows:

```
COHORT_SOURCE = as.character(NA),
COHORT_NAME = as.character(NA),
FINNGENID = as.character(NA),
COHORT_START_DATE = lubridate::as_date(NA),
COHORT_END_DATE = lubridate::as_date(NA),
SEX = as.character(NA),
BIRTH_DATE = lubridate::as_date(NA),
DEATH_DATE = lubridate::as_date(NA)
```

The column headings should be labelled exactly as given. The first three columns, `COHORT_SOURCE`, `COHORT_NAME`, and `FINNGENID`, are mandatory. The first two fields will be shown in the Cohort Workbench view after the cohort(s) are uploaded to CO. In the `COHORT_SOURCE` column, users must define a source label, which is repeated on every row. The mandatory fields are:

```
COHORT_SOURCE = "text_file"
COHORT_NAME = c("my_cohort1", "my_cohort2", "my_cohort3")
FINNGENID = c("FG0000001", "FG0000002", "FG0000003")
```
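Because the column headings must match exactly, it can help to check the header line before uploading. The following is a minimal sketch, assuming a POSIX shell with `awk` is available (for example in the Terminal Emulator). The file name `my_cohort.tsv` and its dummy contents are hypothetical and are created here only so that the example runs as written.

```shell
# Hypothetical file to check; a small dummy file is created so the sketch runs as-is.
printf 'FINNGENID\tCOHORT_SOURCE\tCOHORT_NAME\nFG00000001\ttext_file\tmy_cohort1\n' > my_cohort.tsv

# Collect every mandatory column name that is absent from the tab-separated header line.
missing=$(head -n 1 my_cohort.tsv | awk -F '\t' '
  { for (i = 1; i <= NF; i++) seen[$i] = 1 }
  END {
    n = split("COHORT_SOURCE COHORT_NAME FINNGENID", required, " ")
    for (j = 1; j <= n; j++) if (!(required[j] in seen)) printf "%s ", required[j]
  }')

if [ -z "$missing" ]; then
  echo "All mandatory columns present"
else
  echo "Missing columns: $missing"
fi
```

The check only compares header names; it does not validate the IDs or any of the optional date columns.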

An example of the input table format with the mandatory fields (FINNGENID, COHORT\_SOURCE, and COHORT\_NAME):

| FINNGENID  | COHORT\_SOURCE | COHORT\_NAME |
| ---------- | -------------- | ------------ |
| FG00000001 | text\_file     | my\_cohort1  |
| FG00000002 | text\_file     | my\_cohort1  |
| FG00000003 | text\_file     | my\_cohort1  |
| FG00000004 | text\_file     | my\_cohort2  |
| FG00000005 | text\_file     | my\_cohort2  |
| FG00000006 | text\_file     | my\_cohort3  |
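A file in this format can also be generated from a plain list of IDs. The sketch below assumes a POSIX shell with `awk`; the file names `my_ids.txt` and `my_cohort.tsv`, the dummy IDs, and the labels `text_file` and `my_cohort1` are placeholders, not values required by CO.

```shell
# Hypothetical input: one FinnGen ID per line (dummy IDs for illustration only).
printf '%s\n' FG00000001 FG00000002 FG00000003 > my_ids.txt

# Write the header line, then one tab-separated row per ID.
# "text_file" and "my_cohort1" are placeholder values for COHORT_SOURCE and COHORT_NAME.
{
  printf 'FINNGENID\tCOHORT_SOURCE\tCOHORT_NAME\n'
  awk 'BEGIN{ OFS = "\t" } { print $1, "text_file", "my_cohort1" }' my_ids.txt
} > my_cohort.tsv

# Show the result for a visual check.
cat my_cohort.tsv
```

To place several cohorts in one file, the same pattern could be repeated with a different `COHORT_NAME` label per ID list and the rows appended under a single header.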

### Upload cohorts from TVT

A [tsv file exported from TVT](/working-in-the-sandbox/which-tools-are-available/trajectory-visualization-tool-tvt/exporting-cohorts-from-tvt.md) contains a single FINNGENID column listing FinnGen IDs. To read a TVT output file into the Cohort Operations tool, the two other mandatory columns, COHORT\_SOURCE and COHORT\_NAME, must be added, as described [above](#upload-cohorts-from-a-text-file). These columns can be added, for example, in the Terminal Emulator with the following two commands:

```
awk 'BEGIN{ FS = OFS = "\t" } { print $0, (NR==1? "COHORT_SOURCE" : "text_file") }' /path/to/cohort_from_TVT.tsv > tmp && mv tmp /path/to/cohort_from_TVT.tsv
awk 'BEGIN{ FS = OFS = "\t" } { print $0, (NR==1? "COHORT_NAME" : "my_TVT_cohort") }' /path/to/cohort_from_TVT.tsv > tmp && mv tmp /path/to/cohort_from_TVT.tsv
```

Here, `/path/to/cohort_from_TVT.tsv` should be replaced with the path to the file exported from TVT.
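As a variation, both columns can be appended in a single pass while writing to a new file, which leaves the TVT export untouched. This is a sketch assuming a POSIX shell with `awk`; the file names `cohort_from_TVT.tsv` and `cohort_for_CO.tsv` and the dummy IDs are hypothetical, and the stand-in input assumes the export starts with a `FINNGENID` header line, as the commands above do.

```shell
# Stand-in for a TVT export: a FINNGENID header followed by dummy IDs.
printf 'FINNGENID\nFG00000001\nFG00000002\n' > cohort_from_TVT.tsv

# Append both mandatory columns in one pass and write the result to a new file.
# "text_file" and "my_TVT_cohort" are placeholder labels.
awk 'BEGIN{ FS = OFS = "\t" }
     NR == 1 { print $0, "COHORT_SOURCE", "COHORT_NAME"; next }
     { print $0, "text_file", "my_TVT_cohort" }' cohort_from_TVT.tsv > cohort_for_CO.tsv

# Show the first lines for a visual check.
head -n 3 cohort_for_CO.tsv
```

Writing to a separate output file avoids the temporary `tmp` file and keeps the original export available if the labels need to be changed later.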

### Upload FinnGen Endpoint

To upload a FinnGen Endpoint cohort, select the `Import Cohorts` page in the left panel and `from Endpoint` on the right, then use the search option to find the endpoints of interest. Tick the check box for each endpoint you would like to import and click `Import Selected`.

![](/files/P7mluEMhH0ztKOBPq8Rn)

Case, control, and excluded cohorts of the selected FinnGen endpoints will load on the Cohorts workbench.

![](/files/vRwjPewuMenPqqAnLbkW)

