Upload cohorts to CO
A tutorial video about CO is available in the recording at 16 min 35 s and at 28 min 55 s.
Open the Cohort Operations (CO) tool from the FinnGen Sandbox dropdown menu.
The Status page of the Cohort Operations Shiny App tool opens showing version information and the connection status to the data.
A tick mark shows that the connection has been successfully formed.
After uploading, the selected cohorts should appear in the Cohorts workbench view. The cohort workbench view gives the cohort source and name, the number of case entries n_entries, and the number of patients n_patients. The sex ratio is shown with percentages given for males (in blue), gender unknown (in grey), and females (in red), respectively.
The bar plot visualizes the years persons started and ended their participation in the cohort(s).
However, the year at which people exit the cohort(s) should be similar between cases and controls.
Now back in CO, from the Import Cohorts page select from File (1), click Browse... and search for your Genotype Browser output file (2). The Cohort Operations tool will read in the cohort(s). Select the cohorts you would like to import (3), then click Import Selected (4).
Imported Genotype Browser files appear on the Cohorts workbench in addition to any Atlas cohorts. The cohort workbench view gives the cohort source and name, the number of case entries n_entries, the number of patients n_patients, and the cohort's sex ratio.
There are no bar plots, because no cohort start or end dates are available for Genotype Browser output files.
To upload a cohort in CO from a tab-separated text file, the columns of the text file should be formatted as follows:
The column headings should be labelled exactly as given. The first three columns, COHORT_SOURCE, COHORT_NAME, and FINNGENID, are mandatory. The first two fields will be shown in the Cohort Workbench view after the cohort(s) are uploaded to CO. In the COHORT_SOURCE column, users must define the source that will be repeated for each row in the column. The mandatory fields are:
An example of input table format with mandatory fields (FINNGENID, COHORT_SOURCE, and COHORT_NAME).

FINNGENID     COHORT_SOURCE   COHORT_NAME
FG00000001    text_file       my_cohort1
FG00000002    text_file       my_cohort1
FG00000003    text_file       my_cohort1
FG00000004    text_file       my_cohort2
FG00000005    text_file       my_cohort2
FG00000006    text_file       my_cohort3
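Before uploading, it can help to confirm that a file carries the three mandatory headings spelled exactly as required. The following is a minimal Python sketch of such a check, not part of the Cohort Operations tool itself. The function name `check_cohort_table` is hypothetical, and the sketch assumes that column order does not matter and that empty cells are not allowed, neither of which is stated in this guide.

```python
import csv
import io

# Mandatory column headings, as named in this guide.
MANDATORY_COLUMNS = ("COHORT_SOURCE", "COHORT_NAME", "FINNGENID")


def check_cohort_table(handle):
    """Return a list of problems found in a tab-separated cohort table.

    Hypothetical helper: checks only heading names and empty cells.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    problems = [
        f"missing column: {name}"
        for name in MANDATORY_COLUMNS
        if name not in header
    ]
    if problems:
        return problems
    # Data rows start on line 2, after the heading line.
    for line_number, row in enumerate(reader, start=2):
        for name in MANDATORY_COLUMNS:
            # Short rows yield None for missing cells (assumption: not allowed).
            if not (row[name] or "").strip():
                problems.append(f"line {line_number}: empty {name}")
    return problems


example = (
    "FINNGENID\tCOHORT_SOURCE\tCOHORT_NAME\n"
    "FG00000001\ttext_file\tmy_cohort1\n"
    "FG00000004\ttext_file\tmy_cohort2\n"
)
print(check_cohort_table(io.StringIO(example)))  # prints []
```

To check a real file, pass an open file handle instead of the in-memory example, for instance `check_cohort_table(open("my_cohorts.tsv", newline=""))`, where the file name is a placeholder.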
Where /path/to/cohort_from_TVT.tsv should be replaced with the file exported from TVT.
To upload a FinnGen Endpoint cohort, select the Import Cohorts page in the left panel, choose from Endpoint on the right, and use the search option to find the endpoints of interest. Select the endpoints to import by ticking the type box for each endpoint you want, then click Import Selected to import.
Case, control, and excluded cohorts of the selected FinnGen endpoints will load on the Cohorts workbench.
Select Import Cohorts from the left side menu. Guidance for each step is available from .
To upload cohorts built in Atlas, click on from Atlas (1). Select the FinnGen data freeze you would like to use (2). Use the search option to find your Atlas cohorts (3). Tick the check box to select the cohorts you'd like to import (4) and click Import Selected to complete the import (5).
Here we can see that the excessive earwax cases are included in the cohort later than the controls because the entry date for cases is the date when excessive earwax was first diagnosed, while for the control group the entry date is the first time a person has any record in the health registry data (this same design is discussed and ).
The control cohort can be modified to include the latest date in order to make the control cohort's start date more similar to that of the case cohort (see ).
The Cohort Operations tool recognizes Genotype Browser output files and reads them in without the need for any further input from the user. Producing such a file is straightforward: select your variant of interest in the browser, then click Download data.
Example: In the following example we use variant rs3091552, a C to G mutation in chromosome 20 (position 46811367), which was found to be the most significant variant detected for excessive earwax in the previous example. We ran the GWAS using the , and viewed it with the.
A TVT output file contains one FINNGENID column with a list of FinnGen IDs. To read a TVT output file into the Cohort Operations tool, two other mandatory columns, COHORT_SOURCE and COHORT_NAME, are needed, as described for tab-separated text file uploads. These columns can be added e.g. with Terminal Emulator using the two following commands:
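As an alternative to terminal commands, the same transformation can be sketched in Python. This is a minimal sketch under stated assumptions, not the method used by the tool or by this guide: the function name `add_cohort_columns` is hypothetical, the input is assumed to be a tab-separated file whose first line is the heading FINNGENID, and the output column order (COHORT_SOURCE, COHORT_NAME, FINNGENID) is an assumption.

```python
import csv


def add_cohort_columns(in_path, out_path, cohort_source, cohort_name):
    """Copy a FINNGENID file, adding COHORT_SOURCE and COHORT_NAME to each row.

    Hypothetical helper. Assumes in_path is tab-separated with a
    FINNGENID heading line. Returns the number of IDs written.
    """
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        # Headings must be spelled exactly as Cohort Operations expects.
        writer.writerow(["COHORT_SOURCE", "COHORT_NAME", "FINNGENID"])
        count = 0
        for row in reader:
            # The same source and name are repeated on every row.
            writer.writerow([cohort_source, cohort_name, row["FINNGENID"]])
            count += 1
    return count


# Example call (paths and labels are placeholders):
# add_cohort_columns("/path/to/cohort_from_TVT.tsv", "cohort_for_CO.tsv",
#                    "TVT", "my_cohort")
```

The resulting file can then be imported from the Import Cohorts page in the same way as any other tab-separated text file.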