# Using Atlas in Sandbox

The figure below shows **the full workflow of cohort building and analyses within Sandbox**. In this tutorial we will focus on the part containing Atlas.

<figure><img src="/files/cEgNNw4tSbxmYfMMpb39" alt=""><figcaption></figcaption></figure>

Atlas offers many functions, from cohort definitions to incidence-rate calculations. For cohort building, however, **‘Concept Sets’** and **‘Cohort Definitions’** are usually sufficient, with **‘Characterizations’** added to check that the built cohort is as intended.

If you use a ready-made phenotype definition from the [OHDSI PhenotypeLibrary](https://data.ohdsi.org/PhenotypeLibrary/), you can proceed directly to **‘Cohort Definitions’**.

<figure><img src="/files/2Sab27qaZqvgMKA03JNk" alt="" width="563"><figcaption></figcaption></figure>

{% hint style="info" %}
To use Atlas in the Sandbox, start by opening your IVM, which may be of any size, including the smallest one, because Atlas fetches all data through BigQuery.

In Sandbox, select **Applications > Sandbox > Atlas**.
{% endhint %}

## Steps for building a cohort from FinnGen data in Atlas

### **1. Create ‘Concept Sets’**

These can be, for instance, medical codes (ICD, SNOMED) or drug purchases (ATC, VNRfi, RxNorm).

* Search for the concept (medical code, drug purchase etc.) using the *‘Search’* function: either as strings, as ICD codes, as SNOMED codes, etc.
* Use the *‘Descendants’* tick box to include sub codes of a diagnosis/medication main code.
* Use the *‘Exclude’* tick box to exclude specific sub codes from the concept set you are creating.
* International <mark style="color:blue;">standard codes</mark>, such as SNOMED codes, are displayed in blue, and local <mark style="color:red;">non-standard codes</mark>, such as ICD codes, are displayed in red.

{% hint style="warning" %}
**You should not mix&#x20;**<mark style="color:blue;">**standard**</mark>**&#x20;and&#x20;**<mark style="color:red;">**non-standard**</mark>**&#x20;codes in a single concept set.** Concept sets with standard codes can be added to **‘Cohort Definitions’** directly, whereas concept sets with non-standard codes must be added through an attribute as *‘source’* concepts. A mixed concept set forces you to choose one of these two routes, and Atlas will then search for individuals using only one type of code, standard or non-standard, depending on the route you chose. Therefore, if you need both standard and non-standard codes, put them into separate concept sets and add each one separately in **‘Cohort Definitions’**.
{% endhint %}

{% hint style="warning" %}
Note that in the *‘Standard concept’* classification, ATC codes are neither standard nor non-standard: they belong to their own category, <mark style="color:purple;">classification</mark>, and are shown in purple. When used in **‘Cohort Definitions’**, they can be treated like standard codes.
{% endhint %}
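
The separation rule in the hints above can be sketched in Python. This is only an illustration of the idea, not an Atlas export format: the `Concept` class, the flag values `"S"`, `"N"` and `"C"`, and the example rows are assumptions made for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    code: str
    vocabulary: str
    # Assumed flag values: "S" standard, "N" non-standard, "C" classification.
    standard_flag: str


def split_by_standard_flag(concepts):
    """Group concepts so that each resulting concept set holds one flag only."""
    groups = defaultdict(list)
    for concept in concepts:
        groups[concept.standard_flag].append(concept)
    return dict(groups)


# A mixed list that should NOT go into a single concept set.
mixed = [
    Concept("44054006", "SNOMED", "S"),
    Concept("E11", "ICD10", "N"),
    Concept("A10", "ATC", "C"),
]

concept_sets = split_by_standard_flag(mixed)
for flag, members in concept_sets.items():
    print(flag, [c.code for c in members])
```

Each group would then become its own concept set: the standard and classification groups can be used directly, while the non-standard group would be attached as a source concept.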

* For standard codes, explore the *‘Hierarchy’* to see the *‘Parents’* and *‘Children’* (i.e. *‘Descendants’*) of the code and to help you decide which code to select. For non-standard codes this is not available.
* Columns RC, DRC, PC, DPC refer to record count, descendant record count, person count, and descendant person count, respectively. A single concept with a hierarchy, e.g. the ICD-10 code A10, will have descendant records and persons for its sub codes, for instance A10.0, A10.1 and A10.5. Record and person counts of all these sub codes are included in DRC and DPC. It is possible to have 0 records/persons for the main code, but descendant records/persons for the sub codes. It is good practice to sort by RC to see which codes have records in FinnGen data.
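
As a rough illustration of how the four count columns relate, the sketch below computes them on made-up data. The records, the hierarchy dictionary, and the choice to count the main code itself together with its sub codes in the descendant columns are all assumptions made for this example.

```python
# Toy records as (person_id, code) pairs; all values are made up.
records = [
    ("p1", "A10.0"),
    ("p1", "A10.1"),
    ("p2", "A10.0"),
    ("p3", "A10.5"),
]

# Hypothetical hierarchy: main code -> its sub codes.
descendants = {"A10": ["A10.0", "A10.1", "A10.5"]}


def counts(code):
    """Return (RC, DRC, PC, DPC) for one code in the toy data."""
    family = {code, *descendants.get(code, [])}
    own = [(p, c) for p, c in records if c == code]
    with_desc = [(p, c) for p, c in records if c in family]
    rc = len(own)
    drc = len(with_desc)
    pc = len({p for p, _ in own})         # distinct persons on the code itself
    dpc = len({p for p, _ in with_desc})  # distinct persons across the family
    return rc, drc, pc, dpc


print(counts("A10"))    # (0, 4, 0, 3)
print(counts("A10.0"))  # (2, 2, 2, 2)
```

In this toy data the main code has no records of its own but still has descendant records, and a person with two different sub codes is counted once in DPC.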

### **2. Create 'Cohort Definitions'**

Use the **‘Concept Sets’** in **‘Cohort Definitions’** to define the *‘Cohort Entry Events’,* *‘Inclusion Criteria’* and *‘Cohort Exit’*. You will need to build separate cohorts for cases and controls.

* In the *‘Define’* tab give a name to your cohort and add definitions for the *‘Cohort Entry Events’*, *‘Inclusion Criteria’* and *‘Cohort Exit’.*
* *‘Cohort Entry Events’* defines **the starting point for the cohort**
  * **For case cohort**: can be anything from the Atlas dropdown menu *‘Add initial event’*, e.g. first diagnosis (*‘Add Condition Occurrence’*), drug purchase (*‘Add Drug Exposure’*), etc. The entry to the cohort **should be clearly defined**, avoiding entries such as *‘Any Visit Occurrence’* without a specification.
  * **For control cohort:** usually *‘Any Visit Occurrence’*, meaning entry to any of the registers, because controls are not defined by a specific condition.

{% hint style="warning" %}
Concept sets based on <mark style="color:red;">**non-standard**</mark>**&#x20;codes need to be imported as source concepts:** click the *‘Add attribute’* and use the relevant *‘Source Concept Criteria’*.
{% endhint %}

{% hint style="warning" %}
By default, **the codes are searched from all the available FinnGen registers in Atlas**. If you want to filter by a specific register, you can use the readily made concept sets for different registers (search for ‘FinnGen support concept set’) and filter for them in the *‘Cohort Entry Events’* (see [Examples](/working-in-the-sandbox/which-tools-are-available/atlas/quick-guide/examples-on-cohort-building-with-atlas.md)).
{% endhint %}
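
The idea of a cohort entry event, with and without a register filter, can be sketched on toy data as follows. This is a simplified model rather than what Atlas executes: the event rows, the register names `REG_A` and `REG_B`, and the choice of the earliest matching event as the entry are assumptions made for illustration.

```python
from datetime import date

# Toy event rows: (person_id, code, register, event_date). All values are made up.
events = [
    ("p1", "X1", "REG_A", date(2010, 5, 1)),
    ("p1", "X1", "REG_B", date(2008, 3, 9)),
    ("p2", "X1", "REG_A", date(2015, 1, 20)),
    ("p3", "Y9", "REG_A", date(2012, 7, 4)),
]


def cohort_entry(events, codes, registers=None):
    """Return {person_id: earliest matching event date}.

    registers=None mimics the default of searching every register.
    """
    entry = {}
    for person, code, register, when in events:
        if code not in codes:
            continue
        if registers is not None and register not in registers:
            continue
        if person not in entry or when < entry[person]:
            entry[person] = when
    return entry


all_registers = cohort_entry(events, {"X1"})
only_reg_a = cohort_entry(events, {"X1"}, registers={"REG_A"})
print(all_registers)
print(only_reg_a)
```

Restricting the search to one register can move a person's entry date later, or drop the person entirely, so it is worth comparing cohort sizes with and without the filter.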

* *‘Inclusion Criteria’* defines **the inclusion to the cohort more specifically**, e.g. by number of drug purchases, etc.
  * To create a cohort based on **multiple concepts**, e.g. conditions and drugs, use the dropdown menu in the *‘Inclusion Criteria’* box, above all the criteria you have added, to choose whether inclusion requires *all*, *any*, *at least* or *at most* of the criteria.

{% hint style="warning" %}
Concept sets based on <mark style="color:red;">**non-standard**</mark>**&#x20;codes need to be imported as source concepts:** click the *‘Add attribute’* and use the relevant *‘Source Concept Criteria’*.
{% endhint %}
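
The four ways of combining inclusion criteria can be read as simple counting rules. The sketch below is a minimal illustration of that logic; the function name and the mode strings are hypothetical and are not Atlas identifiers.

```python
def passes_inclusion(results, mode, n=None):
    """Combine per-criterion True/False results for one person.

    mode is one of "all", "any", "at_least", "at_most"; n is required
    for the last two. These names are illustrative only.
    """
    met = sum(bool(r) for r in results)
    if mode == "all":
        return met == len(results)
    if mode == "any":
        return met >= 1
    if mode in ("at_least", "at_most"):
        if n is None:
            raise ValueError("n is required for at_least / at_most")
        return met >= n if mode == "at_least" else met <= n
    raise ValueError(f"unknown mode: {mode}")


# A person who meets the condition criterion but not the drug criterion.
person_results = [True, False]
print(passes_inclusion(person_results, "all"))            # False
print(passes_inclusion(person_results, "any"))            # True
print(passes_inclusion(person_results, "at_least", n=2))  # False
print(passes_inclusion(person_results, "at_most", n=1))   # True
```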

* *‘Cohort Exit’* defines **when a person exits a cohort**
  * Usually the default given by Atlas is sufficient
  * Modify this if you want to create a cohort where a person can enter more than once, e.g. with multiple fractures.
* **Creating a control cohort**:
  * Copy the case cohort
  * Adjust the *‘Cohort Entry Events’* to *‘Any Visit Occurrence’* if appropriate
  * In the *‘Inclusion Criteria’*, adjust any condition/drug purchase to *exactly 0* occurrences or delete it completely
  * Add any new inclusion criteria, e.g. the controls may need to be free of some other conditions.
* **Creating a cohort from JSON code exported e.g. from** [**OHDSI PhenotypeLibrary**](https://data.ohdsi.org/PhenotypeLibrary/)**:**
  * Use the *‘Export’* tab and select the *‘JSON’* button
  * Paste the JSON code from Sandbox Clipboard, in small chunks if needed
  * Click the *‘Reload’* button at the bottom of the screen. The cohort definition should then appear in the *‘Define’* tab
* **Final step:** go to the *‘Generate’* tab and choose the FinnGen data release in which you would like to generate the cohort.

{% hint style="info" %}
View the report for the number of individuals included in the cohort. This is **an essential step** because without successful cohort generation the cohort cannot be found and applied to further analyses.
{% endhint %}
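
For the JSON route described above, it can help to validate the JSON and split it into pieces before pasting it through the Sandbox Clipboard. The sketch below shows one way to do that outside the Sandbox. The chunk size of 40 characters is arbitrary (this page does not state a clipboard limit), and the example content is a placeholder, not a real Atlas cohort definition.

```python
import json


def chunk_json(text, size):
    """Validate JSON text and split it into pieces of at most `size` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    json.loads(text)  # raises json.JSONDecodeError if the text is not valid JSON
    return [text[i:i + size] for i in range(0, len(text), size)]


# Placeholder content; a real cohort definition is much larger.
definition = json.dumps({"name": "example cohort", "items": list(range(20))})

chunks = chunk_json(definition, size=40)
assert "".join(chunks) == definition  # pasting the chunks in order restores the text
print(len(chunks), "chunks")
```

Pasting the chunks in their original order, without adding or dropping characters between them, is what keeps the JSON valid when it is reloaded.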

* **Using existing FinnGen endpoints**: use the **Cohort Operations tool** in Sandbox, where endpoint cohorts can be imported directly from the *‘Endpoint’* tab
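
The control-cohort recipe above (entry on any visit, *exactly 0* occurrences of the case-defining events, plus any extra exclusions) amounts to simple set logic. The sketch below illustrates it on made-up identifiers; the variable names and data are assumptions for this example only.

```python
# Toy inputs; all identifiers are made up.
persons_with_any_visit = {"p1", "p2", "p3", "p4"}
condition_counts = {"p1": 3, "p2": 1}  # occurrences of the case-defining codes
other_exclusions = {"p4"}              # e.g. a related condition controls must be free of

# Cases have at least one occurrence of the defining codes.
cases = {p for p, n in condition_counts.items() if n >= 1}

# Controls have exactly 0 occurrences and none of the extra exclusions.
controls = {
    p
    for p in persons_with_any_visit
    if condition_counts.get(p, 0) == 0 and p not in other_exclusions
}

print(sorted(cases))     # ['p1', 'p2']
print(sorted(controls))  # ['p3']
assert cases.isdisjoint(controls)
```

A quick overlap check like the final assertion is a useful sanity test on the generated cohorts as well: no person should appear in both.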

### **3. Use 'Characterizations'**

Inspect the cohorts by using the Atlas function **‘Characterizations’** and/or the separate **Cohort Operations tool** and/or other tools in Sandbox.

* In the Atlas **‘Characterizations’**, import the case and control cohorts in the *‘Design’* tab, then choose the features that you want to characterize in each cohort, for example age and gender
* In the *‘Executions’* tab, generate the report in your preferred FinnGen data release and view the report directly there
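
To make concrete what a characterization by age and gender summarizes, here is a small sketch on made-up cohort rows. It is not how Atlas computes its report: the row layout and the `characterize` helper are hypothetical.

```python
from collections import Counter
from statistics import mean

# Toy cohort rows: (person_id, age_at_entry, gender). All values are made up.
cases = [("p1", 54, "F"), ("p2", 61, "M"), ("p3", 47, "F")]
controls = [("p4", 58, "M"), ("p5", 39, "F")]


def characterize(cohort):
    """Return cohort size, mean age and gender counts for one cohort."""
    return {
        "n": len(cohort),
        "mean_age": round(mean(age for _, age, _ in cohort), 1),
        "gender": dict(Counter(g for _, _, g in cohort)),
    }


for name, cohort in (("cases", cases), ("controls", controls)):
    print(name, characterize(cohort))
```

Comparing such summaries between cases and controls is a quick way to spot a cohort that was not built as intended, for example an unexpected age distribution.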

### Summary of the Atlas terminology in terms of FinnGen data

| Atlas terminology                                                           | Used terms in FinnGen data                                                                                                                                                                                                                                                                                                                                                                                        |
| --------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Standard concept: <mark style="color:blue;">standard</mark> (international) | SNOMED, LOINC, RxNorm                                                                                                                                                                                                                                                                                                                                                                                             |
| Standard concept: <mark style="color:red;">non-standard</mark> (local)      | ICD8fi, ICD9fi, ICD10fi, ICD10, ICPC, NCSPfi, VNRfi                                                                                                                                                                                                                                                                                                                                                               |
| Standard concept: <mark style="color:purple;">classification</mark>         | ATC                                                                                                                                                                                                                                                                                                                                                                                                               |
| Concept Set                                                                 | A set of codes based on a diagnosis, drug purchase, drug reimbursement, etc. Each set is based either on standard or non-standard codes but not on both. E.g. a concept set 1 on disease X based on ICD codes or a concept set 2 on disease X based on SNOMED codes. The concept sets will be used in the **‘Cohort Definitions’** to define *‘Cohort Entry Events’*, *‘Inclusion Criteria’* and *‘Cohort Exit’*. |
| Concept Set: Descendants                                                    | Descendants are the sub codes of ICD or ATC codes, e.g. A10.1.                                                                                                                                                                                                                                                                                                                                                    |
| Concept Set: RC, DRC, PC, DPC                                               | Record count (RC) and person count (PC) refer to the counts for main codes, e.g. for ICD-10 code A10, whereas descendant record count (DRC) and descendant person count (DPC) refer to the counts for the sub codes, e.g. ICD-10 code A10.1.                                                                                                                                                                      |



