# Clustering timelines with TVT

The aim of clustering is to see whether the data contain clear **clusters of patients with similar medical histories**.

#### TVT will cluster patients according to:

* event counts of user-defined glyph segments
* age
* gender

To define the timeline for event count calculations, the data for all patients starts at the time when they were first included in the cohort (indicated for each person by a solid line in the [Timeline Viewer](/working-in-the-sandbox/which-tools-are-available/trajectory-visualization-tool-tvt/trajectory-visualization-tool-tvt/viewing-tvt-results.md)).

The *length* of the data is defined by the person with the longest follow-up time. For patients with shorter follow-up times, zeros are appended to the end of their data.
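
The zero-padding described above can be illustrated with a minimal Python sketch. The function name `pad_timelines`, the patient labels, and the per-period counts are hypothetical and are not part of TVT; the sketch only assumes that each patient's event counts form a sequence starting at cohort entry.

```python
from typing import Dict, List


def pad_timelines(counts_by_patient: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Right-pad every event-count series with zeros to the longest length."""
    longest = max((len(series) for series in counts_by_patient.values()), default=0)
    return {
        patient: series + [0] * (longest - len(series))
        for patient, series in counts_by_patient.items()
    }


# Hypothetical event counts per time period, starting at cohort entry.
example = {
    "patient_a": [2, 0, 1, 3],  # longest follow-up, defines the data length
    "patient_b": [1, 1],        # shorter follow-up, gets two trailing zeros
}
padded = pad_timelines(example)
print(padded["patient_b"])  # [1, 1, 0, 0]
```

After padding, every patient is represented by a series of the same length, which is what allows the series to be compared position by position.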

#### Clustering

Clustering of the timelines is available from the `Clustering` menu on the left in the Trajectory Visualization Tool.

![](/files/MTvnSrAVjkOlwSIAHq6L)

#### Distance metric

Select a distance metric from the drop-down menu. Seven metrics are available: *Correlation, Manhattan, Euclidean, Maximum, Canberra, Binary,* and *Minkowski.*

![](/files/veJOYUaUgADUg5y9Dfmv)
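
To give a feel for how the metrics differ, the sketch below implements common textbook definitions of the seven names in plain Python. This page does not state the exact formulas TVT uses, so every definition here is an assumption, in particular the treatment of *Binary* (share of mismatched non-zero positions), the default Minkowski exponent `p`, and *Correlation* as one minus the Pearson coefficient.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def manhattan(a: Vector, b: Vector) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def maximum(a: Vector, b: Vector) -> float:
    # Largest coordinate-wise difference (also called Chebyshev distance).
    return max(abs(x - y) for x, y in zip(a, b))


def minkowski(a: Vector, b: Vector, p: float = 3.0) -> float:
    # p = 1 gives Manhattan, p = 2 gives Euclidean; the default is an assumption.
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def canberra(a: Vector, b: Vector) -> float:
    # Positions where both values are zero are skipped to avoid 0/0.
    return sum(abs(x - y) / (abs(x) + abs(y)) for x, y in zip(a, b) if x or y)


def binary(a: Vector, b: Vector) -> float:
    # Among positions where at least one value is non-zero,
    # the share where exactly one of the two is non-zero.
    either = sum(1 for x, y in zip(a, b) if x or y)
    differ = sum(1 for x, y in zip(a, b) if bool(x) != bool(y))
    return differ / either if either else 0.0


def correlation(a: Vector, b: Vector) -> float:
    # 1 - Pearson r; undefined when either vector is constant.
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation distance is undefined for a constant vector")
    return 1.0 - cov / math.sqrt(var_a * var_b)


# Two hypothetical, zero-padded event-count series.
a = [2, 0, 1, 3]
b = [1, 1, 0, 0]
for metric in (correlation, manhattan, euclidean, maximum, canberra, binary, minkowski):
    print(f"{metric.__name__}: {metric(a, b):.3f}")
```

Count-based metrics such as Manhattan and Euclidean react to how many events differ, whereas Binary only looks at whether any events occurred, so the same two patients can look close under one metric and far apart under another.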

#### Clustering method

The clustering method is the algorithm used to perform the clustering.

![](/files/GXyJCvLDi4D2ugWwcevj)

#### Number of clusters

Next, select the *number of clusters*. The aim is to find a number that provides the clearest clusters of patients with similar event counts of glyph segments, age, and gender.

The optimal number of clusters depends on your data and variables. You may try numbers from 0 to 10 to find the value that fits your data best. This will define the number of clusters in the HeatMap presentation (see below).
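
The effect of the chosen number can be illustrated with a small Python sketch that groups a handful of rows into `k` clusters. It uses naive agglomerative clustering with average linkage and Manhattan distance purely as a stand-in: this page does not say which algorithms TVT offers, and the function `agglomerate` and the example rows are hypothetical. The sketch only accepts values from 1 up to the number of rows.

```python
from itertools import combinations
from typing import List, Sequence


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def agglomerate(rows: Sequence[Sequence[float]], k: int) -> List[List[int]]:
    """Merge the two closest clusters (average linkage) until k remain.

    Returns clusters as lists of row indices.
    """
    if not 1 <= k <= len(rows):
        raise ValueError("k must be between 1 and the number of rows")
    clusters = [[i] for i in range(len(rows))]

    def linkage(c1: List[int], c2: List[int]) -> float:
        # Average pairwise distance between members of the two clusters.
        total = sum(manhattan(rows[i], rows[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters


# Hypothetical event counts for two glyph segments, one row per patient.
rows = [[5, 0], [4, 1], [0, 6], [1, 5], [9, 9]]
for k in (1, 2, 3):
    print(k, agglomerate(rows, k))
```

With these rows, three clusters separate the two similar pairs from the outlying fifth patient, while two clusters force the pairs together. Comparing the groupings for several values in this way mirrors the trial-and-error search for the number that gives the clearest clusters.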

After the clustering settings are done, click `Clusters` to start clustering with the selected distance metric.

![](/files/Y2THxd2IbS3VGrXeyitI)

#### Clusters

The generated histograms show the distribution of cases in clusters for each glyph segment. The number of clusters in the Clusters presentation is defined by the *Number of clusters*.

![](/files/7Xd3zvqcHSDUxK92Y7NC)

Under Clusters are the Age and Details sections. **Age** highlights the distribution of entry ages, exit ages, and duration in the data. **Details** gives more in-depth information about the data.

