# Finemapping results format

The finemapping results come from two different finemapping methods: [FINEMAP](http://www.christianbenner.com/) and [SuSiE](https://stephenslab.github.io/susieR/).

The purpose of finemapping is to find the set of 1 or more variants most likely to be responsible for the association at that locus. This set of likely variants is referred to as a "credible set". You can read more about the motivations for finemapping in the main concepts: [Finemapping](https://docs.finngen.fi/background-reading/finemapping).

The most severe transcript is chosen by first taking the most severe consequence among canonical protein-coding transcripts. If no canonical transcript exists, the first (effectively random) other protein-coding transcript corresponding to the most severe annotation is chosen. Severity precedence follows the [Ensembl Variant Effect Predictor (VEP)](http://www.ensembl.org/info/genome/variation/prediction/predicted_data.html) default.

### Quick links to relevant formats

For easier navigation of this page, here are some quick links to the different file formats:

[**Pipeline meta-data outputs**](https://docs.finngen.fi/finngen-data-specifics/green-library-data-aggregate-data/core-analysis-results-files/finemapping-results-format#pipeline-meta-data-outputs)

* [Region status file](https://docs.finngen.fi/finngen-data-specifics/green-library-data-aggregate-data/core-analysis-results-files/finemapping-results-format#region-status-file)

[**SuSiE outputs**](https://docs.finngen.fi/finngen-data-specifics/green-library-data-aggregate-data/core-analysis-results-files/finemapping-results-format#susie-outputs)

* [PHENONAME.SUSIE.cred.bgz and PHENONAME.SUSIE\_99.cred.bgz](https://docs.finngen.fi/finngen-data-specifics/green-library-data-aggregate-data/core-analysis-results-files/finemapping-results-format#phenoname.susie.cred.bgz-and-phenoname.susie_99.cred.bgz)
* [PHENONAME.SUSIE.cred.summary.tsv, PHENONAME.SUSIE\_99.cred.summary.tsv and PHENONAME.SUSIE\_EXTEND.cred.summary.tsv](https://docs.finngen.fi/finngen-data-specifics/green-library-data-aggregate-data/core-analysis-results-files/finemapping-results-format#phenoname.susie.cred.summary.tsv-phenoname.susie_99.cred.summary.tsv-and-phenoname.susie_extend.cred)
* [PHENONAME.SUSIE.snp.bgz and PHENONAME.SUSIE\_99.snp.bgz](https://docs.finngen.fi/finngen-data-specifics/green-library-data-aggregate-data/core-analysis-results-files/finemapping-results-format#phenoname.susie.snp.bgz-and-phenoname.susie_99.snp.bgz)
* [PHENONAME.SUSIE.snp.filter.tsv, PHENONAME.SUSIE\_99.snp.filter.tsv and PHENONAME.SUSIE\_extend.snp.filter.tsv](#phenoname.susie.snp.filter.tsv-phenoname.susie_99.snp.filter.tsv-and-phenoname.susie_extend.snp.filt)

[**Finemap outputs**](https://docs.finngen.fi/finngen-data-specifics/green-library-data-aggregate-data/core-analysis-results-files/finemapping-results-format#finemap-outputs)

* [PHENONAME.FINEMAP.config.bgz](https://docs.finngen.fi/finngen-data-specifics/green-library-data-aggregate-data/core-analysis-results-files/finemapping-results-format#phenoname.finemap.config.bgz)
* [PHENONAME.FINEMAP.region.bgz](https://docs.finngen.fi/finngen-data-specifics/green-library-data-aggregate-data/core-analysis-results-files/finemapping-results-format#phenoname.finemap.region.bgz)
* [PHENOTYPE.FINEMAP.snp.bgz](https://docs.finngen.fi/finngen-data-specifics/green-library-data-aggregate-data/core-analysis-results-files/finemapping-results-format#phenotype.finemap.snp.bgz)

### Pipeline meta-data outputs

#### Region status file

The region status file was a tab-separated file that reported which regions were sent to finemapping and if there were any problems that prevented finemapping. This file is no longer output by the currently supported finemapping workflows, but the description has been retained for legacy results. The file had the following columns:

| Column name | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| ----------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| region      | The span of the region, specified in chromosomal coordinates `chromosome.start-end`                                                                                                                                                                                                                                                                                                                                                                                                           |
| status      | Status of the region, either "OK" if the region was passed on to finemapping, or "Failure" if the region was not successfully formed.                                                                                                                                                                                                                                                                                                                                                         |
| windowsize  | The window size used when determining a region. Region selection works by extending a window (in base pairs) around each genome-wide significant variant. If windows overlap each other, they are merged. These possibly merged windows are the regions that are finemapped. If a region is larger than the maximum allowed region size (currently 6 megabases), that region is retried with a smaller window. The final window size that was tried is the one shown here. |
| failure     | Empty if the region was successful. If the region was not successful, the reason is given here. Most likely the region was too long and could not be formed even when the window size was lowered to its minimum value. |

Regions were typically skipped if their merged size (after combining with proximal regions) was greater than the user-specified maximum allowed size (default 6Mb) and could not be successfully shrunk to individual regions >1Mb in size using [this algorithm](https://docs.finngen.fi/working-in-the-sandbox/which-tools-are-available/untitled/finemapping-of-custom-gwas-analyses).
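
The window-merging and retry logic described for the `windowsize` column can be sketched roughly as follows. This is an illustration only, not the pipeline's code: the function names, the halving step, and the `min_window` floor are assumptions made for the example.

```python
from typing import List, Tuple

Interval = Tuple[int, int]


def merge_windows(positions: List[int], window: int) -> List[Interval]:
    """Extend +/- `window` bp around each position and merge overlapping windows."""
    intervals = sorted((max(0, p - window), p + window) for p in positions)
    merged: List[Interval] = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous window: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def form_regions(positions, window, max_size=6_000_000, min_window=250_000):
    """Retry with a smaller window until every region fits within `max_size`.

    Halving the window and the `min_window` floor are hypothetical choices.
    Returns (regions, final window size tried, status).
    """
    while True:
        regions = merge_windows(positions, window)
        if all(end - start <= max_size for start, end in regions):
            return regions, window, "OK"
        if window // 2 < min_window:
            return regions, window, "Failure"
        window //= 2
```

The returned window and status correspond loosely to the `windowsize` and `status` columns above.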

### [SuSiE](https://stephenslab.github.io/susieR/) outputs

Both 95% and 99% credible sets are provided. Files with \_99 in the name contain the 99% credible sets, as described below. The SuSiE outputs have been annotated using the variant annotation file.

#### PHENONAME.SUSIE.cred.bgz and PHENONAME.SUSIE\_99.cred.bgz

These files contain all of the credible sets for this phenotype, summarizing the SuSiE fine-mapping for all genome-wide significant regions. The credible sets are the 95% (PHENONAME.SUSIE.cred.bgz) and 99% (PHENONAME.SUSIE\_99.cred.bgz) credible sets, i.e. under the model they have a 95% or 99% probability of containing the causal variant. The files are bgzipped tab-separated values files, with one credible set per line.

| **Column**  | **Description**                                                                                                |
| ----------- | -------------------------------------------------------------------------------------------------------------- |
| region      | Region for which the fine-mapping was run                                                                      |
| cs          | Running number for independent credible sets in a region                                                       |
| cs\_log10bf | Log10 Bayes factor comparing the solution of this model (cs independent credible sets) to cs -1 credible sets. |
| cs\_avg\_r2 | Average correlation R2 between variants in the credible set                                                    |
| cs\_min\_r2 | Minimum R2 between variants in the credible set                                                                |
| cs\_size    | How many SNPs the credible set contains                                                                        |
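
Because bgzip output is gzip-compatible, these files can usually be read with standard gzip tooling. The sketch below uses only the Python standard library; the function names are illustrative, and it assumes the file starts with a header row naming the columns listed above.

```python
import csv
import gzip
from typing import Dict, List


def read_bgz_tsv(source) -> List[Dict[str, str]]:
    """Read a bgzipped (or plain gzipped) TSV with a header row into a list of dicts.

    `source` may be a file path or an open binary file object.
    """
    with gzip.open(source, mode="rt", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def credible_sets_by_region(rows: List[Dict[str, str]]) -> Dict[str, List[Dict[str, str]]]:
    """Group credible-set rows by their `region` value."""
    grouped: Dict[str, List[Dict[str, str]]] = {}
    for row in rows:
        grouped.setdefault(row["region"], []).append(row)
    return grouped
```

For example, `read_bgz_tsv("PHENONAME.SUSIE.cred.bgz")` would return one dictionary per credible set, with all values as strings.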

#### PHENONAME.SUSIE.cred.summary.tsv, PHENONAME.SUSIE\_99.cred.summary.tsv and PHENONAME.SUSIE\_EXTEND.cred.summary.tsv

These files contain a summary of the [credible sets](https://docs.finngen.fi/background-reading/colocalization) for this phenotype. The credible sets are the 95% (PHENONAME.SUSIE.cred.summary.tsv) or 99% (PHENONAME.SUSIE\_99.cred.summary.tsv) credible sets, i.e. under the model they have a 95% or 99% probability of containing the causal variant. The file PHENONAME.SUSIE\_EXTEND.cred.summary.tsv contains the 95% credible sets, extended with the 99% credible set variants where possible. The files are tab-delimited with one credible set per line. The columns are described in the following table:

| **Column**         | **Description**                                                                                                                                                                                                                          |
| ------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| trait              | Phenotype                                                                                                                                                                                                                                |
| region             | Region for which the fine-mapping was run                                                                                                                                                                                                |
| cs                 | Running number for independent credible sets in a region                                                                                                                                                                                 |
| cs\_log10bf        | Log10 Bayes factor comparing the solution of this model (cs independent credible sets) to cs -1 credible sets.                                                                                                                           |
| cs\_avg\_r2        | Average correlation R2 between variants in the credible set                                                                                                                                                                              |
| cs\_min\_r2        | Minimum R2 between variants in the credible set                                                                                                                                                                                          |
| low\_purity        | boolean (TRUE, FALSE) indicator if the CS is low purity (low min R2)                                                                                                                                                                     |
| cs\_size           | How many SNPs the credible set contains                                                                                                                                                                                                  |
| good\_cs           | boolean (TRUE, FALSE) indicator if this CS is considered reliable. If this is FALSE then the top variant reported for the CS is chosen based on minimum p-value in the credible set, otherwise the top variant is chosen by maximum PIP |
| cs\_id             | Credible set ID                                                                                                                                                                                                                          |
| v                  | Top variant (chr:pos:ref:alt). The top variant is the max PIP variant if the credible set has good\_cs==TRUE, otherwise it is the min p variant.                                                                                         |
| p                  | Top variant p-value                                                                                                                                                                                                                      |
| beta               | Top variant beta                                                                                                                                                                                                                         |
| sd                 | Top variant standard deviation                                                                                                                                                                                                           |
| prob               | overall PIP of the variant in the region                                                                                                                                                                                                 |
| cs\_specific\_prob | PIP of the variant in the current credible set (this and previous are typically almost identical)                                                                                                                                        |
| 0..n               | Configured annotation columns. Typical default most\_severe, gene\_most\_severe giving consequence and gene of top variant                                                                                                               |
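
A common first step with the summary file is to keep only credible sets flagged as reliable. The sketch below assumes rows have already been parsed into dictionaries of strings (for instance with `csv.DictReader`); the function names are illustrative, and requiring both `good_cs` and not `low_purity` is one possible choice rather than a documented rule.

```python
from typing import Dict, Iterable, List


def reliable_credible_sets(rows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep rows where good_cs is TRUE and low_purity is FALSE."""
    return [
        row for row in rows
        if row["good_cs"].upper() == "TRUE" and row["low_purity"].upper() == "FALSE"
    ]


def top_variant_rule(row: Dict[str, str]) -> str:
    """Describe how the top variant `v` was picked, following the good_cs column."""
    return "max PIP" if row["good_cs"].upper() == "TRUE" else "min p-value"
```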

#### PHENONAME.SUSIE.snp.bgz and PHENONAME.SUSIE\_99.snp.bgz

These files contain SuSiE data for all of the variants in all of the regions. The files are tab-delimited and bgzipped, and each has a tabix index (PHENONAME.SUSIE.snp.bgz.tbi and PHENONAME.SUSIE\_99.snp.bgz.tbi). Each line contains one variant. The columns are described in the table below.

| **Column** | **Description**                                                                                                                                           |
| ---------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
| trait      | Phenotype                                                                                                                                                 |
| region     | Region for which the fine-mapping was run                                                                                                                 |
| v, rsid    | Variant IDs                                                                                                                                               |
| chromosome | Chromosome no.                                                                                                                                            |
| position   | Position on the chromosome                                                                                                                                |
| allele1    | Major allele                                                                                                                                              |
| allele2    | Minor allele                                                                                                                                              |
| maf        | Minor allele frequency                                                                                                                                    |
| beta       | Original marginal beta                                                                                                                                    |
| se         | Original standard error                                                                                                                                   |
| p          | Original p-value                                                                                                                                          |
| mean       | Posterior mean beta after fine-mapping                                                                                                                    |
| sd         | Posterior standard deviation after fine-mapping                                                                                                           |
| prob       | Posterior inclusion probability                                                                                                                           |
| cs         | Credible set index within region                                                                                                                          |
| lead\_r2   | R2 value for a lead variant (the one with maximum PIP) in a credible set                                                                                  |
| alphax     | Posterior inclusion probability for the xth single effect (x := 1..L where L is the number of single effects/causal variants specified; default: L = 10). |
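
In SuSiE, a variant's overall PIP combines its per-effect inclusion probabilities: the variant is included unless every single effect excludes it, i.e. `PIP = 1 - prod(1 - alphax)` over all x. The sketch below assumes the columns are literally named `alpha1`, `alpha2`, and so on (inferred from the `alphax` pattern, not confirmed here). susieR can leave out single effects with negligible estimated prior variance when computing PIPs, so the recomputed value need not match `prob` exactly.

```python
import math
import re
from typing import Dict


def pip_from_alphas(row: Dict[str, str]) -> float:
    """Combine per-effect inclusion probabilities into one PIP: 1 - prod(1 - alpha_x)."""
    # Column naming (alpha1, alpha2, ...) is an assumption for this example.
    alphas = [float(v) for k, v in row.items() if re.fullmatch(r"alpha\d+", k)]
    return 1.0 - math.prod(1.0 - a for a in alphas)
```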

#### PHENONAME.SUSIE.snp.filter.tsv, PHENONAME.SUSIE\_99.snp.filter.tsv and PHENONAME.SUSIE\_extend.snp.filter.tsv

These files contain the filtered SNPs for the 95% (PHENONAME.SUSIE.snp.filter.tsv) and 99% (PHENONAME.SUSIE\_99.snp.filter.tsv) credible sets. Variants outside the 95% or 99% credible sets are omitted from the respective files, as are variants that were part of low\_purity credible sets. The file PHENONAME.SUSIE\_extend.snp.filter.tsv contains the filtered SNPs for the 95% credible sets, extended with 99% credible set variants where applicable; variants not included in the 95%/99% credible sets are not included in this file. Variants are listed one per line and the files are tab-delimited. The columns are described in the table below:

| **Column**         | **Description**                               |
| ------------------ | --------------------------------------------- |
| trait              | Phenotype                                     |
| region             | Region for which the fine-mapping was run     |
| v                  | Variant ID (chr:pos:ref:alt)                  |
| cs                 | Running credible set ID within region         |
| cs\_specific\_prob | Posterior inclusion probability for this CS   |
| chromosome         | Chromosome no.                                |
| position           | Position on the chromosome                    |
| allele1            | Major allele                                  |
| allele2            | Minor allele                                  |
| maf                | Minor allele frequency                        |
| beta               | Original association beta                     |
| p                  | Original p-value                              |
| se                 | Original standard error                       |
| most\_severe       | Most severe consequence of the variant        |
| gene\_most\_severe | Gene corresponding to most severe consequence |
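
Since the filter files list one variant per line, the membership of each credible set can be rebuilt by grouping on `region` and `cs`. A minimal sketch, assuming rows are already parsed into dictionaries of strings; the function name is illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def variants_per_credible_set(
    rows: Iterable[Dict[str, str]],
) -> Dict[Tuple[str, str], List[str]]:
    """Map (region, cs) to its variants, sorted by descending cs_specific_prob."""
    grouped = defaultdict(list)
    for row in rows:
        key = (row["region"], row["cs"])
        grouped[key].append((float(row["cs_specific_prob"]), row["v"]))
    return {
        key: [v for _, v in sorted(members, key=lambda m: m[0], reverse=True)]
        for key, members in grouped.items()
    }
```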

### [FINEMAP](http://www.christianbenner.com/) outputs

#### PHENONAME.FINEMAP.config.bgz

This file contains posterior summaries for all of the causal configurations, one per line. The columns are described in the following table. More information can be found at <http://www.christianbenner.com/>.

| **Column**    | **Description**                                                             |
| ------------- | --------------------------------------------------------------------------- |
| trait         | Phenotype                                                                   |
| region        | Region for which the fine-mapping was run                                   |
| rank          | Rank of this configuration within a region                                  |
| config        | Causal variants in this configuration                                       |
| prob          | Probability across all n independent signal configurations                  |
| log10bf       | Log10 Bayes factor for this configuration                                   |
| odds          | Odds for this configuration                                                 |
| k             | How many independent signals are in this configuration                      |
| prob\_norm\_k | Probability of this configuration within k independent signals solution     |
| h2            | SNP heritability of this solution                                           |
| #NAME?        | 95% confidence interval limits of SNP heritability of this solution         |
| mean          | Marginalized shrinkage estimates of the posterior effect size mean          |
| sd            | Marginalized shrinkage estimates of the posterior effect standard deviation |

#### PHENONAME.FINEMAP.region.bgz

This bgzipped, tab-delimited file contains all of the finemapped regions for the endpoint, one region per line. The columns are described in the following table. More information can be found at <http://www.christianbenner.com/>.

| **Column**      | **Description**                                                         |
| --------------- | ----------------------------------------------------------------------- |
| trait           | Phenotype                                                               |
| region          | Region for which the fine-mapping was run                               |
| h2g\_snp or h2g | SNP heritability of this region                                         |
| h2g\_sd         | Standard deviation of SNP heritability of this region                   |
| h2g\_lower95    | Lower limit of 95% CI for SNP heritability                              |
| h2g\_upper95    | Upper limit of 95% CI for SNP heritability                              |
| log10bf         | Log10 Bayes factor compared against null (no signals in the region)     |
| prob\_xSNP      | x columns for probabilities of different numbers of independent signals |
| expectedvalue   | Expectation (average) of the number of signals                          |
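
The `expectedvalue` column relates to the `prob_xSNP` columns as an ordinary expectation: the sum over x of x times the probability of x signals. The sketch below recomputes it from one parsed row, assuming the columns are literally named `prob_1SNP`, `prob_2SNP`, and so on (inferred from the `prob_xSNP` pattern); the function name is illustrative.

```python
import re
from typing import Dict


def expected_signals(row: Dict[str, str]) -> float:
    """Sum x * P(x signals) over columns named like prob_1SNP, prob_2SNP, ..."""
    total = 0.0
    for name, value in row.items():
        match = re.fullmatch(r"prob_(\d+)SNP", name)
        if match:
            total += int(match.group(1)) * float(value)
    return total
```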

#### PHENOTYPE.FINEMAP.snp.bgz

This tab-delimited, bgzipped file contains finemapping information for each of the SNPs that were finemapped, with one variant per line. This file also has a tabix index named **PHENOTYPE.FINEMAP.snp.bgz.tbi**. The columns of the file are described in the table below.

| **Column** | **Description**                                                             |
| ---------- | --------------------------------------------------------------------------- |
| trait      | Phenotype                                                                   |
| region     | Region for which the fine-mapping was run                                   |
| v          | Variant                                                                     |
| index      | Running index                                                               |
| rsid       | Variant ID                                                                  |
| chromosome | Chromosome no.                                                              |
| position   | Position on the chromosome                                                  |
| allele1    | Major allele                                                                |
| allele2    | Minor allele                                                                |
| maf        | Minor allele frequency                                                      |
| beta       | Original marginal beta (effect size)                                        |
| se         | Original standard error                                                     |
| z          | Original z-score                                                            |
| prob       | Posterior inclusion probability                                             |
| log10bf    | Log10 Bayes factor                                                          |
| mean       | Marginalized shrinkage estimates of the posterior effect size mean          |
| sd         | Marginalized shrinkage estimates of the posterior effect standard deviation |
| mean\_incl | Conditional estimates of the posterior effect size mean                     |
| sd\_incl   | Conditional estimates of the posterior effect size standard deviation       |
| p          | Original p-value                                                            |
| csx        | Credible set index for given number of causal variants x                    |
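
As a quick sanity check on the marginal statistics, the z-score can be compared against `beta / se`, and a two-sided p-value can be derived from it under a normal approximation. This is a sketch under those assumptions only; the original association software may compute `p` differently, so small discrepancies are possible.

```python
import math


def z_score(beta: float, se: float) -> float:
    """Wald z-score from the marginal effect size and its standard error."""
    return beta / se


def two_sided_p(z: float) -> float:
    """Two-sided p-value under a standard normal: P(|Z| >= |z|)."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```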

Read more about [Finemapping](https://docs.finngen.fi/background-reading/finemapping)


