# Autoreporting results format

Autoreporting outputs two types of tab-separated results: **variant** and **group** reports. Group reports contain information about the groups built around [credible sets](https://docs.finngen.fi/background-reading/colocalization), with one credible set per row. They provide information about the credible set and its lead variant combined with various annotations.

The variant reports list all of the variants in the group reports' groups. These variants include credible set lead variants, as well as variants that were LD clumped together with the group lead variant. These are also combined with a set of annotations. For more information about the Autoreporting tool and how it works, see [Autoreporting in FinnGen](https://docs.finngen.fi/finngen-data-specifics/green-library-data-aggregate-data/other-analyses-available/autoreporting-information-of-overlaps).

Columns can vary between releases: the autoreporting pipeline has evolved to include more information, while other data has been removed as redundant. The formats described below are for FinnGen release 13 (R13).

### Variant reports

The variant report is a tab-delimited (.tsv) file summarizing the results of the endpoint’s autoreporting run. It details association statistics for all lead variants and their LD-clumped partners across every credible set. Annotations are sourced from results from the previous FinnGen release, the gnomAD database and the GWAS Catalog. The columns are as follows:

<table><thead><tr><th width="274">Column</th><th width="200">Description</th><th>Example value/Formatting</th></tr></thead><tbody><tr><td>#chrom</td><td>chromosome of variant</td><td><code>1</code></td></tr><tr><td>pos</td><td>variant position</td><td><code>123987</code></td></tr><tr><td>ref</td><td>variant reference allele</td><td><code>A</code></td></tr><tr><td>alt</td><td>variant alternate allele</td><td><code>C</code></td></tr><tr><td>pval</td><td>variant p-value</td><td><code>5.01e-7</code></td></tr><tr><td>beta</td><td>effect size of alternate allele</td><td><code>0.025026</code></td></tr><tr><td>r2_to_lead</td><td>LD r<sup>2</sup> value between this variant and the credible set's lead variant</td><td><code>0.85479</code></td></tr><tr><td>mlogp</td><td>-log10 p-value of variant</td><td><code>6.30016</code></td></tr><tr><td>sebeta</td><td>standard error of effect size</td><td><code>0.004979</code></td></tr><tr><td>af_alt</td><td>allele frequency of the alternate allele</td><td><code>0.405409</code></td></tr><tr><td>af_alt_cases</td><td>allele frequency of the alternate allele in cases</td><td><code>0.411388</code></td></tr><tr><td>af_alt_controls</td><td>allele frequency of the alternate allele in controls</td><td><code>0.403217</code></td></tr><tr><td>cs_id</td><td>credible set id</td><td><code>chr1_123456_A_C_1</code></td></tr><tr><td>cs_region</td><td>finemapping region</td><td><code>1:1-30000001</code></td></tr><tr><td>cs_number</td><td>credible set number in finemapping region</td><td><code>4</code></td></tr><tr><td>cs_prob</td><td>variant's posterior inclusion probability (PIP), representing the likelihood of it being the true causal variant in its credible set</td><td><code>0.0629503</code></td></tr><tr><td>cs_log10bf</td><td>log10 Bayes factor for finemapping solution including this credible set vs. 
solution not including it</td><td><code>3.96812</code></td></tr><tr><td>cs_min_r2</td><td>minimum LD r<sup>2</sup> between variants in the credible set</td><td><code>0.46702</code></td></tr><tr><td>cs_size</td><td>credible set size</td><td><code>5</code></td></tr><tr><td>good_cs</td><td>Whether the credible set of this locus is of good quality. Currently, a good credible set is one whose minimum LD r<sup>2</sup> between its variants is larger than 0.25.</td><td><code>True</code>/<code>False</code></td></tr><tr><td>#variant</td><td>variant ID</td><td><code>chr1_123987_A_C</code></td></tr><tr><td>locus_id</td><td>The locus in question, formatted from the top SNP's chromosome, position, reference and alternate alleles. In case of credible set grouping, the top SNP is the variant with the largest PIP in that credible set. In case of LD and simple grouping, the top SNP is the variant with the smallest p-value of that group/region. Most if not all release results are grouped around credible sets.</td><td><code>chr1_1_C_T</code> for a lead variant with chromosome 1, position 1, reference allele C and alternate allele T.</td></tr><tr><td>pos_rmax</td><td>maximum basepair position of locus, including credible set and LD partners</td><td><code>30000001</code></td></tr><tr><td>pos_rmin</td><td>minimum basepair position of locus, including credible set and LD partners</td><td><code>1</code></td></tr><tr><td>phenotype</td><td>phenotype code</td><td>-</td></tr><tr><td>longname</td><td>phenotype description</td><td>-</td></tr><tr><td>category</td><td>phenotype's ICD chapter or FinnGen endpoint category</td><td><code>II Neoplasms, from cancer register (ICD-O-3)</code></td></tr><tr><td>n_cases</td><td>number of cases for this phenotype</td><td>1234</td></tr><tr><td>n_controls</td><td>number of controls for this phenotype</td><td>1234</td></tr><tr><td>enrichment_nfsee</td><td>same as GENOME_FI_enrichment_nfe_est column</td><td><code>0.888291</code></td></tr><tr><td>fin.AF</td><td>variant 
Finnish-ancestry allele frequency from gnomAD genome-seq data</td><td><code>0.123</code></td></tr><tr><td>fin.AN</td><td>called allele count (either ref or alt) for variant in Finnish-ancestry samples, sourced from gnomAD genome-seq data</td><td><code>1234</code></td></tr><tr><td>fin.AC</td><td>variant Finnish-ancestry allele count from gnomAD genome-seq data</td><td><code>123</code></td></tr><tr><td>fin.homozygote_count</td><td>number of homozygote carriers in Finnish-ancestry samples from gnomAD genome-seq data</td><td><code>12</code></td></tr><tr><td>fet_nfsee.odds_ratio</td><td>odds ratio of Fisher's exact test for enrichment of alt allele in Finnish-ancestry vs. non-Finnish, Swedish or Estonian Europeans, based on gnomAD genome-seq data</td><td><code>1.15</code></td></tr><tr><td>fet_nfsee.p_value</td><td>p-value of Fisher's exact test for enrichment of alt allele in Finnish-ancestry vs. non-Finnish, Swedish or Estonian Europeans, based on gnomAD genome-seq data</td><td><code>5.01e-3</code></td></tr><tr><td>nfsee.AC</td><td>variant non-Finnish, Swedish or Estonian European allele count from gnomAD genome-seq data</td><td><code>123</code></td></tr><tr><td>nfsee.AN</td><td>called allele count (either ref or alt) for variant in non-Finnish, Swedish or Estonian Europeans, sourced from gnomAD genome-seq data</td><td><code>1234</code></td></tr><tr><td>nfsee.AF</td><td>variant non-Finnish, Swedish or Estonian Europeans allele frequency from gnomAD genome-seq data</td><td><code>0.123</code></td></tr><tr><td>nfsee.homozygote_count</td><td>number of homozygote carriers in non-Finnish, Swedish or Estonian Europeans from gnomAD genome-seq data</td><td><code>123</code></td></tr><tr><td>most_severe_gene*</td><td>most severe gene of the variant</td><td><code>APOE</code></td></tr><tr><td>most_severe_consequence</td><td>most severe consequence of variant</td><td><code>missense_variant</code></td></tr><tr><td>FG_INFO</td><td>variant imputation INFO score in this FinnGen 
release</td><td><code>0.9951358663750908</code></td></tr><tr><td>n_INFO_gt_0_6</td><td></td><td></td></tr><tr><td>functional_category</td><td>variant functional category from gnomAD genome-seq data</td><td><code>pLoF</code></td></tr><tr><td>rsids</td><td>rsids of this variant</td><td><code>rs1234</code></td></tr><tr><td>GNOMAD_AF_fin</td><td>Finnish allele frequency from gnomAD 4.1</td><td>0.45</td></tr><tr><td>GNOMAD_AF_nfe</td><td>non-Finnish European allele frequency in gnomAD 4.1</td><td>0.45</td></tr><tr><td>GNOMAD_FI_enrichment_nfe</td><td>Finnish enrichment of the variant compared to non-Finnish European population in gnomAD 4.1 </td><td>1.2</td></tr><tr><td>pval_previous_release</td><td>variant p-value for same endpoint in previous FinnGen release</td><td><code>3.55901e-11</code></td></tr><tr><td>beta_previous_release</td><td>variant effect size of alternate allele for same endpoint in previous FinnGen release</td><td><code>0.042283</code></td></tr><tr><td>#variant_hit</td><td>same as #variant column if variant has been previously reported as a GWAS hit</td><td><code>chr1_123987_A_C</code></td></tr><tr><td>pval_trait</td><td>p-value of variant for previously reported GWAS hit</td><td><code>6e-32</code></td></tr><tr><td>trait</td><td><a href="https://www.ebi.ac.uk/ols4/ontologies/efo">EFO ID</a> of trait for which variant has been previously reported as a GWAS hit</td><td><code>EFO_0005711</code></td></tr><tr><td>trait_name</td><td>name/description of trait for which variant has been previously reported as a GWAS hit</td><td><code>household income</code></td></tr><tr><td>study_link</td><td>pubmed link to study where variant has been previously reported as a GWAS hit (if available)</td><td>www.ncbi.nlm.nih.gov/pubmed/01234567</td></tr></tbody></table>
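
As a rough illustration of how these columns fit together, the sketch below reads a variant report with Python's standard `csv` module. The two sample rows are hypothetical: the first is built from the example values in the table and the second is invented for illustration. `read_variant_report` and `parse_variant_id` are illustrative helper names, not part of the Autoreporting tool.

```python
import csv
import io
import math

# Hypothetical excerpt with only a few of the columns described above.
SAMPLE_TSV = (
    "#chrom\tpos\tref\talt\tpval\tmlogp\tcs_id\tcs_min_r2\tgood_cs\t#variant\n"
    "1\t123987\tA\tC\t5.01e-7\t6.30016\tchr1_123456_A_C_1\t0.46702\tTrue\tchr1_123987_A_C\n"
    "1\t124500\tG\tT\t3.2e-6\t5.49485\tchr1_123456_A_C_1\t0.46702\tTrue\tchr1_124500_G_T\n"
)


def read_variant_report(handle):
    """Return the rows of a tab-separated report as dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


def parse_variant_id(variant_id):
    """Split an ID such as 'chr1_123987_A_C' into (chrom, pos, ref, alt)."""
    chrom, pos, ref, alt = variant_id.split("_")
    if chrom.startswith("chr"):
        chrom = chrom[3:]
    return chrom, int(pos), ref, alt


rows = read_variant_report(io.StringIO(SAMPLE_TSV))
for row in rows:
    # mlogp should agree with -log10(pval) up to rounding.
    recomputed = -math.log10(float(row["pval"]))
    # good_cs is stored as the text "True" or "False".
    is_good = row["good_cs"] == "True"
    print(parse_variant_id(row["#variant"]), round(recomputed, 5), is_good)
```

To read a real file, pass an open file handle instead of the `io.StringIO` wrapper. For extremely small p-values, `float(row["pval"])` can underflow to zero, in which case the `mlogp` column is the safer one to sort or filter on.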

### Group reports

The group report is a tab-delimited (.tsv) file containing the group-level ([**credible set**](https://docs.finngen.fi/background-reading/colocalization)) summary of an autoreporting run. It is aggregated from the variant report, with one credible set per row. Annotations are sourced from results from the previous FinnGen release, the gnomAD database and the GWAS Catalog. The columns are as follows:

<table><thead><tr><th>Column</th><th width="200">Description</th><th>Example value/Formatting</th></tr></thead><tbody><tr><td>phenotype</td><td>phenotype name</td><td>-</td></tr><tr><td>phenotype_abbreviation</td><td>phenotype code</td><td>-</td></tr><tr><td>locus_id</td><td>The locus in question, formatted from the top SNP's chromosome, position, reference and alternate alleles. In case of credible set grouping, the top SNP is the variant with the largest PIP in that credible set. In case of LD and simple grouping, the top SNP is the variant with smallest p-value of that group/region. Most if not all release results are grouped around credible sets.</td><td><code>chr1_1_C_T</code> for a lead variant with chromosome 1, position 1, reference allele C and alternate allele T.</td></tr><tr><td>rsids</td><td>rsids of this variant</td><td>rs1234</td></tr><tr><td>Cases</td><td>Number of cases for this phenotype</td><td>1234</td></tr><tr><td>Controls</td><td>Number of controls for this phenotype</td><td>1234</td></tr><tr><td>chrom</td><td>chromosome of locus</td><td><code>1</code></td></tr><tr><td>pos</td><td>lead variant position</td><td><code>123456</code></td></tr><tr><td>ref</td><td>lead variant reference allele</td><td><code>A</code></td></tr><tr><td>alt</td><td>lead variant alternate allele</td><td><code>C</code></td></tr><tr><td>pval</td><td>lead variant p-value</td><td><code>5.01e-7</code></td></tr><tr><td>lead_r2_threshold</td><td>minimum LD r<sup>2</sup> with lead variant for inclusion in this locus, calculated based on a minimum expected <span class="math">\chi^2</span> statistic of 5 - see <a href="https://docs.finngen.fi/faq/about-finngen-data/what-is-the-difference-is-between-ld-clumping-and-the-saige-conditional-analysis">lower half of this page</a> for more detail.</td><td>0.13822</td></tr><tr><td>lead_beta_previous_release</td><td>effect size of alternate allele for same endpoint in previous FinnGen 
release</td><td>0.042283</td></tr><tr><td>lead_pval_previous_release</td><td>p-value for same endpoint in previous FinnGen release</td><td>3.55901e-11</td></tr><tr><td>lead_most_severe_consequence</td><td>most severe consequence of lead variant</td><td><code>missense_variant</code></td></tr><tr><td>lead_most_severe_gene*</td><td>most severe gene of the lead variant</td><td><code>APOE</code></td></tr><tr><td>lead_enrichment</td><td>How much the lead variant is enriched in the Finnish population compared to Europe</td><td><code>4.35</code></td></tr><tr><td>lead_$COLUMN_NAME</td><td>other columns that are grabbed for the lead variant, such as effect size, standard error, p-value and allele frequencies</td><td>-</td></tr><tr><td>gnomAD_functional_category</td><td>functional category for the variant. Exome data.</td><td><code>pLoF</code></td></tr><tr><td>gnomAD_enrichment_nfsee</td><td>lead variant enrichment in Finland against NFSEE (Europeans that are not Finnish, Swedish or Estonian) population from Exome data.</td><td><code>5.1</code></td></tr><tr><td>gnomAD_fin.AF</td><td>lead variant allele frequency in Finland. Exome data.</td><td><code>0.123</code></td></tr><tr><td>gnomAD_fin.AN</td><td>lead variant allele number in Finland. Exome data.</td><td><code>123</code></td></tr><tr><td>gnomAD_fin.AC</td><td>lead variant allele count in Finland. Exome data.</td><td><code>123</code></td></tr><tr><td>gnomAD_fin.homozygote_count</td><td>Number of homozygote carriers in Finnish population. Exome data.</td><td><code>12</code></td></tr><tr><td>gnomAD_fet_nfsee.odds_ratio</td><td>Fisher's exact test for enrichment FIN vs. NFSEE odds ratio. Exome data.</td><td><code>1.15</code></td></tr><tr><td>gnomAD_fet_nfsee.p_value</td><td>Fisher's exact test for enrichment FIN vs. NFSEE p-value. Exome data.</td><td><code>5.01e-3</code></td></tr><tr><td>gnomAD_nfsee.AC</td><td>lead variant NFSEE population allele count. 
Exome data.</td><td><code>123</code></td></tr><tr><td>gnomAD_nfsee.AN</td><td>lead variant NFSEE population allele number. Exome data.</td><td><code>123</code></td></tr><tr><td>gnomAD_nfsee.AF</td><td>lead variant NFSEE population allele frequency. Exome data.</td><td><code>0.123</code></td></tr><tr><td>gnomAD_nfsee.homozygote_count</td><td>Number of homozygote carriers in NFSEE population. Exome data.</td><td><code>123</code></td></tr><tr><td>cs_id</td><td>credible set id</td><td><code>chr1_123456_A_C_1</code></td></tr><tr><td>cs_size</td><td>credible set size</td><td><code>5</code></td></tr><tr><td>cs_log_bayes_factor</td><td>log10 Bayes factor of the credible set</td><td><code>5.21</code></td></tr><tr><td>cs_number</td><td>credible set number in its region</td><td><code>1</code></td></tr><tr><td>cs_region</td><td>finemapping region</td><td><code>1:1-30000001</code></td></tr><tr><td>good_cs</td><td>Whether the credible set of this locus is of good quality. Currently, a good credible set is one whose minimum LD r<sup>2</sup> between its variants is larger than 0.25.</td><td><code>True</code>/<code>False</code></td></tr><tr><td>credible_set_min_r2_value</td><td>Minimum LD r<sup>2</sup> value between credible set variants</td><td><code>0.4</code></td></tr><tr><td>best_coding_var</td><td>The variant in credible set that has a functional consequence, and has the largest PIP.</td><td><code>chr1_1_A_T</code></td></tr><tr><td>best_coding_var_consequence</td><td>functional consequence for the best coding variant</td><td><code>missense_variant</code></td></tr><tr><td>best_coding_var_gene</td><td>gene in which the best coding variant has the consequence</td><td><code>GENE1</code></td></tr><tr><td>best_coding_var_af</td><td>Finnish allele frequency of the best coding variant</td><td><code>0.30685</code></td></tr><tr><td>best_coding_var_eur_af</td><td>non-Finnish allele frequency of the best coding variant. 
Taken from gnomAD annotation.</td><td><code>0.32667</code></td></tr><tr><td>best_coding_var_beta</td><td>effect size of the best coding variant</td><td><code>0.059923</code></td></tr><tr><td>best_coding_var_p</td><td>p-value of best coding variant</td><td><code>4.16296e-5</code></td></tr><tr><td>start</td><td>locus start position in basepairs</td><td><code>1</code> for a group with positions [1,2,3,4,5]</td></tr><tr><td>end</td><td>locus end position in basepairs</td><td><code>5</code> for a group with positions [1,2,3,4,5]</td></tr><tr><td>found_associations_strict</td><td>This column lists all of the trait associations found in GWAS Catalog for variants that are in the credible set/strict group (strict group here means that in case of LD grouping, variants that are in higher LD than a given threshold, and have p-values lower than the significance threshold). The trait name is followed by the LD r² that reported variant had with the lead variant. If there are multiple variants associated with that trait, the largest value is chosen.</td><td><code>trait1|0.86;trait2|0.45</code></td></tr><tr><td>found_associations_relaxed</td><td>This column lists all of the trait associations found in GWAS Catalog for variants in the group. The trait name is followed by the LD r² to lead value of the variant that had the association. If there are multiple variants associated with that trait, the largest value is chosen.</td><td><code>trait1|0.86;trait2|0.45</code></td></tr><tr><td>credible_set_variants</td><td>This column lists the credible set variants. The PIP and R² values are listed after the variant</td><td><code>chr1_1_A_T|0.25|0.999</code></td></tr><tr><td>functional_variants_strict</td><td>All of the variants with a functional consequence, with the functional consequence label, gene and R² to lead variant. 
The variants are part of the credible set/strict group.</td><td><code>chr1_1_A_T|missense_variant|GENE1|0.45</code></td></tr><tr><td>functional_variants_relaxed</td><td>All of the variants with a functional consequence, with the functional consequence label, gene and R² to lead variant. The variants can be any variants in the group.</td><td>chr1_1_A_T|missense_variant|GENE1|0.45</td></tr><tr><td>specific_efo_trait_associations_strict</td><td>If specific traits were given to the script (e.g. equivalent EFO codes to the phenotype in question), any trait associations corresponding to those traits are listed here. This column lists only associations where the variant is in the credible set/strict group.</td><td>trait1|0.86;trait2|0.45</td></tr><tr><td>specific_efo_trait_associations_relaxed</td><td>If specific traits were given to the script (e.g. equivalent EFO codes to the phenotype in question), any trait associations corresponding to those traits are listed here. This column lists associations to all variants in the group.</td><td>trait1|0.86;trait2|0.45</td></tr><tr><td>n_ld_partners_0_8</td><td>number of nearby variants with LD r<sup>2</sup><span class="math">\ge</span>0.8 with lead variant</td><td><code>5</code></td></tr><tr><td>n_ld_partners_0_6</td><td>number of nearby variants with LD r<sup>2</sup><span class="math">\ge</span>0.6 with lead variant</td><td><code>23</code></td></tr></tbody></table>
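
Several group report columns pack multiple values into one cell, using `|` within an entry and `;` between entries. The sketch below shows one way to unpack them in Python. The function names are illustrative only. It also rests on two assumptions not stated in the table: that `credible_set_variants` separates multiple entries with `;` in the same way as the association columns (the example shows a single entry), and that an empty cell means no entries.

```python
def parse_trait_associations(field):
    """Parse 'trait1|0.86;trait2|0.45' into {'trait1': 0.86, 'trait2': 0.45}."""
    associations = {}
    if not field:
        return associations  # assumption: an empty cell means no associations
    for entry in field.split(";"):
        # Split on the last '|' so the trait name is kept whole.
        trait, r2 = entry.rsplit("|", 1)
        associations[trait] = float(r2)
    return associations


def parse_credible_set_variants(field):
    """Parse 'chr1_1_A_T|0.25|0.999' entries into (variant, pip, r2) tuples."""
    variants = []
    if not field:
        return variants
    # Assumption: multiple entries are separated by ';'.
    for entry in field.split(";"):
        variant, pip, r2 = entry.split("|")
        variants.append((variant, float(pip), float(r2)))
    return variants


print(parse_trait_associations("trait1|0.86;trait2|0.45"))
print(parse_credible_set_variants("chr1_1_A_T|0.25|0.999"))
```

Trait names that themselves contain `;` would need more careful handling than this sketch provides.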

\*The HGNC symbols in the autoreporting files are version 38, as is the VEP cache. The "most\_severe\_gene" and "most\_severe" columns in the autoreporting file come from the FinnGen annotation file. The analysis team will update the release documentation to include this information (including those versions) for future releases.

For more information, see the release documentation:

`/library-green/finngen_R8_analysis_documentation/`

Read more about [Autoreporting in FinnGen](https://docs.finngen.fi/finngen-data-specifics/green-library-data-aggregate-data/other-analyses-available/autoreporting-information-of-overlaps), and see also the FAQ [Do the autoreports report the 95% or 99% credible sets?](https://docs.finngen.fi/faq/about-pheweb/do-the-autoreports-report-the-95-or-99-credible-set)
