Modifiable Finemapping pipeline
How to manually run finemapping with SuSiE and FINEMAP with the provided workflows
Finemapping
GWAS summary statistics do not tell you which variants are causal in any given region. This pipeline helps you identify both the likely causal variant(s) and the credible sets that are 95% and 99% likely to contain the causal variant. For more information about finemapping in general, see Finemapping; for information about the finemapping results files, see Finemapping results format.
We recommend the more user-friendly unmodifiable finemapping pipeline to run custom GWAS finemapping in release 12 (R12) or later.
Example finemapping workflow files
We provide example Cromwell workflow files, which can be found in sandbox in the folder /finngen/library-green/scripts/finemap/:
.json (input parameter) files: finemap_inputs.json (for R7), finemap_DF11.json (for R11), finemap_DF11_custom_bed.json (for R11 using user-defined regions to finemap), and finemap_DF12.json (for R12)
.wdl file: finemap.wdl
sub-wdl file: finemap_sub.wdl.zip
These example scripts run the finemapping pipeline for two endpoints, I9_CORATHER and I9_MI_STRICT, in FinnGen R7 (finemap_inputs.json), R11 (finemap_DF11.json) and R12 (finemap_DF12.json).
Once you have copied the example scripts, you need to edit finemap_inputs.json according to your needs. The parts you (may) need to edit are:
finemap.sumstats_pattern: Path to the summary statistics file(s). Replace phenotype names in the file names with {PHENO}. Example summary statistics files for R7: gs://finngen-production-library-green/finngen_R7/finngen_R7_analysis_data/summary_stats/release/finngen_R7_{PHENO}.gz
finemap.phenolistfile: Path to a .txt file containing the list of phenotypes to run, one phenotype on each row.
finemap.bed_regions_file: Path to a .txt file containing a list of bed files that give the regions to finemap. The bed files for each phenotype should be given in the same order as in the phenolistfile. Use integers 1-23 for chromosome names. An example for DF11 can be found in /finngen/library-green/scripts/finemap/finemap_DF11_custom_bed.json. Use this input variable only if you wish to define your own custom finemapping regions; omit it to use the default region selection.
finemap.phenotypes: Path to a phenotype-covariate file. An example for DF7 can be found at: /finngen/library-red/finngen_R7/phenotype_4.0/data/finngen_R7_cov_pheno_1.0.txt.gz
finemap.ldstore_finemap.ldstore.sample: Path to a sample file corresponding to your bgen file(s).
finemap.ldstore_finemap.ldstore.bgen_pattern: Path to the bgen files. Replace {chrom} with {CHR}. An example for full R7: /finngen/library-red/finngen_R7/bgen_2.0/data/finngen_R7_{CHR}.bgen
finemap.ldstore_finemap.filter_and_summarize.snp_annot_file: Path to a variant annotation file. For R7, it can be found at: /finngen/library-green/finngen_R7/finngen_R7_analysis_data/annotations/R7_annotated_variants_v1.gz
finemap.ldstore_finemap.filter_and_summarize.snp_annot_file_tbi: Path to an index file for the variant annotation file. For R7, it can be found at: /finngen/library-green/finngen_R7/finngen_R7_analysis_data/annotations/R7_annotated_variants_v1.gz.tbi
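The edits to the inputs file can also be scripted. The following is a minimal sketch, not part of the provided workflow files: it writes a phenotype list with one phenotype per row, loads a copy of an example inputs file, overrides selected keys, and saves the result. The helper name make_inputs and all file names passed to it are hypothetical; only the JSON key names come from the list above. Whether a path written this way is readable by the pipeline depends on where you store the file, so treat the paths as placeholders.

```python
import json
from pathlib import Path


def make_inputs(template_json, out_json, phenolist_path, phenotypes, overrides):
    """Write a phenotype list and an edited copy of an inputs JSON.

    make_inputs is a hypothetical helper for illustration only.
    """
    # One phenotype on each row, as described for finemap.phenolistfile.
    Path(phenolist_path).write_text("\n".join(phenotypes) + "\n")

    inputs = json.loads(Path(template_json).read_text())
    inputs["finemap.phenolistfile"] = str(phenolist_path)
    # Any other keys to change, for example "finemap.sumstats_pattern".
    inputs.update(overrides)

    Path(out_json).write_text(json.dumps(inputs, indent=4) + "\n")
    return inputs
```

Keys that are not overridden keep the values from the template, so the edited file stays as close as possible to the example it was copied from.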
For the following parameters, make sure the values correspond to the column names in your summary statistics file. These examples are for the released R7 summary statistics:
"finemap.preprocess.rsid_col": "",
"finemap.preprocess.chromosome_col": "#chrom",
"finemap.preprocess.position_col": "pos",
"finemap.preprocess.allele1_col": "ref",
"finemap.preprocess.allele2_col": "alt",
"finemap.preprocess.freq_col": "af_alt",
"finemap.preprocess.beta_col": "beta",
"finemap.preprocess.se_col": "sebeta",
"finemap.preprocess.p_col": "pval",
"finemap.preprocess.delimiter": "TAB",
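A mismatch between these column settings and the actual file header is easy to miss, so it can help to check it before submitting. The sketch below is an illustration under stated assumptions, not part of the pipeline: it reads the header line of a local copy of a summary statistics file and reports any configured column name that is absent. The function name missing_columns is hypothetical. The source only shows the delimiter value "TAB"; treating any other value as a literal separator string is an assumption, as is treating an empty value (as for rsid_col) as "column not used".

```python
import gzip

# Keys from the inputs JSON whose values name columns in the summary statistics.
COLUMN_KEYS = [
    "finemap.preprocess.rsid_col",
    "finemap.preprocess.chromosome_col",
    "finemap.preprocess.position_col",
    "finemap.preprocess.allele1_col",
    "finemap.preprocess.allele2_col",
    "finemap.preprocess.freq_col",
    "finemap.preprocess.beta_col",
    "finemap.preprocess.se_col",
    "finemap.preprocess.p_col",
]


def missing_columns(sumstats_path, inputs):
    """Return configured column names that are absent from the file header."""
    # Assumption: only "TAB" is documented; other values are used literally.
    delimiter = inputs.get("finemap.preprocess.delimiter", "TAB")
    sep = "\t" if delimiter == "TAB" else delimiter

    opener = gzip.open if str(sumstats_path).endswith(".gz") else open
    with opener(sumstats_path, "rt") as handle:
        header = handle.readline().rstrip("\r\n").split(sep)

    wanted = [inputs.get(key, "") for key in COLUMN_KEYS]
    # Assumption: an empty value means the column is not used, so skip it.
    return [name for name in wanted if name and name not in header]
```

An empty list means every configured column was found in the header; any names returned are the ones to correct in the inputs file.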
Submitting the workflow
Once you have edited the .json file for your specific analysis, you can submit your job to pipelines from the command line with the following command:
finngen-cli request-workflow --wdl /path/to/finemap.wdl \
--input /path/to/finemap_inputs.json \
--dependencies /path/to/finemap_sub.wdl.zip
Note: Remember to save the [WORKFLOW_ID] of your job for later monitoring and for checking the results! See also tips on how to find a pipeline job ID.
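If you submit jobs from a script, a small wrapper can catch a mistyped path before the request is sent. This is a minimal sketch: the helper name build_submit_command is hypothetical, and the only command-line flags used are the three shown in the command above. It checks that the files exist and returns the argument list without running anything.

```python
from pathlib import Path


def build_submit_command(wdl, inputs_json, dependencies_zip):
    """Return the finngen-cli argument list, after checking the files exist."""
    for path in (wdl, inputs_json, dependencies_zip):
        if not Path(path).is_file():
            raise FileNotFoundError(f"missing workflow file: {path}")
    # Same subcommand and flags as the command shown above; nothing is added.
    return [
        "finngen-cli", "request-workflow",
        "--wdl", str(wdl),
        "--input", str(inputs_json),
        "--dependencies", str(dependencies_zip),
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True) from the Python standard library. Keep the output of the command so that you can record the workflow ID.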
Results files
Note: If you are unfamiliar with the finemapping pipeline results, the formats of all output files are described on the Finemapping results format page.
When your job has completed successfully, you can find your FINEMAP results in: /finngen/pipelines/cromwell/workflows/finemap/[WORKFLOW_ID]/call-ldstore_finemap/shard-0/sub.ldstore_finemap/[sub_workflow_id]/call-finemap/shard-[N]/ where each phenotype you submit is given a unique sub_workflow_id and each finemapped locus of that phenotype is given its own shard subfolder (so [N] goes from 0 to n-1 for n loci).
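Because each phenotype has its own sub-workflow folder and each locus its own shard, it can be convenient to list the result folders programmatically. The sketch below follows the folder layout described above. The function name finemap_shard_dirs is hypothetical, and the sketch assumes that the shard folders are named shard-0, shard-1, and so on.

```python
from pathlib import Path

# Root folder taken from the path given in the text.
WORKFLOW_ROOT = Path("/finngen/pipelines/cromwell/workflows/finemap")


def finemap_shard_dirs(workflow_id, root=WORKFLOW_ROOT):
    """Map each sub_workflow_id to its call-finemap shard folders, in shard order."""
    base = (Path(root) / workflow_id / "call-ldstore_finemap"
            / "shard-0" / "sub.ldstore_finemap")
    result = {}
    for shard in base.glob("*/call-finemap/shard-*"):
        if shard.is_dir():
            # Layout: .../[sub_workflow_id]/call-finemap/shard-[N]
            sub_id = shard.parts[-3]
            result.setdefault(sub_id, []).append(shard)
    for shards in result.values():
        # Sort numerically so that shard-10 comes after shard-9.
        shards.sort(key=lambda p: int(p.name.split("-")[1]))
    return result
```

Each key of the returned dictionary corresponds to one submitted phenotype, and the length of its list is the number of finemapped loci for that phenotype.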
From there, you can find, for example:
.snp(.bgz) files, which contain the results, such as the probability of being causal (prob) for each variant in the region
.log_sss file, in which you can see the posterior probabilities for the credible sets in the region
glob-* subfolders, within which you can find your .cred* files. From these, you can get your credible sets, as well as some additional information on each credible set, such as its posterior probability and LD statistics among the variants in the set.
Results from SuSiE can be found at: /finngen/pipelines/cromwell/workflows/finemap/[WORKFLOW_ID]/call-ldstore_finemap/shard-0/sub.ldstore_finemap/[sub_workflow_id]/call-susie/shard-[N]/
From there, in the .snp file you can find the probability of being causal (prob) for each variant, as well as information on which variants are included in the credible set(s) (cs). A cs value of -1 means that the variant is not included in any credible set.
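As an illustration of how the prob and cs columns can be used together, the following sketch groups the variants of one SuSiE .snp file by credible set and orders each set by prob. It is a minimal example under assumptions that you should verify against the Finemapping results format page: the file is assumed to have a header row and whitespace-separated fields with no empty values. Only the column names prob and cs come from the text above, and the function name susie_credible_sets is hypothetical.

```python
import gzip
from collections import defaultdict


def susie_credible_sets(snp_path):
    """Group rows of a SuSiE .snp file by credible set, highest prob first."""
    # bgzip output is gzip-compatible, so gzip.open is used for .gz and .bgz.
    compressed = str(snp_path).endswith((".gz", ".bgz"))
    opener = gzip.open if compressed else open
    sets = defaultdict(list)
    with opener(snp_path, "rt") as handle:
        # Assumption: header row, whitespace-separated, no empty fields.
        header = handle.readline().split()
        for line in handle:
            fields = line.split()
            if not fields:
                continue
            row = dict(zip(header, fields))
            cs = int(float(row["cs"]))
            # cs == -1 marks a variant that is in no credible set.
            if cs != -1:
                sets[cs].append(row)
    for rows in sets.values():
        rows.sort(key=lambda r: float(r["prob"]), reverse=True)
    return dict(sets)
```

The result maps each credible set number to its variants as dictionaries keyed by the file's own column names, so no other column names need to be known in advance.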