GWAS Meta-analysis

What is a GWAS meta-analysis?

As explained on the GWAS Analysis page, a Genome-Wide Association Study (GWAS) identifies genetic variants associated with a phenotype within a single study population (like the FinnGen study cohort). Even in very large cohorts, however, there may not be enough statistical power to detect associations for rare variants or for variants with small effect sizes. This is where GWAS meta-analysis can be useful.

A GWAS meta-analysis is a statistical technique that combines the results from multiple independent GWAS that have tested the genetic associations of the same phenotype or trait in different cohorts. Rather than combining the raw, individual-level phenotype and genotype data, the resulting summary statistics (primarily the effect size β and its standard error) from each study can be combined for each overlapping genetic variant.

The goal is to increase the total sample size significantly, thereby boosting the statistical power to:

Detect novel associations that were too weak to reach genome-wide significance in any single study.
More precisely estimate the effect sizes ( $\beta$ ) of known variants.

How does it work?

The core concept of a GWAS meta-analysis is, for each genetic variant, to calculate a weighted average of the effect sizes ( $\beta$ ) across all contributing studies. The weighting is crucial:

Studies with more precise estimates (i.e., those with smaller standard errors, often due to a larger sample size) are given more weight in the final combined result.
The final output is a new set of summary statistics (combined $\beta$ , $\text{SE}$ , and P-value) for every tested variant, representing the evidence across all studies combined.

Extending the GWAS model

Simple GWAS linear regression model (quantitative phenotypes)

In the simple linear additive GWAS model described previously, we are trying to fit a "line of best fit" between the phenotype and genotype data. In mathematical terms, we are trying to find the parameters $\mu$ and $\beta$ that minimize the sum of squared errors $||\mathbf{\epsilon}||^2$ for the regression equation $\mathbf{y} = \mu + \mathbf{x}\beta + \mathbf{\epsilon}$ , where:

$\mathbf{y}$ is a vector of the individual phenotypes $(y_1,y_2,...,y_M)$ across all $M$ samples
$\mathbf{x}$ is a vector of the individual genotypes $(x_1,x_2,...,x_M)$ across all $M$ samples, each coded as 0, 1, or 2, representing the number of copies of the alternate allele an individual has
$\mu$ is the regression intercept (mean value of individuals without the variant, i.e., homozygous for reference allele - those with genotype $x$ as 0)
$\beta$ is the regression coefficient (effect size) which captures the average (linear) increase in the phenotype for each copy of the alternate allele
$\mathbf{\epsilon}$ is the vector of error terms, representing the deviation from the line of best fit for each individual, which is assumed to be normally distributed.

In recessive GWAS models, the same process is applied but genotypes $\mathbf{x}$ are first recoded so that the standard (0, 1, 2) are now coded as (0, 0, 1), so that $\beta$ represents the effect on the phenotype of being homozygous for the alternate allele versus not being homozygous for the alternate allele. Similarly, for dominant GWAS models, the genotypes $\mathbf{x}$ are recoded from the standard (0, 1, 2) to (0, 1, 1) meaning that the GWAS $\beta$ represents the effect on the phenotype of carrying at least one alternate allele versus carrying no alternate alleles.
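The genotype recoding and least-squares fit described above can be sketched in a few lines of Python. This is a minimal illustration only, not how production GWAS software works: the function names `recode_genotypes` and `fit_linear_gwas` and the toy phenotype values are hypothetical, and the sketch assumes the variant is polymorphic in the sample and that no covariates are included.

```python
import math
from statistics import fmean

def recode_genotypes(genotypes, model="additive"):
    """Recode 0/1/2 alternate-allele counts for the chosen genetic model."""
    if model == "additive":
        return [float(g) for g in genotypes]                 # (0, 1, 2) unchanged
    if model == "recessive":
        return [1.0 if g == 2 else 0.0 for g in genotypes]   # (0, 1, 2) -> (0, 0, 1)
    if model == "dominant":
        return [1.0 if g >= 1 else 0.0 for g in genotypes]   # (0, 1, 2) -> (0, 1, 1)
    raise ValueError(f"unknown model: {model}")

def fit_linear_gwas(y, x):
    """Least-squares estimates of mu, beta and SE(beta) for y = mu + x*beta + eps."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx                     # slope: change in phenotype per allele copy
    mu = y_bar - beta * x_bar            # intercept: expected phenotype at x = 0
    rss = sum((yi - mu - beta * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (len(y) - 2) / sxx)   # standard error of beta
    return mu, beta, se

# Toy data (hypothetical): six individuals, two per genotype class
genotypes = [0, 0, 1, 1, 2, 2]
phenotypes = [1.0, 1.2, 1.9, 2.1, 3.0, 3.2]

mu, beta, se = fit_linear_gwas(phenotypes, recode_genotypes(genotypes, "additive"))
mu_rec, beta_rec, se_rec = fit_linear_gwas(phenotypes, recode_genotypes(genotypes, "recessive"))
```

Under the additive coding, `beta` is the average change in phenotype per copy of the alternate allele; under the recessive coding, `beta_rec` is the difference in mean phenotype between alternate-allele homozygotes and everyone else.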

Simple GWAS logistic regression model (binary phenotypes)

For binary phenotypes and traits, the model and the interpretation of its parameters are a little different, because we are no longer estimating the increase in a trait (per copy of the alternate allele), but instead estimating the increase in the log-odds of being a case. The simple logistic GWAS model can be specified as

$\log \left(\frac{\text{Pr}(Y = 1 | X = x)}{\text{Pr}(Y=0 | X=x)}\right)= \mu + x\beta$

where $\log$ is the natural logarithm function, and $\text{Pr}(Y=1|X=x)$ and $\text{Pr}(Y=0|X=x)$ represent the respective probabilities that an individual is a case ( $Y=1$ ) or control $(Y=0)$ , given a specific genotype $x$ . The parameter $\mu$ now represents the logarithm of odds ("log-odds") of being a case when carrying no alternate alleles (i.e., genotype $x=0$ ) and $\beta$ represents the increase in log-odds of being a case for each copy of the alternate allele.

As an example, if we run a GWAS for a disease, and for a specific variant we find a statistically significant (log-odds) estimate of $\beta=0.1$ , we can calculate the odds ratio as $e^{\beta} = e^{0.1} \approx 1.11$ which can be interpreted as each copy of the alternate allele increasing the odds of having the disease by approximately 11%.
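The conversion from a log-odds effect estimate to an odds ratio is a one-line calculation. The short sketch below reproduces the arithmetic of the example above; the helper name `odds_ratio` is hypothetical.

```python
import math

def odds_ratio(beta, copies=1):
    """Odds ratio implied by a log-odds effect size for a given number of allele copies."""
    return math.exp(beta * copies)

beta = 0.1
or_one_copy = odds_ratio(beta)              # odds ratio for one copy of the alternate allele
or_two_copies = odds_ratio(beta, copies=2)  # under the additive model, log-odds add per copy
percent_increase = (or_one_copy - 1.0) * 100.0
```

Because the model is additive on the log-odds scale, the odds ratio for two copies is the square of the odds ratio for one copy, not double it.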

Combining estimates from multiple studies

The model used in a single GWAS yields an effect estimate $\beta_i$ for a study $i$ . In a meta-analysis, we combine study-specific effect estimates for $N$ studies $(\beta_1,\beta_2,...,\beta_N)$ into a single overall effect estimate $\beta_{meta}$ and standard error estimate $\text{SE}_{meta}$ using

$\beta_{meta} = \frac{\sum_{i=1}^{N} w_i \beta_i}{\sum_{i=1}^{N} w_i}$

$\text{SE}_{meta} = \sqrt{\frac{1}{\sum_{i=1}^{N} w_i}}$

where $w_i$ is the weight (explained below) assigned to study $i$ for that variant. In simple terms, the meta-analysis effect size for a variant is calculated as the sum of each study's effect size (for that variant) multiplied by that study's weight for that variant, divided by the sum of that variant's weights across all included studies.

In a similar way to the individual GWAS effect estimates, P-values are found by first calculating the variant's $Z$ score as $Z_{meta} = \frac{\beta_{meta}}{\text{SE}_{meta}}$ , which follows a standard normal distribution under the null hypothesis of no real phenotype-variant association. A $Z$ test is then performed, which results in a P-value that represents the probability of seeing an effect estimate at least this extreme (relative to its standard error) by chance if the null hypothesis is true. Typical genome-wide significance thresholds of $P < 5 \times 10^{-8}$ (or stricter) can then be applied to find the statistically significant variants. For more information on interpretation of P-values, see the P-values page.
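As a minimal sketch of these formulas for a single variant, the Python below combines per-study estimates and derives the $Z$ score and two-sided P-value. It assumes inverse-variance weights ( $w_i = 1/\text{SE}_i^2$ ), for which the $\text{SE}_{meta}$ formula above holds; the function name `combine_estimates` and the three example studies are hypothetical.

```python
import math

def combine_estimates(betas, weights):
    """Weighted-average meta-analysis of one variant's effect estimates.

    Assumes inverse-variance weights, so that SE_meta = sqrt(1 / sum(w_i)).
    """
    total_weight = sum(weights)
    beta_meta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se_meta = math.sqrt(1.0 / total_weight)
    z_meta = beta_meta / se_meta
    # Two-sided P-value from the standard normal distribution
    p_value = math.erfc(abs(z_meta) / math.sqrt(2.0))
    return beta_meta, se_meta, z_meta, p_value

# Hypothetical effect estimates and standard errors from three studies
betas = [0.10, 0.14, 0.08]
ses = [0.02, 0.04, 0.05]
weights = [1.0 / se ** 2 for se in ses]

beta_meta, se_meta, z_meta, p_value = combine_estimates(betas, weights)
genome_wide_significant = p_value < 5e-8
```

In this toy example the most precise study (standard error 0.02) contributes most of the total weight, so the combined estimate lies close to its effect size, and the combined standard error is smaller than that of any single study.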

Fixed-effects versus random-effects models

There are various choices for how the weights $w_i$ are calculated, which will affect the contribution of each study's effect size estimate to the overall meta-analysis estimate. One important decision that affects the weighting is the choice between using a fixed-effects or random-effects model when using weights that incorporate the variance of a study's effect size estimate (e.g., weights that are calculated using $\text{SE}$ ).

The fixed-effects model assumes there is a single, common true effect size that underlies all studies (i.e., between-study variance, $\tau^2$ , is negligible). The model assumes that any variation, across the multiple studies, in a variant's effect estimate is due only to random or chance error within each study.

The random-effects model assumes that the true effect size varies from study to study (i.e., $\tau^2$ is non-zero), so that instead of a single true effect, there is a distribution of true effects. The model asserts that any observed variation of a variant's effect across different studies is due to both random error and statistical heterogeneity (systematic differences) between studies, such as phenotype definition, GWAS model, ancestry, etc. If there is no between-study heterogeneity, fixed- and random-effects models should provide the same results.

A random-effects model is generally more appropriate for GWAS meta-analysis because the fixed-effects assumption, that every study is estimating the same underlying true effect despite potential systematic differences between the studies, is rarely fulfilled. However, random-effects models are generally less statistically powerful than fixed-effects models.

Weighting scheme

When performing meta-analyses, weighting the effect estimates allows individual study estimates to influence the overall estimate based on their precision; studies with more precise estimates have a larger effect on the meta-analysis estimate. The most common weighting schemes for GWAS meta-analysis are inverse-variance weighting and sample size-based weighting.

Inverse-variance weighting uses the inverse of the effect estimate variance, with weights $w_i$ calculated as:

Fixed-effect: $w_i = \text{SE}_i^{-2} = \frac{1}{\text{SE}_i^2}$

Random-effect: $w_i = (\text{SE}_i^2 + \hat{\tau}^2)^{-1} = \frac{1}{\text{SE}_i^2 + \hat{\tau}^2}$

where $\text{SE}_i$ is that variant's effect size standard error for study $i$ and $\hat{\tau}^2$ is the estimated between-study variance. This method ensures that the studies that are more certain of their effect estimate $\beta$ (smaller $\text{SE}$ ) have a larger influence on the combined estimate.
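The sketch below illustrates how the two sets of inverse-variance weights differ for one variant. The text above does not specify how $\hat{\tau}^2$ is estimated; this example assumes the DerSimonian-Laird moment estimator, which is one common choice, and the function names and example values are hypothetical.

```python
import math

def dersimonian_laird_tau2(betas, ses):
    """Between-study variance estimate (DerSimonian-Laird; an assumed choice here)."""
    w = [1.0 / se ** 2 for se in ses]                  # fixed-effect weights
    sum_w = sum(w)
    beta_fixed = sum(wi * b for wi, b in zip(w, betas)) / sum_w
    q = sum(wi * (b - beta_fixed) ** 2 for wi, b in zip(w, betas))
    scale = sum_w - sum(wi ** 2 for wi in w) / sum_w
    if scale <= 0:                                     # e.g. only one study
        return 0.0
    return max(0.0, (q - (len(betas) - 1)) / scale)    # truncated at zero

def inverse_variance_meta(betas, ses, tau2=0.0):
    """Inverse-variance meta-analysis; tau2 = 0 gives the fixed-effect model."""
    w = [1.0 / (se ** 2 + tau2) for se in ses]
    sum_w = sum(w)
    beta_meta = sum(wi * b for wi, b in zip(w, betas)) / sum_w
    return beta_meta, math.sqrt(1.0 / sum_w)

# Hypothetical, deliberately heterogeneous estimates from three studies
betas = [0.05, 0.30, 0.10]
ses = [0.02, 0.04, 0.05]

tau2 = dersimonian_laird_tau2(betas, ses)
beta_fixed, se_fixed = inverse_variance_meta(betas, ses)
beta_random, se_random = inverse_variance_meta(betas, ses, tau2=tau2)
```

Adding $\hat{\tau}^2$ to every study's variance makes the weights more similar to each other and increases the combined standard error, which is why the random-effects estimate is typically less powerful. When the estimated $\hat{\tau}^2$ is zero, both calls return the same result.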

Sample size-based weighting is much simpler to implement and can be used in cases where the effect size estimate variance ( $\text{SE}^2$ ) is not provided, with weights set as the study's sample size $w_i = N_i$ (where $N_i$ here denotes the number of samples in study $i$ , not the number of studies). In practice, implementations such as METAL apply this scheme to per-study $Z$ scores rather than effect sizes, using weights proportional to $\sqrt{N_i}$ . This weighting scheme, however, assumes that larger sample sizes lead to more precise results and may give larger studies an outsized influence on the overall estimate, regardless of actual precision.
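One widely used form of sample size-based weighting, the scheme implemented in METAL, combines per-study $Z$ scores rather than effect sizes, using weights equal to the square root of each study's sample size. The sketch below shows that calculation for a single variant; the function name and the example values are hypothetical, and the $Z$ scores are assumed to already be aligned to the same effect allele.

```python
import math

def sample_size_weighted_z(z_scores, sample_sizes):
    """Combine per-study Z scores with weights sqrt(N_i)."""
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w ** 2 for w in weights))
    z_meta = numerator / denominator
    # Two-sided P-value from the standard normal distribution
    p_value = math.erfc(abs(z_meta) / math.sqrt(2.0))
    return z_meta, p_value

# Hypothetical Z scores and sample sizes from two studies
z_scores = [2.0, 3.0]
sample_sizes = [10_000, 40_000]

z_meta, p_value = sample_size_weighted_z(z_scores, sample_sizes)
```

This approach yields a combined $Z$ score and P-value but no combined effect size, which is one reason inverse-variance weighting is often preferred when standard errors are available.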

Other weighting schemes exist, such as imputation-quality weighting, where a variant's effect size is weighted by its imputation quality in each study so that effect sizes of better-imputed variants have a stronger influence, or allele-frequency based weighting (sometimes used in rare variant analyses), which can give more weight to estimates from studies where the variant is rarer.

Heterogeneity statistics

In addition to calculating a combined effect estimate, GWAS meta-analyses typically also provide heterogeneity statistics, which indicate how much a variant's effect size estimates vary between studies. A common choice is Cochran's $Q$ statistic, which is calculated (for each variant) as

$Q = \sum_{i=1}^N w_i (\beta_i - \beta_{meta})^2$

where $w_i$ and $\beta_i$ are the variant's weight and effect estimate, respectively, from study $i$ and $\beta_{meta}$ is the variant's effect estimate from the meta-analysis. In simple terms, the squared difference between each study's effect estimate and the meta-analysis estimate is weighted and summed across all studies; the higher the $Q$ for a variant, the more variable (heterogeneous) that variant's effects are across the included studies.

Under the null hypothesis that there is no heterogeneity between the different studies' effect size estimates, the $Q$ statistic follows a $\chi^2$ distribution with $N-1$ degrees of freedom, where $N$ is the number of studies. The associated P-value represents the probability that a $Q$ statistic of that size or larger would be seen by chance if this null hypothesis is true.

The statistical power of Cochran's $Q$ can be limited, particularly when only a few studies are combined, so some software also calculates an alternative $I^2$ statistic, calculated as

$I^2 = \max\left(0, \frac{Q - (N-1)}{Q}\right) \times 100\%$

The $I^2$ statistic attempts to capture the percentage of the total variation of a variant's effect size across studies that is due to true heterogeneity (differences inherent in the studies) rather than random variation. Values of $I^2$ can loosely be interpreted as low (0-30%), moderate (30-50%), substantial (50-80%) and considerable heterogeneity (80-100%).
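Both heterogeneity statistics can be computed directly from per-study summary statistics. The sketch below does this for one variant, assuming fixed-effect inverse-variance weights ( $w_i = 1/\text{SE}_i^2$ ) and setting negative $I^2$ values to zero, as is conventional; the function name and example values are hypothetical.

```python
def heterogeneity_stats(betas, ses):
    """Cochran's Q and I^2 (in percent) for one variant across studies."""
    weights = [1.0 / se ** 2 for se in ses]            # fixed-effect weights (assumed)
    beta_meta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - beta_meta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1                                # N - 1 degrees of freedom
    # I^2 is truncated at zero when Q is smaller than its degrees of freedom
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2

# Hypothetical, deliberately heterogeneous estimates from three studies
betas = [0.05, 0.30, 0.10]
ses = [0.02, 0.04, 0.05]

q, df, i2 = heterogeneity_stats(betas, ses)
```

For these example values $Q = 31.25$ with 2 degrees of freedom and $I^2 = 93.6\%$, which would fall into the "considerable" band described above. The P-value for $Q$ could then be obtained from the $\chi^2$ survival function, for example with `scipy.stats.chi2.sf(q, df)` if SciPy is available.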

GWAS effect size estimates will naturally vary across different studies due to many factors, including differences in GWAS models, population ancestries and structures, cohort ascertainment biases, phenotype definitions, sample sizes and the precision of effect estimates themselves. The $Q$ statistic (and its P-value) and the $I^2$ statistic are important, as they provide an indication of whether the heterogeneity is higher than expected and so whether a particular variant's effect estimates warrant further investigation to find the source of the heterogeneity.

Key considerations

Meta-analysis relies on the assumption that the studies being combined are homogeneous enough for the results to be comparable.

Phenotype definition: The phenotype (e.g., Type 2 Diabetes) must be defined and measured consistently across all studies. Differences in case/control ascertainment can introduce heterogeneity (differences in effect estimates not due to chance).
Ancestry: Combining cohorts of different genetic ancestries is common and necessary for generalizability, but it may introduce heterogeneity. Advanced methods can be used to account for this.
Statistical models: Most modern GWAS implement more complex linear mixed models (LMMs) than the simple linear regression above, which allows them to better correct for population structure and relatedness and reduce the number of false-positive associations. The above-described meta-analysis approaches are still applicable to effect size estimates produced by these more complex models, but care must be taken to ensure that the effect size $\beta$ in each contributing study is estimated using a comparable statistical model.
Heterogeneity of effect: Most GWAS meta-analysis software will also provide a heterogeneity statistic and P-value for each variant tested. For genetic variants identified as statistically significant in a meta-analysis, the statistical significance of the variant's heterogeneity statistic should also be considered; variants with significant heterogeneity warrant further investigation.
Weighting approach: The appropriate weighting method (such as inverse-variance weighting) and model (random-effects versus fixed-effects) should be selected based on the available data and summary statistics, and on whether the underlying model assumptions hold, as assessed through effect estimate heterogeneity.

FinnGen meta-analyses

Starting from data release 12, the FinnGen core team has been performing and releasing results of GWAS meta-analyses of FinnGen, UK Biobank (UKBB) and the Million Veteran Program (MVP) for phenotypes that can be matched between the cohorts. Both two-way FinnGen-UKBB and three-way FinnGen-UKBB-MVP meta-analyses are performed for each release.

Meta-analyses are performed using FinnGen's own in-house meta-analysis workflow, which is designed for Google's cloud-based computing environment, using a fixed-effects inverse-variance weighting scheme. More details on quality control, phenotype matching between studies and locations of results files can be found at the Meta-analyses page. Results can also be viewed at the relevant PheWeb browsers: FinnGen-UKBB and FinnGen-UKBB-MVP.

GWAS meta-analysis software

There are a number of software packages available to perform meta-analysis. The most common include:

METAL - old but popular; quick and easy to install and use; efficient for large datasets but with limited features; limited to fixed-effects models but can perform inverse-variance and sample-size weighting; a forked version supporting random effects is also available
GWAMA - can run fixed and random effects models, with additional tools for QC and visualisation of results
PLINK - less commonly used for meta-analysis but is easy to use, can run fixed and random effects models and has good QC features implemented
mtag - typically used for jointly analyzing multiple traits, but can also perform (non trans-ancestry) meta-analysis; the Python-based mama extends the mtag framework to handle trans-ancestry meta-analyses
rmeta R package - convenient if meta-analysis within the R environment is preferred
FINNGEN/META_ANALYSIS - FinnGen's own meta-analysis pipeline for use in the Google Cloud environment

Additional reading

Genome-wide Association Studies, Uffelmann et al. (2021), Nature Reviews Methods Primers
Meta-Analysis in Genome-Wide Association Studies, Zeggini & Ioannidis (2009), Pharmacogenomics
Random-Effects Model Aimed at Discovering Associations in Meta-Analysis of Genome-wide Association Studies, Han & Eskin (2011), AJHG
Material from Matti Pirinen's GWAS course, part of the Life Science Informatics MSc programme at the University of Helsinki:
