
VarExp

The R package VarExp provides functions for estimating the percentage of phenotypic variance explained by genetic effects, by interaction effects, or jointly by both. This suite of functions is useful for meta-analysis designs where pooling individual genotype data is challenging. A pre-print article related to this work is available.

Prerequisite

The R package RCurl is required to run VarExp.
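
As a convenience, the prerequisite can be checked from within R before installing VarExp. The snippet below is a minimal sketch: `missing_packages` is a hypothetical helper written for this example, not a function of VarExp, and it only reports what is missing rather than installing anything.

```r
# Hypothetical helper (not part of VarExp): return the names of the
# packages in `pkgs` that cannot be loaded on this machine.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages("RCurl")
if (length(to_install) > 0) {
  message("Missing package(s): ", paste(to_install, collapse = ", "),
          ". Install with install.packages().")
}
```

If the message appears, running `install.packages("RCurl")` should resolve it on a standard CRAN setup.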

Installation

For now, VarExp can only be installed from source, either from the package tarball or from the GitLab repository. To install from the tarball, first set your R working directory to the location of VarExp_0.1.0.tar.gz. In R, type:

# From the source tarball
install.packages("VarExp_0.1.0.tar.gz", repos = NULL, type = "source")

# From the GitLab repository (requires the devtools package)
devtools::install_git("https://gitlab.pasteur.fr/statistical-genetics/VarExp.git")

Input format

Two input files are required.

  • A file providing the meta-analysis results with the following mandatory columns:
    • the rs identifier of the variant
    • the chromosome on which the variant is located
    • the physical position of the variant (currently in NCBI Build 37)
    • the tested allele of the variant (A0)
    • the frequency of the tested allele A0
    • the regression coefficient of the main genetic effect
    • the regression coefficient of the interaction effect
##        RSID CHR     POS A0 FREQ_A0  MAIN_EFFECT  INT_EFFECT
##  rs72900467   1  989500  A 0.05558 -0.282895628  0.11487230
##  rs34372380   1 1305201  T 0.11205 -0.003162676  0.01704444
##   rs4422949   1  834928  G 0.21753 -0.133573045 -0.11129018
##   rs9442366   1 1009234  T 0.42201  0.121852094 -0.09421119
##  rs61768199   1  781845  G 0.09736 -0.017142010  0.02977832
##   rs9439462   1 1462766  T 0.04784  0.206595425  0.06823945
##    rs307370   1 1273278  A 0.16546  0.052140346 -0.01852352
##  rs11807848   1 1061166  C 0.39556  0.169484484  0.03845663
##   rs7538305   1  824398  C 0.15379  0.054950590 -0.04494799
##  rs28613513   1 1112810  T 0.05358 -0.001334013  0.10294423
  • A file providing the summary statistics for the outcome and the exposure in each individual cohort included in the meta-analysis. Mandatory columns of this file are:
    • the identifier of the cohort
    • the sample size of the cohort
    • the phenotype mean in the cohort
    • the standard deviation of the phenotype in the cohort
    • the exposure mean in the cohort
    • the standard deviation of the exposure in the cohort
##  Cohort PHENO_N PHENO_Mean PHENO_SD EXPO_Mean  EXPO_SD
##       1   10000   1.297265 3.097524  2.002715 1.250979
##       2   10000   1.288332 3.152367  2.009427 1.242574
##       3   10000   1.390218 3.109720  1.995473 1.258670
##       4   10000   1.342020 3.151429  1.999943 1.256718
##       5   10000   1.385564 3.153274  2.002401 1.235129
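
Before running the analysis, it can help to confirm that both tables carry the expected columns. The sketch below assumes the column names shown in the two example tables above; whether VarExp requires exactly these names is an assumption, and `check_columns` is a hypothetical helper, not a package function. In practice the data frames would be read from your own files (for example with `read.table(..., header = TRUE)`) instead of being typed in.

```r
# Hypothetical helper (not part of VarExp): stop with an informative
# message if any required column is absent from the data frame.
check_columns <- function(df, required, label) {
  missing_cols <- setdiff(required, colnames(df))
  if (length(missing_cols) > 0) {
    stop(label, " is missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Column names taken from the example tables (assumed to be the expected ones)
gwas_cols   <- c("RSID", "CHR", "POS", "A0", "FREQ_A0", "MAIN_EFFECT", "INT_EFFECT")
cohort_cols <- c("Cohort", "PHENO_N", "PHENO_Mean", "PHENO_SD", "EXPO_Mean", "EXPO_SD")

# First two rows of each example table
gwas <- data.frame(
  RSID = c("rs72900467", "rs34372380"),
  CHR = c(1, 1),
  POS = c(989500, 1305201),
  A0 = c("A", "T"),
  FREQ_A0 = c(0.05558, 0.11205),
  MAIN_EFFECT = c(-0.282895628, -0.003162676),
  INT_EFFECT = c(0.11487230, 0.01704444)
)
cohort <- data.frame(
  Cohort = c(1, 2),
  PHENO_N = c(10000, 10000),
  PHENO_Mean = c(1.297265, 1.288332),
  PHENO_SD = c(3.097524, 3.152367),
  EXPO_Mean = c(2.002715, 2.009427),
  EXPO_SD = c(1.250979, 1.242574)
)

check_columns(gwas, gwas_cols, "GWAS file")
check_columns(cohort, cohort_cols, "Cohort file")
```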

Note that for a binary exposure, the last two columns can be replaced by a single column providing the count of exposed individuals in each cohort.
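
To see why a count is sufficient for a binary exposure: for a 0/1 variable with a proportion p of exposed individuals, the mean is p and the (population) standard deviation is sqrt(p(1 - p)). The sketch below only illustrates this arithmetic. How VarExp itself processes the count column is not described here, `binary_exposure_summary` is a hypothetical helper, and the counts are made up for the example.

```r
# Hypothetical helper (not part of VarExp): mean and population SD of a
# 0/1 exposure, computed from the number of exposed individuals.
binary_exposure_summary <- function(n_exposed, n_total) {
  p <- n_exposed / n_total
  data.frame(EXPO_Mean = p, EXPO_SD = sqrt(p * (1 - p)))
}

# Illustrative counts for two cohorts of 10000 individuals each
binary_exposure_summary(n_exposed = c(2500, 4000), n_total = c(10000, 10000))
```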

Short tutorial

Data used in this tutorial are included in the VarExp package.

# Load the package
library(VarExp)

# Load the meta-analysis summary statistics file
data(GWAS)

# Load the cohort description file
data(COHORT)

# Compute the genotype correlation matrix from the reference panel
C <- getGenoCorMatrix(GWAS$RSID, GWAS$CHR, GWAS$POS, GWAS$A0, "EUR", pruning = FALSE)

# Make sure SNPs in the GWAS data and in the correlation matrix match
# Necessary if pruning = TRUE, otherwise should have no effect
GWAS <- checkInput(GWAS, colnames(C))

# Retrieve mean and variance of the exposure and the phenotype
# from individual cohort summary statistics
parsY <- calculateParamsFromIndParams(COHORT$PHENO_N, COHORT$PHENO_Mean, COHORT$PHENO_SD)
parsE <- calculateParamsFromIndParams(COHORT$PHENO_N, COHORT$EXPO_Mean, COHORT$EXPO_SD)

# Re-scale effect sizes as if estimated in a standardized model
std_betaG <- standardizeBeta(GWAS$MAIN_EFFECT, GWAS$INT_EFFECT, GWAS$FREQ_A0, parsE[1], parsE[2], type = "G")
std_betaI <- standardizeBeta(GWAS$MAIN_EFFECT, GWAS$INT_EFFECT, GWAS$FREQ_A0, parsE[1], parsE[2], type = "I")

# Estimation of the fraction of variance explained
fracG    <- calculateVarExp(std_betaG, std_betaI, C, parsY[2], sum(COHORT$PHENO_N), "G")
fracI    <- calculateVarExp(std_betaG, std_betaI, C, parsY[2], sum(COHORT$PHENO_N), "I")
fracJ    <- calculateVarExp(std_betaG, std_betaI, C, parsY[2], sum(COHORT$PHENO_N), "J")
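
The tutorial derives an overall mean and variance from per-cohort summary statistics. One standard way to pool such statistics is sketched below, for illustration only: it is not necessarily what `calculateParamsFromIndParams` does internally, and `pool_mean_var` is a hypothetical name. The pooled variance adds the within-cohort sums of squares to the between-cohort spread of the means, which recovers exactly the variance that would be obtained from the stacked individual-level data.

```r
# Hypothetical helper (not part of VarExp): pool per-cohort sample sizes (n),
# means (m) and standard deviations (s) into an overall mean and variance.
pool_mean_var <- function(n, m, s) {
  N <- sum(n)
  pooled_mean <- sum(n * m) / N
  # within-cohort sum of squares + between-cohort sum of squares
  ss <- sum((n - 1) * s^2 + n * (m - pooled_mean)^2)
  c(mean = pooled_mean, var = ss / (N - 1))
}

# Phenotype summary statistics from the example cohort table
n <- c(10000, 10000, 10000, 10000, 10000)
m <- c(1.297265, 1.288332, 1.390218, 1.342020, 1.385564)
s <- c(3.097524, 3.152367, 3.109720, 3.151429, 3.153274)

pooled <- pool_mean_var(n, m, s)
pooled
```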

Bug report / Help

Please open an issue if you find a bug.

Code of conduct

Please note that this project is released with a Contributor Code of Conduct. By participating in this project, you agree to abide by its terms.

License

This project is licensed under the MIT License; see the LICENSE.md file for details.