
manocca


    GitLab repository for the MANOCCA project

    Tool used to compute the covariance test on a set of predictors, presented in this paper: https://academic.oup.com/bib/article/25/4/bbae272/7690346

    MANOCCA (Multivariate ANalysis of Conditional CovAriance) performs the covariance test:

    $Cov(Y_1, Y_2) \sim X + C$
    as well as the multivariate version with more than two Y outcomes,
    $Cov(Y) \sim X + C$.
    In short, MANOCCA is a test of differences in correlation/covariance with regard to a predictor. The test is orthogonal to mean and variance effects and only captures covariance effects.
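
    To make the idea of a covariance effect concrete, here is a minimal sketch of the intuition for two outcomes. It is not the package's code: the simulated data, the `standardize` helper and the use of a simple linear regression are all assumptions made for illustration. The correlation between the two outcomes depends on a binary predictor while their means and variances do not, and the per-sample product of the standardized outcomes is then regressed on the predictor.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 2000

# Binary predictor; the correlation between y1 and y2 is 0.6 when x == 1, else 0.
x = rng.binomial(1, 0.5, size=n)
rho = np.where(x == 1, 0.6, 0.0)

# Both outcomes keep mean 0 and variance 1 whatever the value of x,
# so only their covariance changes with the predictor.
y1 = rng.standard_normal(n)
y2 = rho * y1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)


def standardize(v):
    """Center and scale a 1-D array (illustrative helper)."""
    return (v - v.mean()) / v.std()


# The product of standardized outcomes is a per-sample contribution to the
# correlation; its mean shifts with x when the correlation depends on x.
product = standardize(y1) * standardize(y2)
fit = stats.linregress(x, product)
print(f"slope = {fit.slope:.3f}, p-value = {fit.pvalue:.2e}")
```

    A t-test on the means of `y1` or `y2` across the two groups would not be expected to pick up this effect, since only the covariance differs between groups.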

    One application of the MANOCCA test, detecting co-abundant human gut microbiome communities, can be found in this preprint: https://www.medrxiv.org/content/10.1101/2024.04.30.24306630v1.full

    Note: If all outcomes are centered and scaled, then

    $Corr(Y_1, Y_2) = Cov(Y_1, Y_2)$
    and MANOCCA can test for changes in correlation. The subtleties of covariance vs correlation are discussed in the main draft and supplements of the paper above.
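
    The identity in the note can be checked numerically. The short sketch below uses simulated data (an assumption, unrelated to the package) and verifies that the covariance matrix of centered and scaled outcomes equals the correlation matrix of the raw outcomes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Three correlated outcomes built from independent normals and a mixing matrix.
mixing = np.array([[1.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 1.0]])
Y = rng.standard_normal((500, 3)) @ mixing

# Center and scale each column; ddof=1 matches the default normalization of np.cov.
Y_std = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)

cov_of_standardized = np.cov(Y_std, rowvar=False)
corr_of_raw = np.corrcoef(Y, rowvar=False)

print(np.allclose(cov_of_standardized, corr_of_raw))
```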

    Author : Christophe BOETTO

    Archived commit pinned for the article https://academic.oup.com/bib/article/25/4/bbae272/7690346 in case the tool evolves in the future: 6d76df07. To go to this specific commit, run:

    • git clone <manocca_https_link>
    • cd manocca
    • git reset --hard 6d76df07

    Contents

    The current repository contains the Python and R versions of MANOCCA. The Python version is more detailed than the R version, but both provide the same test. In the Python version you will find:

    • MANOCCA - the multivariate test on covariance
    • MANOVA - the multivariate test on mean
    • Explainer - a tool to help explain the results of a MANOCCA test. For instance, you can use it to check the power with regard to the number of PCs kept, or the main loadings of some PCs.
    • The rest is mostly functionality under development.
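
    The role of the number of PCs kept can be illustrated with a rough sketch. It does not use the package: the simulated data, the helper names `pca_scores` and `joint_pvalue`, and the choice of joint test are assumptions. The sketch builds all pairwise products of standardized outcomes, reduces them with a PCA (computed here via SVD), and tests the kept PCs jointly against the predictor with an F-test of the predictor regressed on the PCs, which for a single predictor is a standard equivalent of a one-predictor MANOVA. The package's own statistic may differ.

```python
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, p = 1500, 5

# Binary predictor; all outcome pairs have correlation 0.4 when x == 1, else 0.
x = rng.binomial(1, 0.5, size=n).astype(float)
rho = 0.4 * x
latent = rng.standard_normal(n)
noise = rng.standard_normal((n, p))
Y = np.sqrt(rho)[:, None] * latent[:, None] + np.sqrt(1.0 - rho)[:, None] * noise
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)

# One column per pair of outcomes: p * (p - 1) / 2 = 10 product columns.
pairs = list(combinations(range(p), 2))
products = np.column_stack([Y[:, i] * Y[:, j] for i, j in pairs])


def pca_scores(matrix, n_comp):
    """Scores of the first n_comp principal components, computed via SVD."""
    centered = matrix - matrix.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_comp] * s[:n_comp]


def joint_pvalue(pcs, predictor):
    """F-test of the predictor regressed on the PCs (intercept included)."""
    n_obs, k = pcs.shape
    design = np.column_stack([np.ones(n_obs), pcs])
    beta, *_ = np.linalg.lstsq(design, predictor, rcond=None)
    resid = predictor - design @ beta
    rss = resid @ resid
    tss = ((predictor - predictor.mean()) ** 2).sum()
    f_stat = ((tss - rss) / k) / (rss / (n_obs - k - 1))
    return stats.f.sf(f_stat, k, n_obs - k - 1)


# Compare the joint p-value for different numbers of PCs kept.
pvalues = {}
for n_comp in (1, 3, 10):
    pvalues[n_comp] = joint_pvalue(pca_scores(products, n_comp), x)
    print(n_comp, pvalues[n_comp])
```

    Keeping too few PCs can discard the components that carry the signal, while keeping too many adds degrees of freedom, so the p-value generally depends on `n_comp`.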

    Python Installation

    Conda install (recommended)

    Start by moving to the python directory of the manocca project:

    cd /PATH/TO/manocca/python

    Then create the conda environment from environment.yml and activate it:

    conda env create -f environment.yml -n py310_manocca
    conda activate py310_manocca

    Then install MANOCCA using the setup file :

    pip install .

    At this point, the MANOCCA example script (Example_manocca_script.py) should run:

    python Example_manocca_script.py

    It should return a p-value for a randomly generated multivariate outcome and predictor.

    If you want to run the more detailed Jupyter notebook example, you need ipykernel:

    conda install ipykernel
    python -m ipykernel install --user --name py310_manocca --display-name "py310_manocca"

    Then, depending on your setup: in VS Code, you can open the notebook and select py310_manocca as the Python environment for the kernel; otherwise, you can launch Jupyter Notebook manually. Typically (e.g. on a cluster) Jupyter Notebook is installed in the base environment, so first exit the py310_manocca environment (conda deactivate takes no environment name):

    conda deactivate

    Then launch Jupyter:

    jupyter notebook

    Note: On a cluster, if py310_manocca does not appear in the list of kernels, you might need to reset the running cluster node so that it detects the new environment.

    Pip install

    You can also install each library manually with pip. You will need the following:

    pandas
    numpy
    scipy
    scikit-learn
    tqdm
    joblib
    matplotlib

    Then to run the notebook you will also need :

    ipykernel
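
    After a manual install, a quick check that every library is importable can save some debugging. The snippet below is a small convenience sketch, not part of the repository; the `missing_packages` helper is hypothetical. Note that scikit-learn is imported under the module name `sklearn`.

```python
import importlib.util

# pip package name -> module name used at import time
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scipy": "scipy",
    "scikit-learn": "sklearn",
    "tqdm": "tqdm",
    "joblib": "joblib",
    "matplotlib": "matplotlib",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


missing = missing_packages()
if missing:
    print("Missing libraries, install with: pip install " + " ".join(missing))
else:
    print("All required libraries are importable.")
```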

    Docker

    In the python directory you will find a Dockerfile, which can be built and run interactively. Follow these steps:

    • Download the MANOCCA repository and go to the python directory
    • Launch your Docker engine (e.g. launch the Docker application)
    • Then run the following commands in your terminal (or command prompt on Windows) while in the MANOCCA/python directory:
    • docker build -t manocca_docker .
    • docker run -it manocca_docker

    This gives you an interactive terminal with the right environment to run MANOCCA. If you want to use your own data, you can add a directory at MANOCCA/python/your_data_directory; the data will be copied by the docker build command, and you can access it from within the interactive container. If your dataset is too big, consider looking at the Docker documentation on accessing data from an interactive container.

    Remark: Jupyter Notebook might be tricky to run from a container, so use the script Example_manocca_script.py to get started.

    Getting started

    You can start by taking a look at the examples in the python directory:

    • Jupyter notebook Example_manocca.ipynb
    • Python script : Example_manocca_script.py

    Steps to launch the Jupyter notebook:

    • open a terminal (or command prompt on Windows)
    • go to the MANOCCA/python/ directory
    • activate the virtual environment as described above
    • run: jupyter notebook
    • wait for your web browser to open, or paste the link shown in the terminal into your browser
    • open: Example_manocca.ipynb

    R Installation

    Run the following commands in your R terminal to install the required libraries :

    install.packages("clusterGeneration")
    install.packages("mvtnorm")
    install.packages("RNOmni")

    The script should then load without errors, and you can use:

    manocca(Y, X, C, n_comp)

    to test

    $Cov(Y) \sim X + C$
    while keeping n_comp principal components in the analysis.