GitLab repository for the MANOCCA project
Tool used to compute the covariance test on a set of predictors, presented in this paper: https://academic.oup.com/bib/article/25/4/bbae272/7690346
MANOCCA (Multivariate ANalysis of Conditional CovAriance) performs the covariance test :
One application of the MANOCCA test, the detection of co-abundant human gut microbiome communities, can be found in this preprint: https://www.medrxiv.org/content/10.1101/2024.04.30.24306630v1.full
Note : If all outcomes are centered and scaled, then
Author : Christophe BOETTO
Archived commit pinned for the article https://academic.oup.com/bib/article/25/4/bbae272/7690346 in case the tool evolves in the future: 6d76df07
To go to this specific commit, run:
- git clone <manocca_https_link>
- cd manocca
- git reset --hard 6d76df07
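If you want to confirm from Python that your working copy really sits on the pinned commit, a minimal sketch could look like the following. The helper names `matches_pinned` and `current_commit` are hypothetical (they are not part of MANOCCA); the sketch only assumes that `git` is available on your PATH.

```python
import subprocess

PINNED_COMMIT = "6d76df07"  # abbreviated hash quoted in this README


def matches_pinned(full_hash, pinned=PINNED_COMMIT):
    """Return True if a commit hash begins with the abbreviated pinned hash."""
    return full_hash.strip().lower().startswith(pinned.lower())


def current_commit(repo_dir="."):
    """Return the full hash of HEAD for the repository at repo_dir (needs git)."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Example usage, run from inside the cloned manocca directory:
# print(matches_pinned(current_commit()))
```

The comparison is kept separate from the `git` call so that it can be checked without a repository at hand.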
Contents
The current repository contains the Python and R versions of MANOCCA. The Python version is more detailed than the R version, but both provide the same test. In the Python version you will find:
- MANOCCA - the multivariate test on covariance
- MANOVA - the multivariate test on mean
- Explainer - a tool to help explain the results of a MANOCCA test. You can for instance use it to check the power with regard to the number of PCs kept, or the main loadings of some PCs.
- the rest is mostly functionalities under development.
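As a rough intuition for what a "test on covariance" looks at, here is a toy sketch in plain Python. It is not the MANOCCA implementation and uses none of its API: the functions `standardize`, `pearson` and `simulate`, as well as the simulated effect size, are hypothetical choices made for illustration. It relies only on the fact that, for centered and scaled outcomes, the mean of their element-wise product equals their correlation, so an association between that product and a predictor suggests that the correlation changes with the predictor.

```python
import math
import random


def standardize(values):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def pearson(a, b):
    """Pearson correlation as the mean product of standardized values."""
    za, zb = standardize(a), standardize(b)
    return sum(x * y for x, y in zip(za, zb)) / len(za)


def simulate(n=2000, effect=0.5, seed=0):
    """Two outcomes whose correlation is `effect` when x == 1 and 0 when x == 0."""
    rng = random.Random(seed)
    x = [rng.choice([0, 1]) for _ in range(n)]
    y1, y2 = [], []
    for xi in x:
        rho = effect * xi
        a = rng.gauss(0, 1)
        b = rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        y1.append(a)
        y2.append(b)
    return x, y1, y2


x, y1, y2 = simulate()
z1, z2 = standardize(y1), standardize(y2)

# Element-wise product of the standardized outcomes: its mean is their correlation.
products = [a * b for a, b in zip(z1, z2)]
overall_correlation = sum(products) / len(products)

# If the correlation depends on x, the product itself is associated with x.
association_with_x = pearson(products, x)
print(overall_correlation, association_with_x)
```

This toy example handles a single pair of outcomes and reports a plain correlation rather than a p-value; it is meant only to show which quantity a covariance test is sensitive to.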
Python Installation
Conda install (recommended)
Start by moving to the python directory of the manocca project:
cd /PATH/TO/manocca/python
Then create the conda environment from environment.yml and activate it:
conda env create -f environment.yml -n py310_manocca
conda activate py310_manocca
Then install MANOCCA using the setup file :
pip install .
At this point the MANOCCA example script (Example_manocca_script.py) should run:
python Example_manocca_script.py
It should return a p-value for a randomly generated multivariate outcome and predictor.
If you want to run the more detailed Jupyter notebook example, you need ipykernel:
conda install ipykernel
python -m ipykernel install --user --name py310_manocca --display-name "py310_manocca"
Then, depending on your setup: in VS Code, you can open the notebook and select py310_manocca as the Python environment for the kernel; otherwise, you can launch Jupyter Notebook manually. Typically (e.g. on a cluster) Jupyter Notebook is installed in the base environment, so first exit the py310_manocca environment (conda deactivate takes no environment name):
conda deactivate
then launch Jupyter:
jupyter notebook
Note : On a cluster if you can't find py310_manocca you might need to reset the running cluster node to have it detect the new environment.
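To check from inside a notebook (or a script) that the kernel is really running on the py310_manocca interpreter, one option is to inspect the interpreter prefix. The sketch below assumes the usual conda layout, in which a named environment lives in a directory carrying the environment's name; `env_name_from_prefix` is a hypothetical helper, not part of MANOCCA.

```python
import sys
from pathlib import Path


def env_name_from_prefix(prefix=sys.prefix):
    """Return the last path component of the interpreter prefix.

    For a named conda environment this is the environment name
    (assumption: default conda layout, e.g. <conda root>/envs/<name>).
    """
    return Path(prefix).name


print("Interpreter:", sys.executable)
print("Environment directory name:", env_name_from_prefix())
```

If the printed name is not py310_manocca, the notebook is probably using a different kernel than the one registered above.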
Pip install
You can also install each library manually with pip. You will need the following ones:
pandas
numpy
scipy
scikit-learn
tqdm
joblib
matplotlib
Then to run the notebook you will also need :
ipykernel
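After a manual install, a quick way to see whether anything is still missing is to ask Python which of the libraries listed above it can locate. This is a minimal sketch using only the standard library; `missing_packages` and the two dictionaries are hypothetical names, and the only non-obvious mapping is that scikit-learn is imported as `sklearn`.

```python
import importlib.util

# pip name -> import name for the libraries listed above
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scipy": "scipy",
    "scikit-learn": "sklearn",  # installed as scikit-learn, imported as sklearn
    "tqdm": "tqdm",
    "joblib": "joblib",
    "matplotlib": "matplotlib",
}

# Only needed to run the notebook
NOTEBOOK_ONLY = {"ipykernel": "ipykernel"}


def missing_packages(required):
    """Return the pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


print("Missing for MANOCCA:", missing_packages(REQUIRED))
print("Missing for the notebook:", missing_packages(NOTEBOOK_ONLY))
```

An empty list means every package in that group can be found; this only checks that a package is present, not its version.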
Docker
In the python directory you will find a Dockerfile, which can be built and run interactively. Follow these steps:
- Download the MANOCCA repository, and go to the python directory
- Launch your Docker engine (e.g. launch the Docker application)
- Then run the following commands in your terminal (or command prompt on Windows) while in the MANOCCA/python directory:
- docker build -t manocca_docker .
- docker run -it manocca_docker
This gives you an interactive terminal with the right environment to run MANOCCA. If you want to use your own data, add a directory such as MANOCCA/python/your_data_directory before building: its contents are copied into the image by the docker build command and can then be accessed from the interactive container. If your dataset is too big for this, consult the Docker documentation on accessing host data from an interactive container. Remark: Jupyter Notebook can be tricky to run from a container, so use the script Example_manocca_script.py to get started.
Getting started
You can start by taking a look at some examples in the Python repository :
- Jupyter notebook Example_manocca.ipynb
- Python script : Example_manocca_script.py
Steps to launch the Jupyter notebook:
- open a terminal (or the command prompt on Windows)
- go to the MANOCCA/python/ directory
- activate the virtual environment as described above
- run: jupyter notebook
- wait for your web browser to open, or paste the link shown in the terminal into your browser
- open: Example_manocca.ipynb
R Installation
Run the following commands in your R terminal to install the required libraries :
install.packages("clusterGeneration")
install.packages("mvtnorm")
install.packages("RNOmni")
Once these are installed, the R script should run without errors, and you can use:
manocca(Y, X, C, n_comp)
To test