From 0c242a9913ea8dfe50bd3a8a6212678e9ef9354f Mon Sep 17 00:00:00 2001
From: Oceane <oceane.fourquet@outlook.fr>
Date: Wed, 8 Feb 2023 17:00:01 +0100
Subject: [PATCH] small changes readme

---
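Reviewer note (not part of the commit): the output-writing step this patch adds to `run_mem.py` can be tried in isolation. The following is a minimal sketch, assuming each entry of `pairs` is a tuple whose last two fields are dropped before writing, as `pair[:-2]` does in the patch. The helper name `write_pairs` and the sample tuples are hypothetical, and `os.path.join` is used here in place of the string `replace('//', '/')` from the patch.

```python
import os
import tempfile


def write_pairs(pairs, outdir):
    """Write each pair, minus its last two fields, on its own line of <outdir>/output.txt."""
    output_file = os.path.join(outdir, 'output.txt')
    # The context manager closes the file even if a write fails.
    with open(output_file, 'w') as f:
        for pair in pairs:
            f.write('{} \n'.format(pair[:-2]))
    return output_file


if __name__ == '__main__':
    # Hypothetical pairs: (feature_a, feature_b, extra_1, extra_2)
    sample = [('g1', 'g2', 0.1, 1), ('g3', 'g4', 0.2, -1)]
    with tempfile.TemporaryDirectory() as tmp:
        path = write_pairs(sample, tmp)
        with open(path) as f:
            print(f.read())
```

Joining the path with `os.path.join` avoids the doubled separator when `--outdir` ends with a slash, which is what the `replace` call in the patch works around.
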
 README.md  | 25 ++++++++++++++++++-------
 run_mem.py |  6 ++++++
 2 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 6f5a645..1055359 100644
--- a/README.md
+++ b/README.md
@@ -2,28 +2,39 @@
 
 Author: Océane FOURQUET
 
-This project aims to construct an ensemble model of bidimensional monotonic classifiers[1](https://link.springer.com/article/10.1007/s00453-012-9628-4). In addition to reimplementing an established approach[2](https://academic.oup.com/jid/article/217/11/1690/4911472?login=true) in python, it integrates a preselection of the pairs of features, reducing drastically the running time of the approach.
+This project constructs an ensemble model of bidimensional monotonic classifiers [1](https://link.springer.com/article/10.1007/s00453-012-9628-4). In addition to reimplementing an established approach [2](https://academic.oup.com/jid/article/217/11/1690/4911472?login=true) in Python, it preselects the pairs of features, which drastically reduces the running time of the approach.
 
-## Python and librairies versions
+## Python and Library Versions
 - python 3.9.1
 - pandas 1.2.3
 - numpy 1.19.2
 - matplotlib 3.3.4
 - multiprocessing
 
-This code uses the multiprocessing library to parallelise calculations. By default, the number of CPUs is set to 1, but it is possible to change this configuration with the parameters.
+This code uses the multiprocessing library to parallelize calculations. By default, the number of CPUs is set to 1; this can be changed with the optional parameters.
 
-## Run the code
+## How to Run the Code
+Your data must be prepared and preprocessed in advance: this code does not perform any preprocessing, normalization, etc.
 
 ### Data
-Data should be presented in csv format, in the form of samples per row and features per column, with a labels column for classes. For the moment, this code only allows to work on data with two classes.
+Data should be provided in CSV format, with one sample per row and one feature per column, plus a column for the class labels. For the moment, this code only supports data with two classes.
 
 ### Code
-To run the approach with the default parameters, just run the following command in your shell :
+To run the approach with the default parameters, just run the following command in your shell:
 ```python
 python3 run_mem.py <dataset>
 ```
-There are some optional parameters such as the number of cpus to use for calculations, the maximum number of pairs for the ensemble model, the name of the label column in the dataset file, etc. For more information, run the command:
+where `<dataset>` is the path to the dataset file.
+
+
+There are some optional parameters, such as the number of CPUs to use for the calculations, the maximum number of pairs for the ensemble model, and the name of the label column in the dataset file. By default, the outputs of the code are stored in the current directory, but this can be changed with the optional parameter `--outdir`. The output files comprise a text file with the final pair classifiers of the ensemble model, a PDF file with the ROC curve, and PDF files visualizing the final pair classifiers.
+
+For more information about parameters, run the command:
 ```python
 python3 run_mem.py -h
 ```
+
+
+## References
+- [1] Stout, Q.F. Isotonic Regression via Partitioning. Algorithmica 66, 93–112 (2013). https://doi.org/10.1007/s00453-012-9628-4
+- [2] Iryna Nikolayeva, Pierre Bost, Isabelle Casademont, Veasna Duong, Fanny Koeth, Matthieu Prot, Urszula Czerwinska, Sowath Ly, Kevin Bleakley, Tineke Cantaert, Philippe Dussart, Philippe Buchy, Etienne Simon-Lorière, Anavaj Sakuntabhai, Benno Schwikowski. A Blood RNA Signature Detecting Severe Disease in Young Dengue Patients at Hospital Arrival. The Journal of Infectious Diseases, Volume 217, Issue 11, Pages 1690–1698 (2018). https://doi.org/10.1093/infdis/jiy086
diff --git a/run_mem.py b/run_mem.py
index c23dbc4..96da666 100644
--- a/run_mem.py
+++ b/run_mem.py
@@ -88,6 +88,12 @@ def main():
     else:
         sr.show_results_neg(df, pairs, inputs.nbcpus, inputs.outdir, cm)
 
+    # Write the final pairs (without their last two fields), one per line.
+    output_file = inputs.outdir + '/output.txt'
+    with open(output_file.replace('//', '/'), 'w') as f:
+        for pair in pairs:
+            f.write('{} \n'.format(pair[:-2]))
+
 
 
 
-- 
GitLab