Commit 2403e42b authored by Yoann Dufresne

update readme regarding the ongoing work

parent ed49295c
# 10X-deconvolve
# Linked Reads molecule separation
Trying to deconvolve single tag assignment for multiple molecules
A compilation of scripts and pipelines to count and extract scaffolds of barcodes from linked reads datasets.
## Nomenclature warnings
While the scientific article was being written, some of the data structure names were changed.
Most of the names in this repository are still the old ones.
Here is a short list of equivalences:
- unit d-graph -> local clique pair
- udg -> lcp
- d²-graph (or d2-graph) -> lcp graph
- udg divergence -> lcp weight
- udg edge distance -> lcp edge weight
## Installation
@@ -15,13 +25,13 @@ Install the package from the root directory.
## Scripts
Most of the scripts use argparse.
To see how to use a script, run it with the -h command line option.
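As an illustration of why -h works on these scripts, here is a minimal sketch of an argparse-based command line. The script description, the `graph` argument and the `--threads` option are hypothetical and are not taken from this repository; only the automatic `-h`/`--help` behaviour is standard argparse.

```python
import argparse


def build_parser():
    # argparse registers -h/--help automatically on every parser.
    parser = argparse.ArgumentParser(
        description="Hypothetical example script (not part of this repository)."
    )
    # Hypothetical positional argument: a gexf graph file.
    parser.add_argument("graph", help="path to a gexf formatted graph")
    # Hypothetical optional argument with a default value.
    parser.add_argument("--threads", "-t", type=int, default=1,
                        help="number of threads to use")
    return parser


# Parse an explicit argument list instead of sys.argv for the demonstration.
parser = build_parser()
args = parser.parse_args(["my_graph.gexf", "-t", "4"])
print(args.graph, args.threads)

# The text shown by -h can also be obtained programmatically.
help_text = parser.format_help()
```

Running a real script with -h prints the same kind of help text and exits.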
### Data simulation
* generate_fake_molecule_graph.py: Create a linear molecule graph, where the molecules are linked to the d molecules on their left and d molecules on their right.
* generate_fake_barcode_graph.py: Take a molecule graph as input (gexf formatted) and output a barcode graph. The barcode graph is created by fusing nodes of the molecule graph.
* use the snakefile "Snakemake_data_simu".
Each parameter can be an integer or a list of integers.
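The two simulation steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code of the scripts: the function names are hypothetical, graphs are stored as dictionaries of neighbour sets rather than gexf files, and molecules are assigned to barcodes by a seeded shuffle with exactly `m` molecules per barcode (the real scripts may draw group sizes differently, for example using the deviation parameter).

```python
import random


def linear_molecule_graph(n, d):
    """Link each of n molecules to the d molecules on its left and right."""
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        # Linking to the (up to) d molecules on the right is enough:
        # the symmetric insertion covers the left side.
        for j in range(i + 1, min(i + d, n - 1) + 1):
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


def fuse_molecules(adjacency, m, seed=0):
    """Fuse molecules into barcodes (assumption: m molecules per barcode)."""
    molecules = sorted(adjacency)
    random.Random(seed).shuffle(molecules)
    # Consecutive molecules in the shuffled order share a barcode.
    barcode_of = {mol: idx // m for idx, mol in enumerate(molecules)}
    barcode_graph = {b: set() for b in set(barcode_of.values())}
    for mol, neighbours in adjacency.items():
        for other in neighbours:
            a, b = barcode_of[mol], barcode_of[other]
            if a != b:  # no self loops on a barcode
                barcode_graph[a].add(b)
                barcode_graph[b].add(a)
    return barcode_of, barcode_graph


molecule_graph = linear_molecule_graph(6, 2)
barcode_of, barcode_graph = fuse_molecules(molecule_graph, m=2)
```

Two barcodes end up connected as soon as at least one pair of their molecules is connected in the molecule graph.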
@@ -43,20 +53,6 @@ Config parameters:
* to_d2_graph.py: Load a barcode graph into memory and create a d2 graph from it.
* evaluate.py: Take a d2 graph gexf file and analyse it. Look for an approximation of the longest correct path to reconstruct a molecule graph. Take as input a d2 graph where the truth is known in the node names (the format used to create fake data).
* analyse_d2_tsv.py: Take a tsv optimization file of a d2 graph and check the coverage of the variables. Output the missing variables (if any).
## Run the tests
```bash
export PYTHONPATH=deconvolution/
pytest tests
export PYTHONPATH=
```
* d2_to_path.py: take a d2 graph as input and explore the nodes to extract a udg path.
## Tests for Cedric
```bash
snakemake -s Snakefile_data_simu --config n=10000 m=[4,6,8,10,12] m_dev=[0,0.5,1,2,3]
snakemake -s Snakefile_d2 --config input=[snake_exec/simu_bar_n10000_d5_m10-dev0.5.gexf,snake_exec/simu_bar_n10000_d5_m10-dev0.gexf,snake_exec/simu_bar_n10000_d5_m10-dev1.gexf,snake_exec/simu_bar_n10000_d5_m10-dev2.gexf,snake_exec/simu_bar_n10000_d5_m10-dev3.gexf,snake_exec/simu_bar_n10000_d5_m12-dev0.5.gexf,snake_exec/simu_bar_n10000_d5_m12-dev0.gexf,snake_exec/simu_bar_n10000_d5_m12-dev1.gexf,snake_exec/simu_bar_n10000_d5_m12-dev2.gexf,snake_exec/simu_bar_n10000_d5_m12-dev3.gexf,snake_exec/simu_bar_n10000_d5_m4-dev0.5.gexf,snake_exec/simu_bar_n10000_d5_m4-dev0.gexf,snake_exec/simu_bar_n10000_d5_m4-dev1.gexf,snake_exec/simu_bar_n10000_d5_m4-dev2.gexf,snake_exec/simu_bar_n10000_d5_m4-dev3.gexf,snake_exec/simu_bar_n10000_d5_m6-dev0.5.gexf,snake_exec/simu_bar_n10000_d5_m6-dev0.gexf,snake_exec/simu_bar_n10000_d5_m6-dev1.gexf,snake_exec/simu_bar_n10000_d5_m6-dev2.gexf,snake_exec/simu_bar_n10000_d5_m6-dev3.gexf,snake_exec/simu_bar_n10000_d5_m8-dev0.5.gexf,snake_exec/simu_bar_n10000_d5_m8-dev0.gexf,snake_exec/simu_bar_n10000_d5_m8-dev1.gexf,snake_exec/simu_bar_n10000_d5_m8-dev2.gexf,snake_exec/simu_bar_n10000_d5_m8-dev3.gexf]
```
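The long `input=[...]` list above follows a regular pattern, so it can be generated rather than typed by hand. The sketch below is an assumption based only on the file names visible in the command (`simu_bar_n{n}_d{d}_m{m}-dev{m_dev}.gexf` in `snake_exec/`, with `d=5`); the helper names are hypothetical and the snakefiles themselves may build these names differently.

```python
from itertools import product


def as_list(value):
    """Accept either a single value or a list of values."""
    return list(value) if isinstance(value, (list, tuple)) else [value]


def simulated_graph_paths(n, d, m, m_dev, directory="snake_exec"):
    """Build one path per parameter combination (pattern inferred above)."""
    return [
        f"{directory}/simu_bar_n{n_}_d{d_}_m{m_}-dev{dev}.gexf"
        for n_, d_, m_, dev in product(
            as_list(n), as_list(d), as_list(m), as_list(m_dev)
        )
    ]


paths = simulated_graph_paths(10000, 5, [4, 6, 8, 10, 12], [0, 0.5, 1, 2, 3])
# Assemble the second command with the generated list.
command = "snakemake -s Snakefile_d2 --config input=[" + ",".join(paths) + "]"
print(command)
```

The order of the generated files differs from the command above, which should not matter as long as the same files are listed.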