# Metagenedb

[![pipeline status](https://gitlab.pasteur.fr/metagenomics/metagenedb/badges/master/pipeline.svg)](https://gitlab.pasteur.fr/metagenomics/metagenedb/commits/dev)
[![coverage report](https://gitlab.pasteur.fr/metagenomics/metagenedb/badges/master/coverage.svg)](https://gitlab.pasteur.fr/metagenomics/metagenedb/commits/dev)

Django-based project to build a gene catalog, together with tools
to explore it and contact external services.

----

## Setup the services on your local machine

### Dependencies

The application depends on several services that run independently in Docker containers, all
orchestrated by `docker-compose`.

Therefore, to run the application, you need:

* `Docker` : [Install instructions](https://docs.docker.com/install/)
* `Docker Compose` : [Install instructions](https://docs.docker.com/compose/install/)

### Configuration

For `docker-compose`, you need to create a `.env` file (`touch .env`). An example is available in `.env.sample`.

The settings of the Django server are based on the `backend/.env` file. You can copy the sample file
(`cp backend/.env.sample backend/.env`) and fill in the variables.

You can of course customize more of the Django server settings in the `settings` module of metagenedb.

The following sections go through the different parts.

#### Secret key

This is the Django `SECRET_KEY`; you need to specify your own. For instance, you can generate one from the
command line with `openssl rand -base64 32`.
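
If `openssl` is not at hand, a comparable key can be produced with the Python standard library. This is only a sketch: the function name is illustrative, and it assumes the variable is called `SECRET_KEY` in `backend/.env`, as the heading suggests.

```python
import base64
import secrets


def generate_secret_key(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically secure randomness, base64-encoded."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    # Print a line that can be pasted into backend/.env (variable name assumed)
    print(f"SECRET_KEY={generate_secret_key()}")
```

Like `openssl rand -base64 32`, this encodes 32 random bytes as base64 text.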

#### Create your own DB on postgresql

The following variables have these default values:

```bash
DATABASE_HOST=postgresql
DATABASE_USER=postgres
DATABASE_NAME=postgres
DATABASE_PASSWORD=""
DATABASE_PORT=5432
```

This works if you leave it as it is, but running the default database without credentials exposes you to
security issues.
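
To make the role of these variables concrete, here is a hedged sketch of how a Django settings module could turn them into a `DATABASES` setting. It is not taken from the `settings` module of metagenedb, which may read them differently; `build_databases` is a hypothetical helper, and the fallback values simply mirror the defaults listed above.

```python
import os
from typing import Mapping, Optional


def build_databases(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build a Django-style DATABASES dict from DATABASE_* variables."""
    env = os.environ if env is None else env
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "HOST": env.get("DATABASE_HOST", "postgresql"),
            "USER": env.get("DATABASE_USER", "postgres"),
            "NAME": env.get("DATABASE_NAME", "postgres"),
            "PASSWORD": env.get("DATABASE_PASSWORD", ""),
            "PORT": int(env.get("DATABASE_PORT", "5432")),
        }
    }
```

In a settings file this would be used as `DATABASES = build_databases()`.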

We recommend creating your own database. One way to do it is described here. First, run the db
service and identify the ID of its running container:

```bash
khillion:~/metagenedb $ docker-compose up -d postgresql  # Run only the postgresql service of your docker-compose file, in detached mode. You can also detach from your running screen using Ctrl+Z
Creating postgresql ... done
khillion:~/metagenedb $ docker ps  # List your running docker containers
CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS              PORTS                    NAMES
5002f210f9d8        postgres:11.4-alpine   "docker-entrypoint.s…"   1 minute ago      Up 1 minute       0.0.0.0:5433->5432/tcp   postgresql
```
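
The lookup can also be scripted instead of reading the table by eye. The sketch below relies on the `--filter` and `--format` options of `docker ps`; the helper names are hypothetical, and it assumes the container is named `postgresql`, as in the `NAMES` column above.

```python
import subprocess
from typing import List


def docker_ps_command(name: str) -> List[str]:
    """Build a `docker ps` call that prints only the IDs of containers whose name matches."""
    return ["docker", "ps", "--filter", f"name={name}", "--format", "{{.ID}}"]


def find_container_ids(name: str) -> List[str]:
    """Run the command and return the matching container IDs (may be empty)."""
    result = subprocess.run(
        docker_ps_command(name), capture_output=True, text=True, check=True
    )
    return result.stdout.split()
```

Note that the `name` filter matches substrings, so `find_container_ids("postgresql")` may return more than one ID if several container names contain that string.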

Now that you have the `CONTAINER ID` (here `5002f210f9d8`), you can open a `bash` terminal in this container and
create your own database:

```bash
khillion:~/metagenedb $ docker exec -it 5002f210f9d8 bash
bash-5.0# psql --user=postgres
```

This opens the `psql` console, where you can create a role and a database:

```psql
CREATE ROLE metagenedb WITH LOGIN PASSWORD 'yourawesomepassword';
ALTER ROLE metagenedb WITH CREATEDB;
CREATE DATABASE metagenedb WITH OWNER metagenedb;
exit
```

You now have your own database, protected by a password, and you need to update your `.env`:

```bash
DATABASE_HOST=postgresql
DATABASE_USER=metagenedb
DATABASE_NAME=metagenedb
DATABASE_PASSWORD=yourawesomepassword
DATABASE_PORT=5432
```
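
Before starting the application, it can help to check that none of these variables was left empty. The following is a minimal sketch, assuming the file only contains simple `KEY=VALUE` lines; `parse_env` and `unset_keys` are hypothetical helpers, not part of metagenedb.

```python
from typing import Dict, List

REQUIRED_KEYS = (
    "DATABASE_HOST",
    "DATABASE_USER",
    "DATABASE_NAME",
    "DATABASE_PASSWORD",
    "DATABASE_PORT",
)


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes so that DATABASE_PASSWORD="" counts as empty
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def unset_keys(values: Dict[str, str]) -> List[str]:
    """Return the required keys that are missing or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

For example, `unset_keys(parse_env(open(".env").read()))` lists the variables that still need a value.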

> **Note**: The default port for postgres is `5432`. In the `docker-compose.yaml` you will notice that this
port is redirected to `5433` on `localhost`. This is done so that it does not interfere with your local
postgres if you have one. This means you need to change `DATABASE_HOST` to `localhost` and `DATABASE_POS

----

## Run the application

For the moment, only the `docker-compose.dev.yaml` is used. To run the application simply run the command:

```bash
docker-compose up --build
```

The `--build` option is only necessary the first time, or when you make changes that require the docker
images to be built again.

Since the directories containing the source code are mounted in the containers, changes you make locally should be
directly reflected in the application.

### Populate the database

A set of scripts is available in the `backend/scripts` directory; you can execute them directly
from within the container. First identify the container ID of the backend with the `docker ps` command. Then open a bash terminal in the container and run the scripts you want:

```bash
docker exec -it YOURCONTAINER_ID bash
root@YOURCONTAINER_ID:/code# python scripts/script.py
```
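
The same thing can be done without an interactive shell. This sketch wraps a plain `docker exec` call; the helper names are hypothetical, and the relative path assumes the container's working directory is `/code`, as the prompt above suggests.

```python
import subprocess
from typing import List


def docker_exec_command(container_id: str, script: str) -> List[str]:
    """Build a non-interactive `docker exec` call that runs one script with python."""
    return ["docker", "exec", container_id, "python", f"scripts/{script}"]


def run_script(container_id: str, script: str) -> None:
    """Run the script in the container and raise if it exits with an error."""
    subprocess.run(docker_exec_command(container_id, script), check=True)
```

For instance, `run_script("YOURCONTAINER_ID", "load_kegg.py")` would run the KEGG import in the backend container.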

For the moment you can:

* Import all KEGG orthologies with `load_kegg.py`: it fetches all KEGG KOs directly from the KEGG REST API.
* Import genes from the IGC catalog using the [annotation file](ftp://ftp.cngb.org/pub/SciRAID/Microbiome/humanGut_9.9M/GeneAnnotation/IGC.annotation_OF.summary.gz). You can find a small part of this annotation file in the `dev_data` folder.
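
To give an idea of what fetching KOs from the KEGG REST API involves, here is a hedged sketch. It is not the code of `load_kegg.py`: the use of the `list/ko` operation, the function names, and the assumption that each line holds an identifier and a description separated by a tab are all mine.

```python
from typing import Dict
from urllib.request import urlopen

# Assumed endpoint: the KEGG REST "list" operation on the "ko" database
KEGG_LIST_KO_URL = "https://rest.kegg.jp/list/ko"


def parse_kegg_list(text: str) -> Dict[str, str]:
    """Map each identifier to its description, given one 'ID<TAB>description' entry per line."""
    entries = {}
    for line in text.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        identifier, description = line.split("\t", 1)
        # Identifiers may carry a database prefix such as "ko:"; keep only the KO number
        entries[identifier.split(":")[-1]] = description.strip()
    return entries


def fetch_kegg_orthologies(url: str = KEGG_LIST_KO_URL) -> Dict[str, str]:
    """Download the KO list and parse it (requires network access)."""
    with urlopen(url) as response:
        return parse_kegg_list(response.read().decode("utf-8"))
```

Keeping the parsing separate from the download makes it possible to test the former on a small sample without network access.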

> **Note**: You can also execute the scripts locally, from a `pipenv shell` for instance. In that case, make
sure that you change the postgres connection settings, since the database is not reached the same way from your
machine as from a container.
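
As an illustration of that difference, the sketch below picks the connection settings according to where the script runs. The values come from the configuration described earlier (service name `postgresql` on port `5432` inside the containers, `localhost` on port `5433` from your machine); the function itself is hypothetical.

```python
from typing import Dict


def database_endpoint(in_container: bool) -> Dict[str, str]:
    """Return the host/port pair to use depending on where the script runs."""
    if in_container:
        # Inside the docker-compose network, the service name resolves directly
        return {"DATABASE_HOST": "postgresql", "DATABASE_PORT": "5432"}
    # From the host machine, go through the port published on localhost
    return {"DATABASE_HOST": "localhost", "DATABASE_PORT": "5433"}
```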