Wrapper project to allow the Nyx tagger UI to call [`MaggotUBA`](https://gitlab.pasteur.fr/les-larves/structured-temporal-convolution/-/tree/light-stable-for-tagging).
This project heavily depends on the [`TaggingBackends`](https://gitlab.pasteur.fr/nyx/TaggingBackends) package that drives the development of automatic tagging backends.
## Principle
MaggotUBA is an autoencoder trained on randomly sampled 20-time-step time segments drawn from the t5 and t15 databases, with a computational budget of 1000 training epochs.
In its original "unsupervised" or self-supervised form, it reconstructs series of spines from a compressed latent representation.
For the automatic tagging, the encoder is extracted and a classifier is stacked atop the encoder.
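Schematically, the resulting model composes the pretrained encoder with a small classification head. The sketch below is a hypothetical illustration with made-up dimensions, not the actual MaggotUBA architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

class Encoder:
    """Stand-in for the pretrained MaggotUBA encoder: maps a
    (time steps, spine coordinates) segment to a latent vector."""
    def __init__(self, in_dim, latent_dim):
        self.W = rng.standard_normal((in_dim, latent_dim))

    def __call__(self, segment):
        return segment.reshape(-1) @ self.W  # flatten, then project

class Classifier:
    """Dense head stacked atop the encoder: one score per behavior tag."""
    def __init__(self, latent_dim, n_tags):
        self.W = rng.standard_normal((latent_dim, n_tags))

    def __call__(self, latent):
        return latent @ self.W

encoder = Encoder(in_dim=20 * 10, latent_dim=8)   # 20-time-step segments
classifier = Classifier(latent_dim=8, n_tags=6)

segment = rng.standard_normal((20, 10))           # one sampled segment
scores = classifier(encoder(segment))             # one score per tag
```

During supervised training only the classifier (and later the encoder, see below) is updated; the latent representation learned by reconstruction is reused.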
...

It was trained on a subset of 5000 files from the t5 and t15 databases.
## Usage
A MaggotUBA-based tagger is typically called using the `poetry run tagging-backend` command from the backend's project (directory tree).
For installation, see [TaggingBackends' README](https://gitlab.pasteur.fr/nyx/TaggingBackends/-/tree/dev#recommanded-installation-and-troubleshooting).
All the [command arguments supported by `TaggingBackends`](https://gitlab.pasteur.fr/nyx/TaggingBackends/-/blob/dev/src/taggingbackends/main.py) are also supported by `MaggotUBA-adapter`.
Using the [`20221005`](https://gitlab.pasteur.fr/nyx/MaggotUBA-adapter/-/tree/20221005) branch, the `20221005` tagger can be called on a supported tracking data file with:
```
poetry run tagging-backend predict --model-instance 20221005 <path/to/datafile>
```
A new model instance can be trained on a data repository, using the `dev` branch of `MaggotUBA-adapter` (soon the `main` branch; the `20221005` branch is also suitable) with:
```
poetry run tagging-backend train --model-instance <tagger-name> <path/to/repository>
```
This will first load a pretrained model (`pretrained_models/default` in `MaggotUBA-adapter`) to determine additional parameters, such as whether to interpolate the spines (and at which frequency) or the window length.
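For illustration, the parameter lookup from a pretrained model directory might look like the following sketch. The file name `config.json` and the key names are assumptions, not the actual `MaggotUBA-adapter` layout:

```python
import json
from pathlib import Path

def load_pretrained_config(model_dir):
    """Read tagging parameters stored alongside a pretrained model.

    Hypothetical sketch: the actual file name and keys used by
    MaggotUBA-adapter may differ.
    """
    config = json.loads((Path(model_dir) / "config.json").read_text())
    return {
        "interpolate": config.get("interpolate", False),
        "frequency": config.get("frequency"),        # Hz, if interpolating
        "window_length": config.get("window_length", 20),
    }
```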
...

The current default pretrained model involves linearly interpolating the spines ...
Alternative pretrained models can be specified using the `--pretrained-model-instance` option.
The data files are discovered in the repository and behavior tags are counted.
A subset of tags can be specified using the `--labels` option followed by a list of comma-separated tags.
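For example, a value such as `run,cast,hunch` would be split into individual tags; the snippet below is an illustration only (the tag names are examples, and `TaggingBackends` performs its own parsing):

```python
def parse_labels(argument):
    """Split a comma-separated --labels value into individual tags."""
    return [tag.strip() for tag in argument.split(",") if tag.strip()]

labels = parse_labels("run,cast,hunch")
# labels == ['run', 'cast', 'hunch']
```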
A two-level balancing rule is followed to randomly select time segments and thus form a training dataset in the shape of a *larva_dataset* hdf5 file.
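One plausible reading of such a balancing rule is sketched below, showing the first level only (equal segment counts per tag); the actual rule, including the within-tag level that balances across source files, lives in the backend:

```python
import random

def balanced_selection(candidates_by_tag, n_per_tag, seed=0):
    """First balancing level: draw the same number of segments per tag.

    Sketch of the idea only; the second level similarly balances,
    within each tag, across the files the segments come from.
    """
    rng = random.Random(seed)
    return {
        tag: rng.sample(candidates, min(n_per_tag, len(candidates)))
        for tag, candidates in candidates_by_tag.items()
    }

# Hypothetical candidate pools of unequal sizes, one per tag.
pools = {"run": list(range(100)), "cast": list(range(60)), "hunch": list(range(45))}
subset = balanced_selection(pools, n_per_tag=40)
# 40 segments selected for each of the three tags
```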
See also the [`make_dataset.py`](https://gitlab.pasteur.fr/nyx/MaggotUBA-adapter/-/blob/20221005/src/maggotuba/data/make_dataset.py) script.
Training operates in two steps: first, pretraining the dense-layer classifier; second, simultaneously fine-tuning the encoder and the classifier.
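The schedule can be summarized as follows (a hypothetical sketch of the control flow, not the actual `train_model.py` code):

```python
def two_step_schedule(pretrain_epochs, finetune_epochs):
    """Which parameter groups are updated at each epoch:
    step 1 trains the classifier alone (encoder frozen),
    step 2 fine-tunes encoder and classifier jointly."""
    step1 = [("classifier",)] * pretrain_epochs
    step2 = [("encoder", "classifier")] * finetune_epochs
    return step1 + step2

schedule = two_step_schedule(pretrain_epochs=2, finetune_epochs=3)
# 5 epochs in total; the encoder is only updated in the last 3
```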
See also the [`train_model.py`](https://gitlab.pasteur.fr/nyx/MaggotUBA-adapter/-/blob/20221005/src/maggotuba/models/train_model.py) script.
This generates a new sub-directory in the `models` directory of the `MaggotUBA-adapter` project, which makes the trained model discoverable for automatic tagging (*predict* command).
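The discovery convention can be illustrated as follows (hypothetical code; the backend performs its own lookup):

```python
from pathlib import Path

def discover_model_instances(backend_root):
    """Each sub-directory of `models/` is one trained model instance,
    addressable via --model-instance. Illustration of the convention
    described above, not the backend's actual lookup code."""
    models = Path(backend_root) / "models"
    if not models.is_dir():
        return []
    return sorted(entry.name for entry in models.iterdir() if entry.is_dir())
```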