In its original "unsupervised" or self-supervised form, it reconstructs series of spines.
For automatic tagging, the encoder is extracted and a classifier is stacked on top of it.
On the same dataset, the combined encoder and classifier are (re-)trained to predict discrete behaviors.
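The stacking and joint (re-)training described above can be sketched as follows. This is a minimal PyTorch sketch, not MaggotUBA-adapter's actual code: the module names, layer sizes, segment shape (20 frames × 10 spine coordinates), and 6-behavior label set are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the pretrained encoder: it maps a time segment
# of spine coordinates to a latent vector. Sizes are illustrative only.
class Encoder(nn.Module):
    def __init__(self, n_frames=20, n_coords=10, latent_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                               # (batch, 20, 10) -> (batch, 200)
            nn.Linear(n_frames * n_coords, latent_dim),
        )

    def forward(self, x):
        return self.net(x)

class Tagger(nn.Module):
    """Classifier stacked atop the encoder; optimizing the combined module
    (re-)trains the encoder weights together with the classification head."""
    def __init__(self, encoder, latent_dim=100, n_behaviors=6):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(latent_dim, n_behaviors)

    def forward(self, x):
        return self.head(self.encoder(x))

encoder = Encoder()      # in practice, loaded from the self-supervised pretraining
tagger = Tagger(encoder)
optimizer = torch.optim.Adam(tagger.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One supervised (re-)training step on a dummy batch of 8 labeled segments.
x = torch.randn(8, 20, 10)
y = torch.randint(0, 6, (8,))
loss = criterion(tagger(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Because the optimizer is given `tagger.parameters()`, the gradient step updates the encoder as well as the head; freezing the encoder instead would amount to training only the classifier, as in the first prototype below.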
## Prototypes and validated taggers
As a first prototype, the [`20220418`](https://gitlab.pasteur.fr/nyx/MaggotUBA-adapter/-/tree/20220418) model is based on a simple random forest classifier; only the classifier was trained, and the encoder was not retrained.
See module [`maggotuba.models.randomforest`](https://gitlab.pasteur.fr/nyx/MaggotUBA-adapter/-/blob/20220418/src/maggotuba/models/randomforest.py).
It was trained on the entire t5+t15 database. No interpolation was performed, so the prototype does not properly handle data recorded at different frame rates.
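Because the encoder stays frozen in this prototype, training reduces to fitting a random forest on precomputed latent vectors. A minimal scikit-learn sketch on synthetic data (the embedding dimension, sample count, and 6-behavior label set are assumptions for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Stand-in for latent vectors produced by the frozen encoder:
# one 100-dimensional embedding per time segment.
embeddings = rng.normal(size=(300, 100))
labels = rng.integers(0, 6, size=300)  # 6 discrete behaviors (hypothetical)

# Only the classifier is trained; the encoder's weights are left untouched,
# so no gradient machinery is involved at this stage.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(embeddings, labels)
predicted = clf.predict(embeddings)
```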
A second tagger called [`20221005`](https://gitlab.pasteur.fr/nyx/MaggotUBA-adapter/-/tree/20221005) involves a classifier with dense layers, and the encoder was fine-tuned while training the combined encoder+classifier.
See modules [`maggotuba.models.trainers`](https://gitlab.pasteur.fr/nyx/MaggotUBA-adapter/-/blob/20221005/src/maggotuba/models/trainers.py) and [`maggotuba.models.modules`](https://gitlab.pasteur.fr/nyx/MaggotUBA-adapter/-/blob/20221005/src/maggotuba/models/modules.py).
This second tagger was dimensioned following a [parametric exploration for the 6-behavior classification task](https://gitlab.pasteur.fr/nyx/MaggotUBA-adapter/-/blob/design/notebooks/parametric_exploration_6-behavior_classification.ipynb): 2-second time segments, a 100-dimension latent space, and 3 dense layers.
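A head matching that dimensioning might look like the following sketch: a 100-dimension latent vector passed through 3 dense layers down to the behavior classes. The intermediate widths (50, 25) are illustrative guesses, not the tagger's actual sizes.

```python
import torch
import torch.nn as nn

# Hypothetical 3-dense-layer classification head on a 100-dim latent space.
head = nn.Sequential(
    nn.Linear(100, 50), nn.ReLU(),
    nn.Linear(50, 25), nn.ReLU(),
    nn.Linear(25, 6),   # 6 behaviors
)

scores = head(torch.randn(4, 100))  # one row of class scores per segment
```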
It was trained on a subset of 5000 files from the t5 and t15 databases. Spines are linearly interpolated at 10 Hz within each time segment individually.
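Per-segment interpolation can be sketched with `numpy.interp`, applied coordinate by coordinate. This is an illustrative sketch, not the tagger's code; the function name and array layout are assumptions.

```python
import numpy as np

def resample_segment(times, spines, fps=10.0):
    """Linearly interpolate spine coordinates onto a regular time grid.

    `times` holds the (possibly irregular) timestamps of one segment and
    `spines` is a (len(times), n_coords) array. Each segment is handled on
    its own, so recordings acquired at different frame rates all end up on
    the same 10 Hz grid.
    """
    grid = np.arange(times[0], times[-1] + 1e-9, 1.0 / fps)
    resampled = np.stack(
        [np.interp(grid, times, spines[:, i]) for i in range(spines.shape[1])],
        axis=1,
    )
    return grid, resampled

# A 2-second segment recorded at ~16 fps, resampled to 10 Hz (21 frames).
t = np.linspace(0.0, 2.0, 33)
coords = np.column_stack([np.sin(t), np.cos(t)])
grid, resampled = resample_segment(t, coords)
```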