
Compare revisions

Target: nyx/larvatagger.jl
Commits on Source (9)
Showing with 1606 additions and 575 deletions
This diff is collapsed.
This diff is collapsed.
name = "LarvaTagger"
uuid = "8b3b36f1-dfed-446e-8561-ea19fe966a4d"
authors = ["François Laurent", "Institut Pasteur"]
version = "0.19.1"
version = "0.20"
[deps]
Bonito = "824d6782-a2ef-11e9-3a09-e5662e0c26f8"
@@ -9,6 +9,8 @@ Colors = "5ae59095-9a9b-59fe-a467-6f913c188581"
Dates = "ade2ca70-3891-5945-98fb-dc099432e06a"
DocOpt = "968ba79b-81e4-546f-ab3a-2eecfa62a9db"
Format = "1fa38f19-a742-5d3f-a2b9-30dd87b9d5f8"
HTTP = "cd3eb016-35fb-5094-929b-558a96fad6f3"
JSON3 = "0f8b85d8-7281-11e9-16c2-39a750bddbf1"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
Makie = "ee78f7c6-11fb-53f2-987a-cfe4a2b5a57a"
@@ -18,6 +20,7 @@ NyxWidgets = "c288fd06-43d3-4b04-8307-797133353e2e"
Observables = "510215fc-4207-5dde-b226-833fc4488ee2"
ObservationPolicies = "6317928a-6b1a-42e8-b853-b8e2fc3e9ca3"
OrderedCollections = "bac558e1-5e72-5ebc-8fee-abe8a469f55d"
Oxygen = "df9a0d86-3283-4920-82dc-4555fc0d1d8b"
PlanarLarvae = "c2615984-ef14-4d40-b148-916c85b43307"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"
......
@@ -16,6 +16,8 @@ The *LarvaTagger* project is divided into several components. Although this READ
This package features a web-based graphical user interface (GUI) for visualizing the tracked larvae and assigning discrete behavior tags at each time step.
As the GUI is web-based, a public instance is available at [nyx.pasteur.cloud](https://nyx.pasteur.cloud/larvatagger). Demo data can be found in the [Quick start](#quick-start-with-docker) section below.
A command-line interface (CLI) is also available for batch processing, including the automatic tagging of track data files, training new taggers from labeled data, etc.
Although *LarvaTagger.jl* alone comes with no automatic tagger, it is designed to work primarily in combination with [*MaggotUBA*](https://gitlab.pasteur.fr/nyx/MaggotUBA-adapter) for the identification of larval actions or postures.
@@ -53,7 +55,7 @@ Change directory (`cd`) to the Downloads directory. In this example, we will ass
On macOS and Linux, change the permissions of the script file so that it can be executed:
```
chmod a+x larvatagger.sh
chmod +x larvatagger.sh
```
The demo data can be opened in the web browser for visual inspection, on macOS and Linux with:
@@ -111,15 +113,12 @@ curl -sSL "https://gitlab.pasteur.fr/nyx/larvatagger.jl/-/raw/dev/scripts/instal
```
In the latter case, the script may install several extra dependencies, but not all of them.
In particular, Python is required; either 3.8 with `--with-default-backend`, or 3.11 with `--with-backend --experimental`.
In particular, Python is required; either 3.11 with `--with-default-backend`, or 3.8 with `--with-backend --legacy`.
If `pyenv` is available, the script will use this tool to install Python.
Otherwise, `python3.8` and `python3.8-venv` may have to be manually installed.
Otherwise, `python3.11` and `python3.11-venv` may have to be manually installed.
On WSL, the script will attempt to install `pyenv` and Python (tested with Ubuntu 20.04).
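To check beforehand whether the interpreter the script expects is already installed, a small helper like the following can be used (the `check_python` helper and its messages are illustrative, not part of *scripts/install.sh*):

```shell
# check_python is a hypothetical helper, not part of scripts/install.sh
check_python() {
  v=$1
  if command -v "python$v" >/dev/null 2>&1; then
    echo "python$v: found"
  else
    echo "python$v: missing; install python$v and python$v-venv, or use pyenv"
  fi
}

check_python 3.11
```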
On macOS, the full LarvaTagger suite can be installed only with the `--with-backend --experimental` options:
```
curl -sSL "https://gitlab.pasteur.fr/nyx/larvatagger.jl/-/raw/dev/scripts/install.sh?ref_type=heads&inline=false" | /bin/bash -s -- --with-backend --experimental
```
On macOS, the full LarvaTagger suite can be installed only with the default options (`--legacy` is not supported).
This script can also uninstall LarvaTagger (if it was installed with the same script) with: `curl -sSL "https://gitlab.pasteur.fr/nyx/larvatagger.jl/-/raw/dev/scripts/install.sh?ref_type=heads&inline=false" | /bin/bash -s -- --uninstall`. This can be useful, for example, prior to reinstalling after a failed installation.
@@ -135,12 +134,6 @@ cd LarvaTagger
julia --project=. -e 'using Pkg; Pkg.instantiate()'
```
In May 2024, the gitlab.pasteur.fr server began requesting an authentication token when cloning public repositories.
If the `git clone` command requests an authentication token you do not have, run instead:
```
curl -sSL https://gitlab.pasteur.fr/nyx/larvatagger.jl/-/archive/main/larvatagger.jl-main.tar.gz | tar zxv && mv larvatagger.jl-main LarvaTagger
```
Calling `Pkg.instantiate` in a copy of the project is preferred over using `Pkg.add`,
because *LarvaTagger.jl* depends on several unregistered packages.
@@ -213,19 +206,6 @@ The `--browser` argument may open a new tab in your web browser, but this featur
The first time the application is loaded, it may take a while for a window to open in your web browser and for the data to be plotted.
### From the *Julia* interpreter
As an alternative to the *larvatagger* script or command, in the *LarvaTagger* directory created above, launch the *Julia* interpreter:
```
julia --project=.
```
In the interpreter, to launch the editor, type:
```
julia> using LarvaTagger; display(larvaeditor("path/to/data/file"))
```
To exit the interpreter, type `exit()` or press Ctrl+D.
### macOS
On some computers (typically macOS computers), the 2D larva view may show up at half the expected size.
@@ -234,7 +214,7 @@ Feel free to adjust the value if the 2D view is too small or large.
## Automatic tagging
*LarvaTagger.jl* comes with no automatic tagger per default, unless run using Docker or installed with the *scripts/install.sh* script and the `--with-default-backend` option.
*LarvaTagger.jl* comes with no automatic tagger by default, unless you run the Docker image or installed LarvaTagger with the *scripts/install.sh* script and the `--with-default-backend` option.
To extend the editor with automatic tagging capabilities, see the [recommended installation steps for *TaggingBackends* and *MaggotUBA*](https://gitlab.pasteur.fr/nyx/TaggingBackends#recommended-installation).
......
@@ -22,4 +22,42 @@ On the *Julia* side, the lower-level functionalities are provided by the *Planar
Similarly, the *TidyObservables.jl* project has unit tests and a GitLab workflow to run these tests on every commit.
For the remaining part of the *LarvaTagger* project, only high-level functional tests are available.
These tests are available in the *LarvaTagger.jl* project, in the test directory, file *scenarii.sh*. They depend on [*shUnit2*](https://github.com/kward/shunit2).
These tests are available in the *LarvaTagger.jl* project, as file `test/deploy_and_test.sh`. They depend on [*shUnit2*](https://github.com/kward/shunit2).
The `test/deploy_and_test.sh` script implicitly tests the `scripts/install.sh` script, and then runs the `test/scenarii.sh` script.
The `test/scenarii.sh` script requires some test data that are downloaded by the `test/deploy_and_test.sh` script.
If these test data cannot be fetched, please contact François Laurent so that the data are made available again (the download link periodically expires).
## REST
The REST API can be tested running the `test/rest_server.sh` and `test/rest_client.sh` scripts.
The `test/rest_server.sh` script is fully automatic and completes with a stack trace that results from killing the backend after the last test.
The `test/rest_client.sh` script does not perform any actual test.
It launches a backend and a frontend, and the user is expected to manually operate the frontend to test the communication between the backend and the frontend.
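When testing manually, it helps to know when the backend is ready: the client code polls the backend's `/status` route, which answers with the plain string `up`. Below is a minimal shell sketch of that handshake; the `status` function is a mock standing in for `curl -s http://localhost:9285/status` (port number as per the default).

```shell
# Mocked status probe; against a live backend this would be:
#   status() { curl -s http://localhost:9285/status; }
status() { echo up; }

# Poll every half second until the backend reports "up".
until [ "$(status)" = "up" ]; do
  sleep 0.5
done
echo "backend ready"
```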
## Docker images
The most complete test image can be built as follows:
```
docker=docker LARVATAGGER_IMAGE=flaur/larvatagger:latest scripts/larvatagger.sh build --dev
docker build -t larvatagger:bigfat -f recipes/Dockerfile.pasteurjanelia .
```
Subsequent tests will run the backend using the `larvatagger:bigfat` image:
```
docker=docker LARVATAGGER_IMAGE="larvatagger:bigfat" scripts/larvatagger.sh backend
```
The frontend is more conveniently run from the source tree:
```
scripts/larvatagger-gui.jl http://localhost:9285
```
The `docker=docker` environment variable is required if the `podman` command is available:
the `scripts/larvatagger.sh` script falls back on `podman` instead of `docker` whenever `podman` is available, but it is recommended to run the tests with Docker.
In addition, at present, building the image works with Docker buildx only.
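The fallback described above can be sketched as follows; this is an assumption about the selection logic in `scripts/larvatagger.sh`, with an illustrative function name:

```shell
# Hypothetical re-implementation of the container-engine fallback:
# an explicit docker= override wins; otherwise podman is preferred when installed.
pick_engine() {
  if [ -n "$docker" ]; then
    echo "$docker"
  elif command -v podman >/dev/null 2>&1; then
    echo podman
  else
    echo docker
  fi
}

docker=docker
pick_engine   # with docker=docker set, always selects docker
```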
See also the `recipes/release.sh` script.
FROM julia:1.10.8-bullseye AS base
FROM julia:1.10.9-bullseye AS base
ARG PROJECT_DIR=/app
ARG BRANCH=main
@@ -82,7 +82,9 @@ RUN if [ -z $TAGGINGBACKENDS_BRANCH ]; then \
&& poetry add "pynvml==11.4.1" \
&& if [ "$(echo $BACKEND | cut -d/ -f2)" = "main" ] || [ "$(echo $BACKEND | cut -d/ -f2)" = "dev" ]; then \
julia -e 'using Pkg; Pkg.add("JSON3")' \
&& scripts/make_models.jl default; \
&& scripts/make_models.jl default \
&& cd $PROJECT_DIR \
&& recipes/patch.sh; \
fi \
&& rm -rf ~/.cache; \
fi
......
# To be built with scripts/larvatagger.sh build --dev
FROM julia:1.8.2-bullseye
FROM julia:1.10.9-bullseye
ENV JULIA_PROJECT=/app/TaggingBackends
ENV JULIA_DEPOT_PATH=/usr/local/share/julia
ENV POETRY_VIRTUALENVS_PATH=/usr/local/share/poetry
# We assume:
# * current directory name is LarvaTagger; contains the LarvaTagger.jl project;
# * PlanarLarvae.jl project is available as sibling directory PlanarLarvae;
# * current directory name is LarvaTagger.jl; contains the LarvaTagger.jl project;
# * PlanarLarvae.jl project is available as sibling directory PlanarLarvae.jl;
# * TaggingBackends project is available as sibling directory TaggingBackends;
# * MaggotUBA-core project is available as sibling directory MaggotUBA-core;
# * MaggotUBA-adapter project is available as sibling directory MaggotUBA-adapter.
# Paths are given relative to parent directory, since larvatagger.sh will move 1 level up
# prior to calling docker build
ARG PLANARLARVAE=./PlanarLarvae
ARG LARVATAGGER=./LarvaTagger
ARG PLANARLARVAE=./PlanarLarvae.jl
ARG LARVATAGGER=./LarvaTagger.jl
ARG TAGGINGBACKENDS=./TaggingBackends
ARG MAGGOTUBA_CORE=./MaggotUBA-core
ARG MAGGOTUBA_ADAPTER=./MaggotUBA-adapter
COPY $PLANARLARVAE/src /app/PlanarLarvae/src
COPY $PLANARLARVAE/Project.toml $PLANARLARVAE/Manifest.toml /app/PlanarLarvae/
COPY $PLANARLARVAE/Project.toml $PLANARLARVAE/Manifest.toml* /app/PlanarLarvae/
COPY $LARVATAGGER/src /app/src
COPY $LARVATAGGER/scripts /app/scripts
COPY $LARVATAGGER/recipes/patch.sh /app/recipes/
COPY $LARVATAGGER/Project.toml $LARVATAGGER/Manifest.toml /app/
RUN apt-get update \
@@ -43,7 +44,8 @@ COPY $MAGGOTUBA_CORE/src /app/MaggotUBA-core/src
COPY $MAGGOTUBA_CORE/pyproject.toml /app/MaggotUBA-core/
COPY $MAGGOTUBA_ADAPTER/src /app/MaggotUBA/src
COPY $MAGGOTUBA_ADAPTER/pyproject.toml /app/MaggotUBA/
COPY $MAGGOTUBA_ADAPTER/models/20221005 /app/MaggotUBA/models/20221005
COPY $MAGGOTUBA_ADAPTER/models/20230311 /app/MaggotUBA/models/20230311
COPY $MAGGOTUBA_ADAPTER/models/20230311-0 /app/MaggotUBA/models/20230311-0
COPY $MAGGOTUBA_ADAPTER/pretrained_models/default /app/MaggotUBA/pretrained_models/default
RUN python3 -m pip install poetry \
@@ -54,6 +56,8 @@ RUN python3 -m pip install poetry \
&& cd /app/MaggotUBA \
&& poetry add ../MaggotUBA-core \
&& poetry add ../TaggingBackends \
&& poetry install
&& poetry install \
&& cd /app \
&& recipes/patch.sh
ENTRYPOINT ["larvatagger.jl"]
@@ -27,5 +27,6 @@ RUN cd $PROJECT_DIR \
&& make package \
&& rm -rf bin/matlab/2023b/bin/glnxa64/matlab_startup_plugins/matlab_graphics_ui \
&& rm -rf bin/matlab/2023b/bin/glnxa64/matlab_startup_plugins/foundation/platform/pf_matlab_integ \
&& rm -rf .git ~/.cache
&& rm -rf .git ~/.cache \
&& cd .. && recipes/patch.sh
@@ -149,10 +149,6 @@ Optionally, you can also get the default backend (currently *20230311*) with:
```
scripts/larvatagger.sh build --with-default-backend
```
Currently, Docker images on Docker Hub are built with:
```
scripts/larvatagger.sh --target confusion build --with-default-backend
```
If you want another tagger, *e.g.* the *20230129* tagger implemented by the *20230129* branch of the *MaggotUBA-adapter* repository, do:
```
@@ -177,7 +173,7 @@ docker pull flaur/larvatagger
```
Beware that images that ship with a tagging backend are relatively large files (>5GB on disk).
If you are not interested in automatic tagging, use the `flaur/larvatagger:0.19-standalone` image instead.
If you are not interested in automatic tagging, use the `flaur/larvatagger:0.20-standalone` image instead.
### Upgrading
......
#!/bin/sh
# patch the taggers in a Docker image so that they include metadata.json files;
# ultimately the taggers should manage their metadata.json files themselves.
if [ -d "MaggotUBA" ]; then
if ! [ -f "MaggotUBA/metadata.json" ]; then
cat <<"EOF" >MaggotUBA/metadata.json
{
"name": "MaggotUBA",
"homepage": "https://gitlab.pasteur.fr/nyx/MaggotUBA-adapter",
"description": "Action classifiers based on MaggotUBA encoders"
}
EOF
fi
dir="MaggotUBA/models/20230311"
if [ -d "$dir" ] && ! [ -f "$dir/metadata.json" ]; then
cat <<"EOF" >$dir/metadata.json
{
"name": "20230311",
"homepage": "https://gitlab.pasteur.fr/nyx/MaggotUBA-adapter#20230311-0-and-20230311",
"description": "Tagger trained on t15 to emulate JBM's tagger on the 7-behavior classification task"
}
EOF
fi
dir="MaggotUBA/models/20230311-0"
if [ -d "$dir" ] && ! [ -f "$dir/metadata.json" ]; then
cat <<"EOF" >$dir/metadata.json
{
"name": "20230311-0",
"homepage": "https://gitlab.pasteur.fr/nyx/MaggotUBA-adapter#20230311-0-and-20230311",
"description": "Tagger trained on t15 to emulate JBM's tagger on the 12-behavior classification task"
}
EOF
fi
fi
# note about make_dataset:
# * default is equivalent to "train, finetune"
# * can be "always" or a comma-separated list of processing steps
# * valid steps are "train", "finetune", "predict", "embed"
if [ -d "PasteurJanelia" ]; then
if ! [ -f "PasteurJanelia/metadata.json" ]; then
cat <<"EOF" >PasteurJanelia/metadata.json
{
"name": "PasteurJanelia",
"homepage": "https://gitlab.pasteur.fr/nyx/PasteurJanelia-adapter",
"description": "Action classifiers initially designed by JBM at Janelia",
"make_dataset": "always"
}
EOF
fi
dir="PasteurJanelia/models/5layers"
if [ -d "$dir" ] && ! [ -f "$dir/metadata.json" ]; then
cat <<"EOF" >$dir/metadata.json
{
"name": "5layers",
"homepage": "https://gitlab.pasteur.fr/nyx/PasteurJanelia-adapter",
"description": "JBM's final tagger for use on the t2 and t7 trackers"
}
EOF
fi
fi
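Per the note in *patch.sh* above, `make_dataset` accepts `"always"` or a comma-separated list of the steps `train`, `finetune`, `predict`, `embed` (the default behaves like `"train, finetune"`). A hypothetical *metadata.json* restricting dataset generation to prediction and embedding could look like:

```json
{
  "name": "5layers",
  "homepage": "https://gitlab.pasteur.fr/nyx/PasteurJanelia-adapter",
  "description": "JBM's final tagger for use on the t2 and t7 trackers",
  "make_dataset": "predict, embed"
}
```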
#!/bin/sh
set -e
RELEASE=$1
if [ -z "$RELEASE" ]; then
echo "Usage: $0 <version-number>"
exit 1
fi
docker=docker LARVATAGGER_IMAGE=flaur/larvatagger:$RELEASE-standalone scripts/larvatagger.sh build
docker=docker LARVATAGGER_IMAGE=flaur/larvatagger:$RELEASE-20230311 scripts/larvatagger.sh --target confusion build --with-default-backend
docker tag flaur/larvatagger:$RELEASE-20230311 flaur/larvatagger:latest
docker build -t flaur/larvatagger:$RELEASE-bigfat -f recipes/Dockerfile.pasteurjanelia --no-cache .
test/predict_and_retrain.sh
cat <<EOF
Next steps are:
docker login
docker push flaur/larvatagger:$RELEASE-standalone
docker push flaur/larvatagger:$RELEASE-20230311
docker push flaur/larvatagger:$RELEASE-bigfat
docker push flaur/larvatagger:latest
EOF
#!/usr/bin/env bash
for flag in "$@"; do
if [ "$flag" = "-h" -o "$flag" = "--help" ]; then
echo "Command-line installer for LarvaTagger"
echo
echo "Usage:"
echo " $0"
echo " WITH_BACKEND=1 $0"
echo " WITH_BACKEND=1 $0 --legacy"
echo " $0 --with-default-backend"
echo " $0 --with-backend --legacy"
echo " $0 --uninstall"
echo " $0 --help"
echo
echo "The legacy installation path for the MaggotUBA-based tagger relies"
echo "on Python3.8 and Torch1."
echo "The current default relies on Python3.11 and Torch2."
echo "Other environment variables can be set, similarly to WITH_BACKEND,"
echo "to control the script's behavior:"
echo " variable default value"
echo " BIN_DIR ~/.local/bin"
echo " LARVATAGGER_PATH ~/.local/share/larvatagger"
echo " JULIA_CHANNEL lts"
echo " JULIA_VERSION 1.10"
echo " PYTHON_VERSION 3.11 (3.8 if --legacy)"
echo " MAGGOTUBA_CORE_BRANCH main"
echo " MAGGOTUBA_ADAPTER_BRANCH torch2 (main if --legacy)"
echo " TAGGINGBACKENDS_BRANCH main"
echo " PLANARLARVAE_BRANCH main"
echo " LARVATAGGER_BRANCH main"
exit 0
fi
done
[ -d "`pwd`" ] || cd
if [ -z "$BIN_DIR" ]; then
@@ -41,28 +74,47 @@ if [ "$1" = "--uninstall" ]; then
fi
else
# former default:
PYTHON_VERSION=3.8
# the internal_<VAR> variables need to be set non-empty only if
# the corresponding <VAR> variable is non-empty; they are used to
# determine whether or not to report externally sourced variables
internal_WITH_BACKEND=
internal_MAGGOTUBA_ADAPTER_BRANCH=
internal_MAGGOTUBA_ADAPTER_FREE_DEPENDENCIES=
internal_LEGACY=
internal_WITH_DEFAULT_BACKEND=
# primary use cases:
# WITH_BACKEND=1 scripts/install.sh
# WITH_BACKEND=1 scripts/install.sh --legacy
# scripts/install.sh --with-default-backend
# scripts/install.sh --with-backend
# scripts/install.sh --with-backend --experimental
# scripts/install.sh --with-backend --legacy
for arg in "$@"; do
if [ "$arg" = "--with-default-backend" ]; then
WITH_BACKEND=1
internal_WITH_BACKEND=1
MAGGOTUBA_CORE_BRANCH=
MAGGOTUBA_ADAPTER_BRANCH=
internal_WITH_DEFAULT_BACKEND=1
if [ "$internal_LEGACY" = "1" ]; then
echo "Ignoring --legacy; pass --with-backend --legacy instead"
internal_LEGACY=0
fi
break
elif [ "$arg" = "--with-backend" ]; then
WITH_BACKEND=1
internal_WITH_BACKEND=1
elif [ "$arg" = "--experimental" ]; then
echo "The --experimental flag is deprecated and is now default"
internal_LEGACY=0
elif [ "$arg" = "--legacy" ]; then
internal_LEGACY=1
MAGGOTUBA_CORE_BRANCH=
MAGGOTUBA_ADAPTER_BRANCH=torch2
internal_MAGGOTUBA_ADAPTER_BRANCH=1
PYTHON_VERSION=3.11
# former default (empty) falls back to main
MAGGOTUBA_ADAPTER_BRANCH=
elif [ "$arg" = "--free-python-dependencies" ]; then
internal_MAGGOTUBA_ADAPTER_FREE_DEPENDENCIES=1
elif [ "$arg" = "--lock-python-dependencies" ]; then
@@ -70,6 +122,23 @@ for arg in "$@"; do
fi
done
if [ "$WITH_BACKEND" = "1" -a "$internal_LEGACY" = "0" ]; then
# new defaults (formerly "--experimental")
MAGGOTUBA_CORE_BRANCH=
MAGGOTUBA_ADAPTER_BRANCH=torch2
internal_MAGGOTUBA_ADAPTER_BRANCH=1
PYTHON_VERSION=3.11
elif [ "$internal_LEGACY" != "1" ]; then
if [ "$internal_WITH_DEFAULT_BACKEND" = "1" ]; then
echo "The Python3.11/Torch2 backend is now default"
echo "Pass flags --with-backend --legacy to select the former Python3.8/Torch1 backend"
fi
MAGGOTUBA_CORE_BRANCH=
MAGGOTUBA_ADAPTER_BRANCH=torch2
internal_MAGGOTUBA_ADAPTER_BRANCH=1
PYTHON_VERSION=3.11
fi
PYTHON="python$PYTHON_VERSION"
check_brew() {
@@ -111,7 +180,7 @@ else
if [ -z "$JULIA_VERSION" ]; then
JULIA_VERSION=1.10
JULIA_CHANNEL=lts
JULIA_CHANNEL=1.10
else
echo "Using environment variable: JULIA_VERSION= $JULIA_VERSION"
if [ -z "$JULIA_CHANNEL" ]; then
@@ -266,7 +335,7 @@ if [ -d LarvaTagger.jl ]; then
echo "LarvaTagger.jl installation found; skipping"
else
if [ -z "$LARVATAGGER_BRANCH" ]; then
LARVATAGGER_BRANCH=dev
LARVATAGGER_BRANCH=main
else
echo "Using environment variable: LARVATAGGER_BRANCH= $LARVATAGGER_BRANCH"
fi
@@ -300,7 +369,7 @@ else
activate() {
# pyenv activation is necessary on WSL
command -v pyenv &>/dev/null && pyenv local $PYTHON_VERSION
command -v pyenv &>/dev/null && [ -n "`pyenv versions | grep ' $PYTHON_VERSION'`" ] && pyenv local $PYTHON_VERSION
poetry env use $PYTHON_VERSION
}
......
@@ -2,7 +2,7 @@
for _ in $(seq $#); do
case $1 in
open|import|merge|train|predict|-V|--version|--more-help|reverse-mapping|confusion)
open|import|merge|train|predict|-V|--version|--more-help|reverse-mapping|confusion|backend)
cmd=$1
shift
break
@@ -102,6 +102,15 @@ done
if [ -z "$no_cache" ] && [ -z "$cache" ]; then
DOCKER_ARGS="--no-cache $DOCKER_ARGS"
fi
if [ "$BUILD" == "--dev" ]; then
if ! [[ "$LARVATAGGER_IMAGE" == *:* ]]; then LARVATAGGER_IMAGE="${LARVATAGGER_IMAGE}:dev"; fi
PROJECT_ROOT=$(basename $(pwd))
cd ..
echo "DOCKER_BUILDKIT=1 $docker build -t \"$LARVATAGGER_IMAGE\" -f \"$PROJECT_ROOT/recipes/Dockerfile.local\" ${DOCKER_ARGS}."
DOCKER_BUILDKIT=1 $docker build -t "$LARVATAGGER_IMAGE" -f "$PROJECT_ROOT/recipes/Dockerfile.local" ${DOCKER_ARGS}.
else
if [ -z "$target" ]; then
DOCKER_ARGS="--target $TARGET $DOCKER_ARGS"
fi
@@ -111,19 +120,16 @@ else
echo "Using environment variable DOCKERFILE= $DOCKERFILE"
fi
DOCKER_ARGS="-f \"$DOCKERFILE\" $DOCKER_ARGS"
if [ "$BUILD" == "--dev" ]; then
if ! [[ "$LARVATAGGER_IMAGE" == *:* ]]; then LARVATAGGER_IMAGE="${LARVATAGGER_IMAGE}:dev"; fi
PROJECT_ROOT=$(basename $(pwd))
cd ..
DOCKER_BUILDKIT=1 $docker build -t "$LARVATAGGER_IMAGE" -f "$PROJECT_ROOT/recipes/Dockerfile.local" ${DOCKER_ARGS}.
elif [ "$BUILD" == "--stable" ]; then
if [ "$BUILD" == "--stable" ]; then
if ! [[ "$LARVATAGGER_IMAGE" == *:* ]]; then LARVATAGGER_IMAGE="${LARVATAGGER_IMAGE}:stable"; fi
$docker build -t "$LARVATAGGER_IMAGE" ${DOCKER_ARGS}.
else
if ! [[ "$LARVATAGGER_IMAGE" == *:* ]]; then LARVATAGGER_IMAGE="${LARVATAGGER_IMAGE}:latest"; fi
if [ -z "$LARVATAGGER_BRANCH" ]; then
if [ -z "$LARVATAGGER_DEFAULT_BRANCH" ]; then
LARVATAGGER_BRANCH=dev;
LARVATAGGER_BRANCH=dev
else
echo "Deprecation notice: LARVATAGGER_DEFAULT_BRANCH has been renamed LARVATAGGER_BRANCH"
LARVATAGGER_BRANCH=$LARVATAGGER_DEFAULT_BRANCH
@@ -141,6 +147,7 @@ DOCKER_BUILD="$docker build -t "$LARVATAGGER_IMAGE" ${DOCKER_ARGS}--build-arg BR
echo $DOCKER_BUILD
eval $DOCKER_BUILD
fi
fi
;;
open)
@@ -174,6 +181,34 @@ done
DOCKER_RUN="exec $docker run $RUN_ARGS -i ${DOCKER_ARGS}\"$LARVATAGGER_IMAGE\" open \"/data/$file\" $TAGGER_ARGS $@"
echo $DOCKER_RUN
eval $DOCKER_RUN
;;
backend)
if [ -z "$LARVATAGGER_PORT" ]; then
LARVATAGGER_PORT=9285
elif [ "$LARVATAGGER_PORT" != "9285" ]; then
echo "Using environment variable: LARVATAGGER_PORT= $LARVATAGGER_PORT"
fi
DOCKER_ARGS="-p $LARVATAGGER_PORT:$LARVATAGGER_PORT $DOCKER_ARGS"
# undocumented feature (copy-pasted from open)
backend=MaggotUBA
while [ -n "$1" -a "$1" = "--external-instance" ]; do
instance=$2; shift 2
if ! command -v realpath &>/dev/null; then
echo "realpath: command not found"
echo "on macOS: brew install coreutils"
exit 1
fi
RUN_ARGS="$RUN_ARGS --mount type=bind,src=\"$(realpath $instance)\",dst=/app/$backend/models/$(basename $instance)"
done
RUN_ARGS="$RUN_ARGS --entrypoint=julia"
DOCKER_RUN="exec $docker run $RUN_ARGS -i ${DOCKER_ARGS}\"$LARVATAGGER_IMAGE\" $@ --project=/app -e 'using LarvaTagger.REST.Server; run_backend(\"/app\"; port=$LARVATAGGER_PORT, host=\"0.0.0.0\")'"
echo $DOCKER_RUN
eval $DOCKER_RUN
;;
import | merge)
@@ -371,6 +406,7 @@ Usage: $0 build [--stable] [--with-default-backend] [--with-backend <backend>]
$0 confusion <datarepository>
$0 merge <filepath> [<outputfilename>] [...]
$0 reverse-mapping <filepath> <filename> <outputfilename>
$0 backend
$0 --more-help
$0 --version
$0 --update
@@ -390,6 +426,8 @@ the second one with unmapped labels. It generates a third label file with demapp
from the first file. This is useful when the first file diverges from the second one by some
manual editions, on top of label mapping.
The backend command runs LarvaTagger's REST server, which listens on port 9285 by default.
See --more-help for more information about additional arguments for the other commands from
larvatagger.jl.
EOT
......
@@ -32,6 +32,8 @@ include("players.jl")
include("controllers.jl")
include("Taggers.jl")
using .Taggers
include("REST/REST.jl")
using .REST
include("files.jl")
include("backends.jl")
include("cli_base.jl")
......
module Client
using PlanarLarvae.Datasets
import ..Taggers: Taggers, Tagger
import HTTP: HTTP
using JSON3
using OrderedCollections: OrderedDict
using Observables
export RemoteTagger, LTBackend, connect, listmodels, active_model_instance
mutable struct RemoteTagger
endpoint::AbstractString
token::AbstractString # empty if not connected
backend::AbstractString
model_instance::AbstractString
output_filenames::Dict{String, String} # renaming rules
end
function RemoteTagger(endpoint, backend, model_instance)
RemoteTagger(endpoint, "", backend, model_instance, Dict{String, String}())
end
connected(tagger) = !isempty(tagger.token)
function connect!(tagger)
tagger.token = gettoken(tagger)
return tagger
end
function gettoken(tagger)
if isempty(tagger.token)
resp = HTTP.get("$(tagger.endpoint)/get-token/$(tagger.backend)/$(tagger.model_instance)")
transcode(String, resp.body)
else
tagger.token
end
end
function url(tagger::RemoteTagger, switch)
token = tagger.token
@assert !isnothing(token)
return "$(tagger.endpoint)/$switch/$(tagger.backend)/$(tagger.model_instance)/$token"
end
function Base.close(tagger::RemoteTagger)
HTTP.get(url(tagger, "close"))
tagger.token = ""
end
function listfiles(tagger::RemoteTagger, srcdir::String)
resp = HTTP.get("$(url(tagger, "list-files"))/$srcdir")
resp = transcode(String, resp.body)
return JSON3.read(resp)
end
function Taggers.pull(tagger::RemoteTagger, destdir::String)
cmd = url(tagger, "pull-file")
# destdir can be empty on macOS
destdir = isempty(destdir) ? pwd() : realpath(destdir) # strip end slash
destfiles = String[]
for filename in listfiles(tagger, "processed")
resp = HTTP.get("$cmd/$filename")
destfile = joinpath(destdir, filename)
open(destfile, "w") do f
write(f, resp.body)
end
push!(destfiles, destfile)
end
return destfiles
end
function Taggers.resetdata(tagger::RemoteTagger, dir=nothing)
query = url(tagger, "reset-data")
if !isnothing(dir)
query = "$query/$dir"
end
HTTP.get(query)
end
function Taggers.pushfile(tagger::RemoteTagger, src, dst)
@assert dst == basename(src)
Taggers.pushfile(tagger, src)
end
function Taggers.pushfile(tagger::RemoteTagger, src)
request = url(tagger, "push-file")
filename = basename(src)
content = read(src, String)
# emulate curl's behavior
boundary = "------------------------8vna2TYHERnQGhqhQLdlHq" # whatever
headers = [("Content-Type" => "multipart/form-data; boundary=$boundary")]
body = "$boundary\r\nContent-Disposition: form-data; filename=\"$filename\"\r\nContent-Type: application/octet-stream\r\n\r\n$content\r\n$boundary--\r\n"
HTTP.post(request, headers, body)
end
function Taggers.push(tagger::RemoteTagger, file::String, metadata; clean=true)
clean ? Taggers.resetdata(tagger) : Taggers.resetdata(tagger, "raw")
Taggers.push(tagger, file)
if !isnothing(metadata)
mktempdir() do dir
metadatafile = joinpath(dir, "metadata")
Datasets.to_json_file(metadatafile, metadata)
Taggers.pushfile(tagger, metadatafile)
end
end
end
Taggers.run(tagger::RemoteTagger, switch) = HTTP.get(url(tagger, switch); retry=false)
Taggers.predict(tagger::RemoteTagger) = Taggers.run(tagger, "predict")
Taggers.embed(tagger::RemoteTagger) = Taggers.run(tagger, "embed")
struct LTBackend
endpoint
metadata
taggers
active_tagging_backend
active_tagger
end
function LTBackend(endpoint)
metadata = Dict{String, OrderedDict{Symbol, Union{String, Dict{String, OrderedDict{Symbol, String}}}}}()
taggers = OrderedDict{String, OrderedDict{String, RemoteTagger}}()
active_tagging_backend = Observable{Union{Nothing, String}}(nothing)
active_tagger = Observable{Union{Nothing, RemoteTagger}}(nothing)
on(active_tagging_backend) do tagging_backend
if isnothing(tagging_backend)
active_tagger[] = nothing
elseif isnothing(active_tagger[]) || active_tagger[].backend != tagging_backend
active_tagger[] = first(values(taggers[tagging_backend]))
else
notify(active_tagger)
end
end
on(active_tagger) do tagger
if !isnothing(tagger)
@info "Tagger selected" backend=tagger.backend instance=tagger.model_instance
if tagger.backend != active_tagging_backend[]
active_tagging_backend[] = tagger.backend
end
end
end
LTBackend(endpoint, metadata, taggers, active_tagging_backend, active_tagger)
end
function active_model_instance(back::LTBackend)
obs = Observable{Union{Nothing, String}}(nothing)
active_tagger = back.active_tagger
on(active_tagger) do tagger
obs[] = isnothing(tagger) ? nothing : tagger.model_instance
end
on(obs) do model_instance
tagger = active_tagger[]
if model_instance != tagger.model_instance
active_tagger[] = back.taggers[tagger.backend][model_instance]
end
end
return obs
end
function connect(back::LTBackend; refresh_rate=0.5, preselect_tagger=false)
while true
resp = HTTP.get("$(back.endpoint)/status")
resp = transcode(String, resp.body)
if resp == "up"
break
else
sleep(refresh_rate)
end
end
listtaggers(back)
#@info "listtaggers" back.taggers back.metadata
if preselect_tagger
back.active_tagging_backend[] = first(collect(keys(back.taggers)))
end
end
function simpleconvert(json::JSON3.Object)
OrderedDict(key => simpleconvert(val) for (key, val) in pairs(json))
end
simpleconvert(json::JSON3.Array) = simpleconvert.(json)
simpleconvert(val) = val
function listtaggers(back::LTBackend)
endpoint = back.endpoint
resp = HTTP.get("$(endpoint)/list-taggers")
resp = transcode(String, resp.body)
json = JSON3.read(resp)
@assert json isa JSON3.Array
taggers = simpleconvert(json)
for tagger in taggers
tagging_backend = tagger[:name]
back.metadata[tagging_backend] = OrderedDict(
key => (key === :models ? Dict(model[:name] => OrderedDict(
key′=> val′ for (key′, val′) in pairs(model) if key′!== :name)
for model in val) : val)
for (key, val) in pairs(tagger) if key !== :name)
back.taggers[tagging_backend] = OrderedDict(
model[:name] => RemoteTagger(endpoint, tagging_backend, model[:name])
for model in tagger[:models])
end
return taggers
end
listmodels(back::LTBackend) = listmodels(back, Val(false))
function listmodels(back::LTBackend, ::Val{false})
[OrderedDict("name" => name,
"description" => get(back.metadata[name], :description, ""),
"homepage" => get(back.metadata[name], :homepage, ""),
) for name in keys(back.taggers)]
end
function listmodels(back::LTBackend, ::Val{true})
map(back.active_tagging_backend) do tagging_backend
models = OrderedDict{String, String}[]
for name in keys(back.taggers[tagging_backend])
metadata = back.metadata[tagging_backend][:models][name]
push!(models, OrderedDict("name" => name,
"description" => get(metadata, :description, ""),
"homepage" => get(metadata, :homepage, ""),
))
end
return models
end
end
function Taggers.predict(back::LTBackend, file::String; metadata=nothing)
tagger = back.active_tagger[]
isnothing(tagger) && throw("no active tagger")
connected(tagger) || connect!(tagger)
Taggers.push(tagger, file, metadata)
Taggers.predict(tagger)
outputfiles = Taggers.pull(tagger, dirname(file))
@assert !isempty(outputfiles)
length(outputfiles) == 1 || @warn "Multiple output files" outputfiles
return outputfiles[1]
end
end
module Model
import ..Taggers: Taggers, Tagger, loadmetadata, apply_make_dataset
import HTTP: HTTP
import JSON3
using OrderedCollections: OrderedDict
export LTBackend, gettoken, resetdata, listfiles, pushfile, pullfile, listtaggers, predict,
embed
# pure part of the Server module (no global state)
struct LTBackend
root
tokens
lock
end
function LTBackend()
root = Ref{AbstractString}("")
tokens = Dict{String, Dict{String, Dict{String, Float64}}}()
lock = ReentrantLock()
LTBackend(root, tokens, lock)
end
Base.lock(f::Function, backend::LTBackend) = lock(f, backend.lock)
Base.isready(backend::LTBackend) = isdir(backend.root[])
get!(dict::AbstractDict{K, V}, key::K) where {K, V} = Base.get!(dict, key, V())
function get!(dict::AbstractDict{K, D}, key1::K, key2, keys...) where {K, D}
get!(get!(dict, key1), key2, keys...)
end
function gettagger(lt_backend, tagging_backend_dir, model_instance)
@assert isready(lt_backend)
tagging_backend_path = joinpath(lt_backend.root[], tagging_backend_dir)
@assert Taggers.isbackend(tagging_backend_path)
lock(lt_backend) do
tagger = Taggers.isolate(Tagger(tagging_backend_path, model_instance))
token = tagger.sandbox
tokens = get!(lt_backend.tokens, tagging_backend_dir, model_instance)
@assert token ∉ keys(tokens)
tokens[token] = time()
return tagger
end
end
function gettagger(lt_backend, tagging_backend_dir, model_instance, token)
@assert isready(lt_backend)
lock(lt_backend) do
tokens = lt_backend.tokens
@assert tagging_backend_dir in keys(tokens)
tokens = lt_backend.tokens[tagging_backend_dir]
@assert model_instance in keys(tokens)
tokens = tokens[model_instance]
@assert token in keys(tokens)
end
tagging_backend_path = joinpath(lt_backend.root[], tagging_backend_dir)
tagger = Tagger(tagging_backend_path, model_instance, token)
return tagger
end
##
function gettoken(lt_backend, backend_dir, model_instance)
tagger = gettagger(lt_backend, backend_dir, model_instance)
return tagger.sandbox
end
function Base.close(lt_backend::LTBackend, backend_dir, model_instance, token)
tagger = gettagger(lt_backend, backend_dir, model_instance, token)
Taggers.removedata(tagger)
lock(lt_backend) do
pop!(lt_backend.tokens[backend_dir][model_instance], token)
end
nothing
end
function resetdata(lt_backend, backend_dir, model_instance, token, datadir=nothing)
tagger = gettagger(lt_backend, backend_dir, model_instance, token)
if isnothing(datadir)
Taggers.resetdata(tagger)
else
Taggers.resetdata(tagger, datadir)
end
nothing
end
function listfiles(lt_backend, backend_dir, model_instance, token, data_dir)
tagger = gettagger(lt_backend, backend_dir, model_instance, token)
dir = Taggers.datadir(tagger, data_dir)
ls = []
for (parent, _, files) in walkdir(dir; follow_symlinks=true)
if parent == dir
append!(ls, files)
else
parent = relpath(parent, dir)
for file in files
push!(ls, joinpath(parent, file))
end
end
end
isempty(ls) ? "[]" : "[\"" * join(ls, "\", \"") * "\"]"
end
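`listfiles` assembles the JSON array by hand. For comparison, `JSON3.write` produces a parse-identical encoding and, unlike the manual concatenation, also escapes quotes and backslashes that might occur in file names (the file names below are made up):

```julia
import JSON3

ls = ["trx.mat", "run2/spine.outline"]
# Manual encoding, as in listfiles above.
manual = isempty(ls) ? "[]" : "[\"" * join(ls, "\", \"") * "\"]"
# Both strings parse to the same array; they differ only in whitespace.
@assert JSON3.read(manual) == ls
@assert JSON3.read(JSON3.write(ls)) == ls
```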
"""
pushfile(::LTBackend, ::HTTP.Request, backend_dir, model_instance, token)
Handle `push-file` requests, *i.e.* receive one or more uploaded files.
"""
function pushfile(lt_backend, request, backend_dir, model_instance, token)
tagger = gettagger(lt_backend, backend_dir, model_instance, token)
@assert request.method == "POST"
body = request.body
@assert body isa Vector{UInt8}
k = findfirst(c -> c in (0x0d, 0x0a), body) - 1
filesep = body[1:k]
linesep = if body[k+2] == 0x0a
@assert body[k+1] == 0x0d
body[k+1:k+2]
else
body[k+1:k+1]
end
filesep = vcat(linesep, filesep)
n = length(filesep)
dk = length(linesep)
bodies = Vector{UInt8}[]
while k < length(body)
i, j = 1, k + dk + 1
for outer k in j:length(body)
if body[k] == filesep[i]
i += 1
if n < i
push!(bodies, body[j:k-n])
break
end
else
i = 1
end
end
end
linesep = transcode(String, linesep)
for body in bodies
k = HTTP.Parsers.find_end_of_header(body)
@assert 0 < k
rawheader = transcode(String, body[1:k-2dk])
content = body[k+1:end]
@debug "push-file" rawheader
header = Dict{Symbol, AbstractString}()
for line in split(rawheader, linesep)
if isempty(header)
for pair in split(line, "; ")
parts = split(pair, '"')
if length(parts) == 1
key, val = split(pair, ": ")
header[Symbol(key)] = val
else
key = parts[1]
@assert endswith(key, '=')
key = Symbol(key[1:end-1])
val = join(parts[2:end-1], '"')
header[key] = val
end
end
else
@assert startswith(line, "Content-Type: ")
key, val = split(line, ": ")
header[Symbol(key)] = val
end
end
@info "push-file" header length(content)
filename = header[:filename]
dest = joinpath(Taggers.datadir(tagger, "raw"), filename)
open(dest, "w") do f
write(f, content)
end
end
end
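The code above recovers the part separator from the first line of the raw multipart body. A synthetic example of that first step, with a made-up boundary and file name (CRLF line breaks, as a browser or HTTP client would send):

```julia
# Build a minimal multipart/form-data body by hand.
boundary = "----FormBoundaryAbc123"  # hypothetical boundary
crlf = "\r\n"
body = Vector{UInt8}(string(
    "--", boundary, crlf,
    "Content-Disposition: form-data; name=\"file\"; filename=\"trx.mat\"", crlf,
    "Content-Type: application/octet-stream", crlf,
    crlf,
    "larva data", crlf,
    "--", boundary, "--", crlf,
))

# As in pushfile above: the bytes before the first CR/LF are the separator.
k = findfirst(c -> c in (0x0d, 0x0a), body) - 1
filesep = body[1:k]
@assert transcode(String, copy(filesep)) == "--" * boundary
```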
function pullfile(lt_backend, backend_dir, model_instance, token, filename)
tagger = gettagger(lt_backend, backend_dir, model_instance, token)
src = joinpath(Taggers.datadir(tagger, "processed"), filename)
header = Dict("Content-Disposition" => "form-data")
body = open(src)
return HTTP.Response(200, header, body)
end
function listtaggers(lt_backend)
inventory = Vector{OrderedDict{String, Any}}()
backends_dir = lt_backend.root[]
for tagging_backend_path in readdir(backends_dir; join=true)
Taggers.isbackend(tagging_backend_path) || continue
models_dir = joinpath(tagging_backend_path, "models")
models = loadmetadata.(readdir(models_dir; join=true))
isempty(models) && continue
tagging_backend = loadmetadata(tagging_backend_path, false)
tagging_backend["models"] = models
push!(inventory, tagging_backend)
end
return JSON3.write(unique(inventory))
end
function predict(lt_backend, backend_dir, model_instance, token)
tagger = gettagger(lt_backend, backend_dir, model_instance, token)
make_dataset = apply_make_dataset(tagger, "predict")
# blocking; should we run async and expose a token-specific status api call?
Taggers.predict(tagger; make_dataset=make_dataset)
end
function embed(lt_backend, backend_dir, model_instance, token)
tagger = gettagger(lt_backend, backend_dir, model_instance, token)
make_dataset = apply_make_dataset(tagger, "embed")
# blocking, like predict
Taggers.embed(tagger; make_dataset=make_dataset)
end
end
module REST
using ..Taggers
include("Server.jl")
include("Client.jl")
end
module Server
using Oxygen; @oxidise
import ..Taggers
include("Model.jl")
using .Model
export run_backend
function run_backend(backend::LTBackend; async=false, port=9285, kwargs...)
@assert isready(backend)
serve(; async=async, port=port, kwargs...)
end
# the Oxygen module has global state; as a consequence, the server must also have global
# state
const lt_backend = LTBackend()
function run_backend(root::AbstractString; kwargs...)
lt_backend.root[] = root
run_backend(; kwargs...)
end
run_backend(; kwargs...) = run_backend(lt_backend; kwargs...)
@get "/status" function(request)
return "up"
end
@get "/get-token/{backend_dir}/{model_instance}" function(
request,
backend_dir::String,
model_instance::String,
)
gettoken(lt_backend, backend_dir, model_instance)
end
@get "/close/{backend_dir}/{model_instance}/{token}" function(
request,
backend_dir::String,
model_instance::String,
token::String,
)
close(lt_backend, backend_dir, model_instance, token)
end
@get "/reset-data/{backend_dir}/{model_instance}/{token}" function(
request,
backend_dir::String,
model_instance::String,
token::String,
)
resetdata(lt_backend, backend_dir, model_instance, token)
end
@get "/reset-data/{backend_dir}/{model_instance}/{token}/{data_dir}" function(
request,
backend_dir::String,
model_instance::String,
token::String,
data_dir::String,
)
resetdata(lt_backend, backend_dir, model_instance, token, data_dir)
end
@get "/list-files/{backend_dir}/{model_instance}/{token}/{data_dir}" function(
request,
backend_dir::String,
model_instance::String,
token::String,
data_dir::String,
)
listfiles(lt_backend, backend_dir, model_instance, token, data_dir)
end
@post "/push-file/{backend_dir}/{model_instance}/{token}" function(
request,
backend_dir::String,
model_instance::String,
token::String,
)
pushfile(lt_backend, request, backend_dir, model_instance, token)
end
@get "/pull-file/{backend_dir}/{model_instance}/{token}/{filename}" function(
request,
backend_dir::String,
model_instance::String,
token::String,
filename::String,
)
pullfile(lt_backend, backend_dir, model_instance, token, filename)
end
@get "/list-taggers" function(request)
listtaggers(lt_backend)
end
@get "/predict/{backend_dir}/{model_instance}/{token}" function(
request,
backend_dir::String,
model_instance::String,
token::String,
)
predict(lt_backend, backend_dir, model_instance, token)
end
@get "/embed/{backend_dir}/{model_instance}/{token}" function(
request,
backend_dir::String,
model_instance::String,
token::String,
)
embed(lt_backend, backend_dir, model_instance, token)
end
end
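For reference, a client is expected to walk the routes above in order: get a token, push files, run prediction, then list, pull and close. The sketch below only constructs the endpoint URLs for one session and issues no request; the host uses the default port from `run_backend`, while the backend, model, token and file names are made up:

```julia
# Endpoint sequence for one tagging session (URL construction only).
function session_urls(host, backend_dir, model_instance, token)
    return [
        "$host/status",
        "$host/get-token/$backend_dir/$model_instance",        # returns the sandbox token
        "$host/push-file/$backend_dir/$model_instance/$token", # POST, multipart body
        "$host/predict/$backend_dir/$model_instance/$token",   # blocking
        "$host/list-files/$backend_dir/$model_instance/$token/processed",
        "$host/pull-file/$backend_dir/$model_instance/$token/predicted.label",
        "$host/close/$backend_dir/$model_instance/$token",
    ]
end

urls = session_urls("http://localhost:9285", "MaggotUBA", "default", "tmp1234")
```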
module Taggers
import PlanarLarvae.Formats, PlanarLarvae.Dataloaders
using OrderedCollections: OrderedDict
using JSON3
export Tagger, isbackend, resetmodel, resetdata, train, predict, finetune, embed,
loadmetadata, apply_make_dataset
struct Tagger
backend_dir::String
@@ -11,14 +14,15 @@ struct Tagger
output_filenames::Dict{String, String}
end
function Tagger(backend_dir::String, model_instance::String,
sandbox::Union{Nothing, String}=nothing)
Tagger(backend_dir, model_instance, sandbox, Dict{String, String}())
end
Tagger(backend_dir, model_instance) = Tagger(string(backend_dir), string(model_instance))
function isolate(tagger)
rawdatadir = joinpath(tagger.backend_dir, "data", "raw")
isdir(rawdatadir) || mkpath(rawdatadir)
rawdatadir = mktempdir(rawdatadir; cleanup=false)
Tagger(tagger.backend_dir, tagger.model_instance, basename(rawdatadir),
tagger.output_filenames)
@@ -50,15 +54,15 @@ tagging_backend_command(tagger::Tagger) = tagging_backend_command(tagger.backend
modeldir(tagger::Tagger) = joinpath(tagger.backend_dir, "models", tagger.model_instance)
datadir(tagger::Tagger, stage) = joinpath(tagger.backend_dir, "data", stage,
something(tagger.sandbox, tagger.model_instance))
function reset(tagger)
resetmodel(tagger)
resetdata(tagger)
end
function reset(dir::AbstractString)
try
rm(dir; recursive=true)
catch
@@ -69,13 +73,13 @@ end
resetmodel(tagger::Tagger) = reset(modeldir(tagger))
function resetdata(tagger)
for dir in ("raw", "interim", "processed")
resetdata(tagger, dir)
end
end
resetdata(tagger::Tagger, dir) = reset(datadir(tagger, dir))
function removedata(tagger::Tagger)
for dir in ("raw", "interim", "processed")
@@ -166,6 +170,72 @@ function push(tagger::Tagger, inputdata::String)
return destination
end
## new implementation for push
function pushfile(tagger, src, dst)
backend_name = basename(realpath(tagger.backend_dir))
@info "Pushing file to backend" backend=backend_name instance=tagger.model_instance src dst
src = normpath(src)
dst = normpath(joinpath(datadir(tagger, "raw"), dst))
if dst != src
dstdir = dirname(dst)
mkpath(dstdir)
open(src, "r") do f
open(dst, "w") do g
write(g, read(f))
end
end
end
return dst
end
pushfile(tagger, src) = pushfile(tagger, src, basename(src))
function pushdir(tagger, src, dst=nothing)
raw = datadir(tagger, "raw")
dst = isnothing(dst) ? raw : joinpath(raw, dst)
symlink(src, dst)
return dst
end
function push(tagger, inputdata::AbstractString)
destination = nothing
if occursin('*', inputdata)
repository = Dataloaders.Repository(inputdata)
for file in Formats.find_associated_files(Dataloaders.files(repository))
srcfile = file.source
dstfile = relpath(srcfile, repository.root)
pushfile(tagger, srcfile, dstfile)
end
elseif isdir(inputdata)
srcdir = realpath(inputdata) # strip the end slashes
resetdata(tagger, "raw")
pushdir(tagger, srcdir)
elseif endswith(inputdata, ".txt")
files_by_dir = Dict{String, Vector{String}}()
for file in readlines(inputdata)
parent = dirname(file)
push!(get!(files_by_dir, parent, String[]), abspath(file))
end
for (dir, files) in pairs(files_by_dir)
for file in Formats.find_associated_files(files)
srcfile = file.source
dstfile = joinpath(dir, basename(srcfile))
pushfile(tagger, srcfile, dstfile)
end
end
else
for file in Formats.find_associated_files(abspath(inputdata))
srcfile = file.source
dstfile = pushfile(tagger, srcfile)
if isnothing(destination)
destination = dstfile
end
end
end
return destination
end
function pull(tagger::Tagger, dest_dir::String)
proc_data_dir = datadir(tagger, "processed")
isdir(proc_data_dir) || throw("no processed data directory found")
@@ -233,7 +303,9 @@ function run(tagger, switch, kwargs)
args = Any[]
parsekwargs!(args, kwargs)
cmd = tagging_backend_command(tagger)
cmd = Cmd(`$cmd $switch $args`; dir=tagger.backend_dir)
@info "Running command" cmd
Base.run(cmd)
end
function train(tagger::Tagger; pretrained_instance=nothing, kwargs...)
@@ -256,4 +328,56 @@ end
embed(tagger::Tagger; kwargs...) = run(tagger, "embed", kwargs)
function loadmetadata(dir, instance=true)
metadata = nothing
for filename in ("metadata", "metadata.json")
if isfile(joinpath(dir, filename))
metadata = JSON3.read(joinpath(dir, filename))
break
end
end
name = basename(dir)
T = if instance
AbstractString
else
Union{AbstractString, Vector{OrderedDict{AbstractString, AbstractString}}}
end
model = OrderedDict{AbstractString, T}(
"name" => name,
"description" => "",
"homepage" => "",
"make_dataset" => "",
)
if !isnothing(metadata)
for key in keys(model)
key′ = Symbol(key)
if haskey(metadata, key′)
model[key] = metadata[key′]
end
end
end
return model
end
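`loadmetadata` overlays a model's metadata file onto a fixed set of defaulted keys (`name`, `description`, `homepage`, `make_dataset`). The overlay step on its own, with made-up metadata values; a plain `Dict` stands in for the `OrderedDict`/JSON3 types used above:

```julia
model = Dict{String, Any}(
    "name" => "default",
    "description" => "",
    "homepage" => "",
    "make_dataset" => "",
)
# JSON3 exposes keys as Symbols, hence the Symbol conversion.
metadata = Dict(:description => "example tagger", :homepage => "https://example.org")
for key in keys(model)
    key′ = Symbol(key)
    if haskey(metadata, key′)
        model[key] = metadata[key′]
    end
end
```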
function loadmetadata(tagger::Tagger, instance=true)
if instance
loadmetadata(modeldir(tagger))
else
loadmetadata(tagger.backend_dir, false)
end
end
function apply_make_dataset(tagger::Tagger, step)
# see recipes/patch.sh for a note about the make_dataset entry
@assert step in ("train", "finetune", "predict", "embed")
metadata = loadmetadata(tagger, false)
make_dataset = metadata["make_dataset"]
apply_make_dataset = step in ("train", "finetune")
if !isempty(make_dataset)
apply_make_dataset = make_dataset == "always" || occursin(step, make_dataset)
@debug "apply_make_dataset" metadata apply_make_dataset
end
return apply_make_dataset
end
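The decision implemented by `apply_make_dataset` can be restated as a pure function: dataset preprocessing defaults to on for `train`/`finetune` only, and a non-empty `make_dataset` entry overrides that default, either with `"always"` or with a string mentioning the step name:

```julia
# Standalone restatement of apply_make_dataset's rule.
function make_dataset_applies(step::AbstractString, entry::AbstractString)
    default = step in ("train", "finetune")
    isempty(entry) && return default
    return entry == "always" || occursin(step, entry)
end
```

For example, an entry of `"train,predict"` turns preprocessing on for prediction as well, while an empty entry keeps the default behaviour.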
end # module
@@ -46,26 +46,40 @@ Backends(controller, location) = Backends(controller, string(location))
function getbackends(controller, location=nothing)
controller = gethub(controller)
if haskey(controller, :backends)
controller[:backends]
else
if !isnothing(location) && startswith(location, "http://")
back = REST.Client.LTBackend(location)
REST.Client.connect(back; preselect_tagger=true)
controller[:backends] = back
else
backends = Backends(controller, location)
Observables.notify(backends.active_backend)
controller[:backends] = backends
end
end
end
get_active_backend(backends::Backends) = backends.active_backend
get_model_instances(backends::Backends) = backends.model_instances
get_model_instance(backends::Backends) = backends.model_instance
get_backend_names(backends::Backends) = backends.backends
get_active_backend(back::REST.Client.LTBackend) = back.active_tagging_backend
get_model_instances(back::REST.Client.LTBackend) = REST.Client.listmodels(back, Val(true))
get_model_instance(back::REST.Client.LTBackend) = REST.Client.active_model_instance(back)
get_backend_names(back::REST.Client.LTBackend) = REST.Client.listmodels(back)
function Taggers.push(model::Backends, file::String; clean=true, metadata=nothing)
tagger = gettagger(model)
clean ? resetdata(tagger) : resetdata(tagger, "raw")
dest_file = Taggers.push(tagger, file)
if !isnothing(metadata)
# save the metadata to file, so that the backend can reproduce them in the output
# file for predicted labels
dest_file = joinpath(dirname(dest_file), "metadata")
PlanarLarvae.Datasets.to_json_file(dest_file, metadata)
end
end
@@ -76,25 +90,32 @@ function Taggers.pull(model::Backends, destdir::String)
return Taggers.pull(tagger, destdir)
end
function Taggers.predict(model::Backends, file::String; metadata=nothing)
backend_dir = joinpath(model.location, model.active_backend[])
model_instance = model.model_instance[]
isnothing(model_instance) && throw("no model instance selected")
tagger = Tagger(backend_dir, model_instance)
Taggers.push(model, file; metadata=metadata)
make_dataset = apply_make_dataset(tagger, "predict")
predict(tagger; make_dataset=make_dataset)
labelfile = Taggers.pull(model, dirname(file))
@assert length(labelfile) == 1
return labelfile[1]
end
function Taggers.predict(controller::ControllerHub, back)
inputfile = controller[:input][]
isnothing(inputfile) && throw("no loaded files")
@assert ispath(inputfile)
metadata = asdict(Observables.to_value(getmetadatatable(controller)))
turn_load_animation_on(controller)
try
resultingfile = predict(back, inputfile; metadata=metadata)
tryopenfile(controller, resultingfile; reload=true)
catch
turn_load_animation_off(controller)
rethrow()
end
end