
`train` does not always drop the "edited" label

Ben Jones reported an inconsistency between --labels and --class-weights: training fails with the error message "not as many weights as labels", as in the following log output:

JULIA_PROJECT=$(realpath ../TaggingBackends) scripts/larvatagger.jl train ../MaggotUBA-adapter/ chore_sample_output/groundtruth.label test --labels roll,not_roll --class-weights 1,2
┌ Info: Pushing file to backend
│   backend = "MaggotUBA-adapter"
│   instance = "test"
└   file = "/home/francois/Projects/Nyx/LarvaTagger/chore_sample_output/groundtruth.label"
┌ Info: Pushing file to backend
│   backend = "MaggotUBA-adapter"
│   instance = "test"
└   file = "/home/francois/Projects/Nyx/LarvaTagger/chore_sample_output/20150701_105504@FCF_attP2_1500062@UAS_Chrimson_Venus_X_0070@t15@r_LED50_30s2x15s30s#n#n#n@100.outline"
┌ Info: Pushing file to backend
│   backend = "MaggotUBA-adapter"
│   instance = "test"
└   file = "/home/francois/Projects/Nyx/LarvaTagger/chore_sample_output/20150701_105504@FCF_attP2_1500062@UAS_Chrimson_Venus_X_0070@t15@r_LED50_30s2x15s30s#n#n#n@100.spine"
WARNING: could not import HDF5.exists into MAT
INFO:main: running make_dataset.py
[ Info: Counting behavior labels in run: test
┌ Info: Sample sizes (observed, selected):
│   roll = (338, 338)
└   not_roll = (110152, 6760)
┌ Info: Explicit inclusions based on label "edited":
└   roll = 338
[ Info: Sampling series of spines in run: test
INFO:main: running train_model.py
Traceback (most recent call last):
  File "/home/francois/.cache/pypoetry/virtualenvs/maggotuba-adapter-z1At8Kzh-py3.8/bin/tagging-backend", line 8, in <module>
    sys.exit(main())
  File "/home/francois/.cache/pypoetry/virtualenvs/maggotuba-adapter-z1At8Kzh-py3.8/lib/python3.8/site-packages/taggingbackends/main.py", line 200, in main
    backend._run_script(backend.train_model, trailing=unknown_args, **train_kwargs)
  File "/home/francois/.cache/pypoetry/virtualenvs/maggotuba-adapter-z1At8Kzh-py3.8/lib/python3.8/site-packages/taggingbackends/explorer.py", line 241, in _run_script
    raise Exception(f"in {path.name}:\n"+"\n".join(lines))
Exception: in train_model.py:
Traceback (most recent call last):
  File "/home/francois/Projects/Nyx/MaggotUBA-adapter/src/maggotuba/models/train_model.py", line 98, in <module>
    main(train_model)
  File "/home/francois/.cache/pypoetry/virtualenvs/maggotuba-adapter-z1At8Kzh-py3.8/lib/python3.8/site-packages/taggingbackends/main.py", line 237, in main
    return fun(backend, **args)
  File "/home/francois/Projects/Nyx/MaggotUBA-adapter/src/maggotuba/models/train_model.py", line 15, in train_model
    dataset = LarvaDataset(larva_dataset_file[0], new_generator(rng_seed),
  File "/home/francois/.cache/pypoetry/virtualenvs/maggotuba-adapter-z1At8Kzh-py3.8/lib/python3.8/site-packages/taggingbackends/data/dataset.py", line 28, in __init__
    self.class_weights = class_weights
  File "/home/francois/.cache/pypoetry/virtualenvs/maggotuba-adapter-z1At8Kzh-py3.8/lib/python3.8/site-packages/taggingbackends/data/dataset.py", line 252, in class_weights
    raise ValueError("not as many weights as labels")
ValueError: not as many weights as labels
ERROR: LoadError: failed process: Process(setenv(`poetry run tagging-backend train --model-instance test --labels roll,not_roll --balancing-strategy auto --include-all edited --class-weights 1,2`; dir="../MaggotUBA-adapter/"), ProcessExited(1)) [1]

The error can be reproduced with an imported label file as training data.

Passing one extra value to --class-weights (three weights for the two labels) does not trigger the error. This suggests that the --labels argument is ignored at some point, and that the "edited" label reappears when the weights are applied.
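The suspected mechanism can be illustrated with a minimal Python sketch. This is not the taggingbackends code: `assign_class_weights`, `drop_labels` and the `stored` label list are hypothetical names, and the sketch assumes the dataset still lists "edited" alongside the labels given in --labels. Only the error message is taken from the traceback.

```python
from typing import Dict, List, Sequence


def assign_class_weights(labels: Sequence[str],
                         weights: Sequence[float]) -> Dict[str, float]:
    """Hypothetical stand-in for the length check seen in the traceback."""
    if len(weights) != len(labels):
        raise ValueError("not as many weights as labels")
    return dict(zip(labels, weights))


def drop_labels(labels: Sequence[str],
                excluded: Sequence[str] = ("edited",)) -> List[str]:
    """Hypothetical helper: remove secondary labels such as "edited"."""
    return [label for label in labels if label not in excluded]


# Assumption: the dataset still carries "edited" in addition to the two
# labels requested with --labels roll,not_roll.
stored = ["roll", "not_roll", "edited"]
weights = [1.0, 2.0]  # --class-weights 1,2

# Two weights against three stored labels reproduces the reported message.
try:
    assign_class_weights(stored, weights)
    mismatch = None
except ValueError as err:
    mismatch = str(err)

# One extra weight silences the error, as observed in the report.
extra = assign_class_weights(stored, [1.0, 2.0, 1.0])

# Dropping "edited" before the check makes the two weights sufficient.
fixed = assign_class_weights(drop_labels(stored), weights)
```

If the hypothesis holds, the fix would be to drop "edited" from the label list before the weights are validated, rather than to relax the length check itself.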