`train` does not always drop the "edited" label
Ben Jones reported an inconsistency between --labels and --class-weights: training fails with the error message "not as many weights as labels", as in the following log output:
JULIA_PROJECT=$(realpath ../TaggingBackends) scripts/larvatagger.jl train ../MaggotUBA-adapter/ chore_sample_output/groundtruth.label test --labels roll,not_roll --class-weights 1,2
┌ Info: Pushing file to backend
│ backend = "MaggotUBA-adapter"
│ instance = "test"
└ file = "/home/francois/Projects/Nyx/LarvaTagger/chore_sample_output/groundtruth.label"
┌ Info: Pushing file to backend
│ backend = "MaggotUBA-adapter"
│ instance = "test"
└ file = "/home/francois/Projects/Nyx/LarvaTagger/chore_sample_output/20150701_105504@FCF_attP2_1500062@UAS_Chrimson_Venus_X_0070@t15@r_LED50_30s2x15s30s#n#n#n@100.outline"
┌ Info: Pushing file to backend
│ backend = "MaggotUBA-adapter"
│ instance = "test"
└ file = "/home/francois/Projects/Nyx/LarvaTagger/chore_sample_output/20150701_105504@FCF_attP2_1500062@UAS_Chrimson_Venus_X_0070@t15@r_LED50_30s2x15s30s#n#n#n@100.spine"
WARNING: could not import HDF5.exists into MAT
INFO:main: running make_dataset.py
[ Info: Counting behavior labels in run: test
┌ Info: Sample sizes (observed, selected):
│ roll = (338, 338)
└ not_roll = (110152, 6760)
┌ Info: Explicit inclusions based on label "edited":
└ roll = 338
[ Info: Sampling series of spines in run: test
INFO:main: running train_model.py
Traceback (most recent call last):
File "/home/francois/.cache/pypoetry/virtualenvs/maggotuba-adapter-z1At8Kzh-py3.8/bin/tagging-backend", line 8, in <module>
sys.exit(main())
File "/home/francois/.cache/pypoetry/virtualenvs/maggotuba-adapter-z1At8Kzh-py3.8/lib/python3.8/site-packages/taggingbackends/main.py", line 200, in main
backend._run_script(backend.train_model, trailing=unknown_args, **train_kwargs)
File "/home/francois/.cache/pypoetry/virtualenvs/maggotuba-adapter-z1At8Kzh-py3.8/lib/python3.8/site-packages/taggingbackends/explorer.py", line 241, in _run_script
raise Exception(f"in {path.name}:\n"+"\n".join(lines))
Exception: in train_model.py:
Traceback (most recent call last):
File "/home/francois/Projects/Nyx/MaggotUBA-adapter/src/maggotuba/models/train_model.py", line 98, in <module>
main(train_model)
File "/home/francois/.cache/pypoetry/virtualenvs/maggotuba-adapter-z1At8Kzh-py3.8/lib/python3.8/site-packages/taggingbackends/main.py", line 237, in main
return fun(backend, **args)
File "/home/francois/Projects/Nyx/MaggotUBA-adapter/src/maggotuba/models/train_model.py", line 15, in train_model
dataset = LarvaDataset(larva_dataset_file[0], new_generator(rng_seed),
File "/home/francois/.cache/pypoetry/virtualenvs/maggotuba-adapter-z1At8Kzh-py3.8/lib/python3.8/site-packages/taggingbackends/data/dataset.py", line 28, in __init__
self.class_weights = class_weights
File "/home/francois/.cache/pypoetry/virtualenvs/maggotuba-adapter-z1At8Kzh-py3.8/lib/python3.8/site-packages/taggingbackends/data/dataset.py", line 252, in class_weights
raise ValueError("not as many weights as labels")
ValueError: not as many weights as labels
ERROR: LoadError: failed process: Process(setenv(`poetry run tagging-backend train --model-instance test --labels roll,not_roll --balancing-strategy auto --include-all edited --class-weights 1,2`; dir="../MaggotUBA-adapter/"), ProcessExited(1)) [1]
The error can be reproduced using an imported label file as training data.
Passing one extra value to --class-weights (three weights for the two requested labels) does not trigger the error. This suggests that the --labels argument is ignored at some stage, and that the "edited" label is counted again when the class weights are applied.
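The suspected mechanism can be illustrated with a minimal Python sketch. This is not the actual code from taggingbackends/data/dataset.py; `check_class_weights` is a hypothetical stand-in for the length check that raises in the traceback, and the presence of "edited" in the label list at that point is the hypothesis being illustrated, not a confirmed fact.

```python
def check_class_weights(labels, class_weights):
    """Hypothetical stand-in for the check that raises in the traceback:
    one weight is expected per label."""
    if len(class_weights) != len(labels):
        raise ValueError("not as many weights as labels")
    return dict(zip(labels, class_weights))


requested_labels = ["roll", "not_roll"]  # from --labels roll,not_roll
weights = [1.0, 2.0]                     # from --class-weights 1,2

# Assumption: by the time weights are applied, the label list still
# (or again) contains "edited", despite --labels.
labels_seen = requested_labels + ["edited"]

# Two weights against three labels: reproduces the reported message.
try:
    check_class_weights(labels_seen, weights)
    outcome = "ok"
except ValueError as err:
    outcome = str(err)

# One extra weight makes the lengths match, so the check passes silently.
with_extra_weight = check_class_weights(labels_seen, weights + [1.0])

# A possible remedy: restrict the label list to the requested labels
# before the weights are checked.
filtered_labels = [label for label in labels_seen if label in requested_labels]
fixed = check_class_weights(filtered_labels, weights)
```

If the hypothesis holds, filtering the stored labels against the --labels argument before assigning the weights would resolve the mismatch. Where exactly that filtering belongs in the backend remains to be determined.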