Generally speaking, less than 1% of the camera-trap data processed through TrapTagger contains multiple species. Therefore we make a powerful simplifying assumption – that clusters generally only have a single species, and that multiple-species clusters are an exception rather than the rule. This means that we can label data at the cluster level, rather than at the individual sighting or bounding-box level. For the AI, this results in much greater precision, allowing us to ignore spurious classification errors caused by false detections or partially obscured animals. For humans, this means that you only need to look at a single image to label the entire cluster – resulting in orders of magnitude faster annotation.

Under this philosophy, we never used to differentiate which species each bounding box in a cluster belonged to – every box was simply labelled with all of the species in the cluster. This did result in a small overcount of the species in a cluster – for example, if there were 3 impala and 2 lions, the cluster would actually be counted as 5 animals under each label. However, in the early days of the platform the species classifiers either did not exist or were not accurate enough to be relied upon for such precision. Therefore, this always had to be manually corrected through the multi-species differentiation workflow if one wanted accurate results down to the bounding-box level.
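
To make the overcount concrete, here is a small illustrative sketch in Python of the arithmetic in the example above. The function names are hypothetical and are not part of TrapTagger; they simply contrast counting when every box carries every cluster label with counting when each box carries a single label.

```python
from collections import Counter

def counts_with_cluster_labels(num_boxes, cluster_labels):
    # Old behaviour as described above: every box carries every cluster
    # label, so each species is credited with all boxes in the cluster.
    return {label: num_boxes for label in cluster_labels}

def counts_with_box_labels(box_labels):
    # Box-level behaviour: each box carries exactly one species.
    return dict(Counter(box_labels))

# The example from the text: 3 impala and 2 lions in one cluster.
boxes = ["impala"] * 3 + ["lion"] * 2

old_counts = counts_with_cluster_labels(len(boxes), ["impala", "lion"])
new_counts = counts_with_box_labels(boxes)

print(old_counts)  # 5 animals under each label
print(new_counts)  # 3 impala, 2 lions
```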

This update seeks to improve both aspects of this process – the bounding-box level classifications, and the multi-species workflow.

Figure 1: The new and improved multi-species annotation interface.

Sighting/Bounding-Box Classification

Now that our species classifiers are more developed – and more widespread geographically – we can rely on them to generate much more accurate sighting or bounding-box level classifications. We still retain the same tried and tested cluster-level classification rules and processes as before. For most data, this means that clusters can still only be automatically classified as a single species, and any additional species need to be verified by a human. For waterholes, the same relaxed multi-species rules remain in place as covered in the associated update blog post. However, once a cluster has been identified as containing multiple species, the sightings or bounding boxes in that cluster will now each be assigned just a single species based on their AI classification.

This process is handled by a heuristic algorithm that essentially does the following:

  • If the classified species matches one of the cluster labels, then it is simply labelled as that species.
  • If the classified species does not match any of the cluster labels, then it is labelled as the most-similar species. For example, for a bounding box classified as a kudu (an antelope) in a lion and impala cluster, it will be labelled as an impala (another antelope).
  • If no similar species is found, then the box will be assigned the most-prevalent species in the cluster.
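
The three rules above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the platform's actual implementation: the SIMILARITY_GROUPS table, the function name, and the way "most prevalent" is measured (by counting the boxes already resolved by the first two rules) are all hypothetical choices made for this example.

```python
from collections import Counter

# Hypothetical grouping used to decide which species are "similar".
# How TrapTagger actually measures similarity is not described in this post.
SIMILARITY_GROUPS = {
    "kudu": "antelope",
    "impala": "antelope",
    "lion": "big cat",
    "leopard": "big cat",
}

def assign_box_labels(box_classifications, cluster_labels):
    """Give each box exactly one of the cluster's species labels."""
    cluster_labels = list(cluster_labels)
    resolved = []
    for species in box_classifications:
        # Rule 1: the classification matches a cluster label directly.
        if species in cluster_labels:
            resolved.append(species)
            continue
        # Rule 2: otherwise use a cluster label from the same group.
        group = SIMILARITY_GROUPS.get(species)
        similar = [
            label for label in cluster_labels
            if group is not None and SIMILARITY_GROUPS.get(label) == group
        ]
        resolved.append(similar[0] if similar else None)

    # Rule 3: remaining boxes get the most prevalent species. Assumption:
    # prevalence is counted over boxes resolved by rules 1 and 2; if none
    # were resolved, fall back to the first cluster label.
    counts = Counter(label for label in resolved if label is not None)
    fallback = counts.most_common(1)[0][0] if counts else cluster_labels[0]
    return [label if label is not None else fallback for label in resolved]

# A lion and impala cluster with one kudu and one unrecognised classification.
labels = assign_box_labels(
    ["lion", "kudu", "impala", "impala", "zebra"],
    ["lion", "impala"],
)
print(labels)
```

In this sketch the kudu box becomes an impala because both sit in the same hypothetical group, and the zebra box, having no similar cluster species, takes the majority label.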

This will result in more accurate automatic animal-count estimates, but as always, one should double-check these in the multi-species annotation workflow if high precision is required. Additionally, the boxes themselves should be corrected in the sighting (box) correction workflow for all species of interest to ensure accuracy.

Please note: This new heuristic will not be automatically applied to historic datasets so as to ensure data integrity. It will instead be applied to all new annotation sets as well as those that undergo any processing (launching for manual annotation etc.) from now onward.

Multi-Species Workflow

On top of the more accurate automatic bounding-box labelling, which should leave you with far fewer species labels to change in the first place, we have also made a number of improvements to the multi-species annotation interface:

  • With one hotkey press or button click, you can label all animals in the image as one of the cluster species. Beyond the obvious use case where all the animals are one species, this also lets you set all the boxes to the majority species and then only select and change the few minority species.
  • If you find an additional species whilst going through the cluster, all your labels can be found under the other option.
  • You can now select multiple boxes by holding the Ctrl key and clicking all the boxes you’d like to change, and then simply assign their species using these same species buttons and hotkeys.
  • You can skip clusters that are too large, too time consuming, or are unimportant – for example a cluster labelled as containing sheep, goat, herders, and dogs in a wildlife survey.
  • These skipped clusters can then be revisited at a later stage if need be.

This process now allows you to go through each individual image in a cluster and correct the individual box labels much more quickly and efficiently. However, this can be skipped in cases where the counts of the animals in each cluster aren’t needed or do not need to be 100% accurate.