In partnership with Global Wildlife Conservation

Data

Training Data Quality

Labeling aerial images posed more challenges than labeling, for example, camera-trap images: objects of interest in aerial survey images are relatively small, and backgrounds and lighting vary dramatically.

Aerial images were captured from 90-110 m (300 to 350 feet) above ground. Each aerial image is 6016 x 4000 pixels, though only a very small portion of it actually contains objects of interest. We tiled each of these larger images into 150 chips of 400 x 400 pixels each. 4 out of 150 chips (about 2.7%) have objects present. Much of the livestock appears in herds, as do some wildlife species, e.g. elephants, buffaloes, wildebeest, and antelope. Without zooming in very closely, and without prior knowledge of wildlife and livestock appearance and habitat, it’s very difficult to label the objects correctly.
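The tiling arithmetic can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the function name `tile_image` is hypothetical, the image is a synthetic blank array, and dropping the 16 leftover pixels along the width (6016 is not a multiple of 400) is an assumption made so that the count comes out to the 150 chips stated above.

```python
import numpy as np

CHIP = 400  # chip edge length in pixels, as stated in the text


def tile_image(image, chip=CHIP):
    """Split an H x W x C array into non-overlapping chip x chip tiles.

    Edge pixels that do not fill a whole chip are dropped (an assumption).
    Returns a list of ((row, col), tile) pairs.
    """
    rows = image.shape[0] // chip
    cols = image.shape[1] // chip
    tiles = []
    for r in range(rows):
        for c in range(cols):
            tile = image[r * chip:(r + 1) * chip, c * chip:(c + 1) * chip]
            tiles.append(((r, c), tile))
    return tiles


# Synthetic stand-in for one 6016 x 4000 (width x height) aerial image.
image = np.zeros((4000, 6016, 3), dtype=np.uint8)
chips = tile_image(image)

print(len(chips))                # 150 (10 rows x 15 columns)
print(round(4 / len(chips), 3))  # 0.027, the share of chips with objects
```

With so few positive chips per image, any labeling error on those chips has an outsized effect on the training set.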

Below is an example image captured during the aerial survey. The objects appear at the bottom of the image, near the plane's shadow. The image contains cows and a human, and all of the objects are small. All the animals were labeled correctly as cows, but the individual cows vary in color, shading, and body angle. This variation can be challenging for computer vision/deep learning models to handle. Furthermore, the training labels of some of these aerial images, 6 image chips of 400 x 400 pixels in total, highlight the quality issues we see throughout the rest of the training dataset and in prior iterations of it.

Missing labels

Missing labels can hurt the performance of both the classifier and the detector. The classifier can end up with false negative samples: image chips that actually contain objects but, because the labels are missing, are treated as “Not-Object”. A neural network is trained to recognize an object’s patterns, spatial features, and colors, as well as the background. Missing labels pollute the training data by suggesting that image patterns associated with a class should not be associated with it.
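A small sketch of how one missing label turns a chip into a false negative. It assumes chip-level labels are derived from bounding boxes in full-image pixel coordinates; the helper `chip_labels` and the box coordinates are hypothetical, chosen only for illustration.

```python
def chip_labels(boxes, rows=10, cols=15, chip=400):
    """Return {(row, col): "Object" or "Not-Object"} for every chip.

    boxes are (xmin, ymin, xmax, ymax) tuples in integer full-image pixels.
    A chip is "Object" if any box overlaps it (an assumed labeling rule).
    """
    labels = {(r, c): "Not-Object" for r in range(rows) for c in range(cols)}
    for xmin, ymin, xmax, ymax in boxes:
        last_row = min((ymax - 1) // chip, rows - 1)
        last_col = min((xmax - 1) // chip, cols - 1)
        for r in range(ymin // chip, last_row + 1):
            for c in range(xmin // chip, last_col + 1):
                labels[(r, c)] = "Object"
    return labels


# Two animals are really present, but the annotator drew only the first box.
true_boxes = [(410, 3650, 450, 3690), (1250, 3700, 1290, 3735)]
annotated_boxes = true_boxes[:1]

truth = chip_labels(true_boxes)
training = chip_labels(annotated_boxes)

# Chips that contain an animal but enter training as "Not-Object".
polluted = [k for k in truth
            if truth[k] == "Object" and training[k] == "Not-Object"]
print(polluted)  # [(9, 3)]
```

The chip at row 9, column 3 shows an animal yet is fed to the classifier as a negative example, which is exactly the pollution described above.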

Mislabeled classes

Mislabeling covers objects that were assigned the wrong class, including cases where multiple kinds of objects are mixed under one class. This adds a lot of noise during model training.

Label duplication

Label duplication is present for both livestock and small to midsize wildlife. During training, the model ends up making more predictions for over-labeled classes. During evaluation, these duplicate labels must be assumed correct when calculating metrics, since there is no efficient way to filter them out without manual editing.
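One possible way to narrow down the manual editing is to flag same-class boxes that overlap heavily, using intersection over union (IoU). The sketch below is an assumption, not part of the workflow described here: the function names, the 0.7 threshold, and the sample boxes are all hypothetical. It only produces candidates for review, since animals standing close together in a herd can legitimately overlap.

```python
from itertools import combinations


def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def flag_duplicates(labels, threshold=0.7):
    """Return index pairs of same-class labels whose IoU >= threshold.

    labels is a list of (class_name, box) tuples. The threshold is an
    assumed value and would need tuning against hand-checked chips.
    """
    pairs = []
    for (i, (ci, bi)), (j, (cj, bj)) in combinations(enumerate(labels), 2):
        if ci == cj and iou(bi, bj) >= threshold:
            pairs.append((i, j))
    return pairs


labels = [
    ("cow", (100, 100, 140, 130)),
    ("cow", (102, 101, 141, 131)),  # near copy of the first box
    ("cow", (150, 100, 190, 130)),  # a separate neighbour in the herd
]
print(flag_duplicates(labels))  # [(0, 1)]
```

A reviewer would still need to confirm each flagged pair before deleting a label.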