Developed for UNICEF

Training data

Training data validation

Five expert mappers from the DevSeed Data Team reviewed that dataset and compared it to the DG Vivid RGB imagery. The Data Team classified each location into one of three categories: 1) the overhead imagery clearly contains a school ('confirmed'), 2) the overhead imagery clearly does not contain a school ('not-school'), or 3) it is uncertain whether the overhead imagery contains a school ('unrecognized'). Please refer to “Total Point After Validation” in Table 1.

The ‘YES’ schools are those observed in the high-resolution satellite imagery with very clear school features, e.g. building size, shape, and facilities. Here are some of the school features that were used as criteria for schools and that can be used to label the tiles as “confirmed” schools.

The ‘UNRECOGNIZED’ schools refer to school geolocations that were part of the original country school datasets but showed no clear school features, especially in urban areas with high building density or in rural areas where schools cannot be distinguished from residential buildings. Another case of unrecognized schools is school buildings that cannot be seen on DG Vivid because of cloud or tree cover. These locations were not used in training the model.

The ‘NO’ schools ('not-school') refer to locations from the original country school datasets where the expert mappers could not find any school-like buildings at the provided school geolocations. For example, some of the schools were mislocated in the middle of the ocean, a desert, or dense forest. This can happen because the school geolocation was recorded incorrectly or because the DG Vivid imagery has been updated in particular areas of the selected countries after schools were built.
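
The three validation categories can be applied to a point dataset roughly as follows. This is a minimal sketch, not the Data Team's actual tooling: the record layout, the function names `count_labels` and `drop_unrecognized`, and the sample points are all assumptions made for illustration. Only the label names and the rule that 'unrecognized' locations are excluded from training come from the text above.

```python
from collections import Counter

# Label names follow the three validation categories described above.
# The record layout (dicts with "id", "lat", "lon", "label") is hypothetical.
VALID_LABELS = {"confirmed", "not-school", "unrecognized"}


def count_labels(points):
    """Tally validated points per label, rejecting unexpected labels."""
    counts = Counter(p["label"] for p in points)
    unexpected = set(counts) - VALID_LABELS
    if unexpected:
        raise ValueError(f"unexpected labels: {sorted(unexpected)}")
    return counts


def drop_unrecognized(points):
    """Remove 'unrecognized' locations, which were not used in training."""
    return [p for p in points if p["label"] != "unrecognized"]


# Made-up example points, for illustration only.
points = [
    {"id": 1, "lat": 1.0, "lon": 2.0, "label": "confirmed"},
    {"id": 2, "lat": 3.0, "lon": 4.0, "label": "unrecognized"},
    {"id": 3, "lat": 5.0, "lon": 6.0, "label": "not-school"},
    {"id": 4, "lat": 7.0, "lon": 8.0, "label": "confirmed"},
]

counts = count_labels(points)
usable = drop_unrecognized(points)
```

A per-label tally like `counts` is the kind of summary a "points after validation" column would hold, while `usable` keeps only the locations whose label the reviewers were sure about.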