We created a pipeline to boost human mapping speed by 33x when tracing high-voltage (HV) infrastructure. At a high level, we used machine learning to find the satellite imagery tiles most likely to contain HV towers and then passed this information to our Data Team, a group of professional mappers, who traced the HV infrastructure. This strategy was tested in three countries (Pakistan, Nigeria, and Zambia) with a combined land area of approximately 2.7 million km². All edits were made in OpenStreetMap (OSM), which is openly available. Individual changes to OSM are also available in the GitHub repo.
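The tile-triage step described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the `TilePrediction` type, the `select_tiles_for_review` helper, the 0.5 threshold, the zoom level, and the scores are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TilePrediction:
    """ML score for one imagery tile (hypothetical structure)."""
    x: int
    y: int
    z: int
    tower_prob: float  # model's estimated probability that the tile contains an HV tower


def select_tiles_for_review(
    preds: List[TilePrediction], threshold: float = 0.5
) -> List[TilePrediction]:
    """Keep tiles scoring at or above the threshold, most likely first."""
    kept = [p for p in preds if p.tower_prob >= threshold]
    return sorted(kept, key=lambda p: p.tower_prob, reverse=True)


# Made-up scores for three neighboring tiles (illustrative only).
predictions = [
    TilePrediction(x=10, y=20, z=18, tower_prob=0.92),
    TilePrediction(x=10, y=21, z=18, tower_prob=0.08),
    TilePrediction(x=11, y=20, z=18, tower_prob=0.61),
]

# Only the tiles in this queue would be handed to human mappers.
review_queue = select_tiles_for_review(predictions)
for tile in review_queue:
    print(f"{tile.z}/{tile.x}/{tile.y}: {tile.tower_prob:.2f}")
```

The threshold is the main design lever in a sketch like this: lowering it sends more tiles to mappers (fewer missed towers, more review time), while raising it does the opposite.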
Throughout the course of this project, we confirmed that neither humans nor an automated system alone is currently a feasible approach for mapping HV infrastructure. On one hand, professional mappers are very accurate and know when to ask for confirmation on difficult imagery. However, the time required to manually review an entire country is tremendous: we estimated it would have taken our Data Team about 6 months of full-time effort to complete Pakistan alone. On the other hand, ML algorithms can operate with very high throughput and very little oversight once trained. Pakistan required only several days of computation time and a few hours of human effort to monitor the scripts. Nevertheless, our ML results indicated that it would be practically impossible to train an algorithm as accurate as a human. Combining the two in an Intelligence Augmentation (IA) approach leveraged the strengths of both humans and machines.
The IA approach is also prudent in comparison to a pure AI strategy focused on completely replacing humans. By keeping a human in the loop, we can make sure all ML predictions are validated by a professional mapper before the additions are incorporated into OSM. The OSM community is (rightly) skeptical of any method that adds edits without human verification, as unverified edits have led to issues in the past. Therefore, building a workflow that uses the strengths of both is likely the optimal way forward. Future work should focus on improving the machine learning predictions and better incorporating those predictions into mapping editors (perhaps as plugins for standard map editors) so they are widely available to human mappers.
The rest of this discussion section focuses on improvements for future iterations of this mapping pipeline:
- Improving how we handle big data
  - Efficiently downloading and storing large imagery datasets
  - Matching download and prediction speeds on the fly
- Improving the machine learning predictions
  - Detecting HV substations
  - Improving HV tower detection
  - Using additional forms of imagery (like SAR)
- Integrating ML predictions into human mapping workflows more effectively
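
As one possible illustration of "matching download and prediction speeds on the fly", the sketch below uses a bounded queue so that a fast downloader pauses when the predictor falls behind, and vice versa. It is an assumption about how such a stage could be built, not a description of the existing pipeline: `run_pipeline`, `max_buffered`, and the stand-in `download` and `predict` functions are hypothetical.

```python
import queue
import threading
from typing import Any, Callable, Iterable, List, Tuple

_DONE = object()  # sentinel marking the end of the download stream


def run_pipeline(
    tile_ids: Iterable[str],
    download: Callable[[str], Any],
    predict: Callable[[Any], float],
    max_buffered: int = 8,
) -> List[Tuple[str, float]]:
    """Download tiles in a background thread while predicting in the caller."""
    buffer: queue.Queue = queue.Queue(maxsize=max_buffered)
    results: List[Tuple[str, float]] = []

    def producer() -> None:
        try:
            for tile_id in tile_ids:
                # put() blocks while the buffer is full, throttling downloads.
                buffer.put((tile_id, download(tile_id)))
        finally:
            # Always signal completion so the consumer cannot wait forever.
            # A real pipeline would also report download errors to the caller.
            buffer.put(_DONE)

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()
    while True:
        item = buffer.get()  # blocks while the buffer is empty
        if item is _DONE:
            break
        tile_id, image = item
        results.append((tile_id, predict(image)))
    worker.join()
    return results


# Stand-in functions so the sketch runs without imagery or a trained model.
results = run_pipeline(
    ["a", "b", "c"],
    download=lambda tile_id: f"image-{tile_id}",
    predict=lambda image: len(image) / 100,
    max_buffered=2,
)
print(results)
```

Here `max_buffered` caps how many downloaded tiles can wait in memory at once, which also bounds temporary storage when imagery arrives faster than it can be scored.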