Deep learning algorithms offer the potential to rapidly extract foundational mapping information directly from satellite imagery. Such algorithms can map regions of interest far faster than human labelers (making them especially valuable for mapping in the immediate wake of natural disasters), but they generally require a large amount of training data to build useful models, and that training data is often expensive and time-consuming to gather. In this session, we explore how model performance depends on the amount of training data available, using a typical building footprint extraction effort as the test case.
Training data sizes spanning nearly three orders of magnitude are drawn from the thousands of square kilometers of labeled imagery in the SpaceNet dataset. We demonstrate that there is a range of training data sizes beyond which additional data yields only limited performance gains. This range provides an important target for future labeling campaigns and deployment efforts. Following best practices for placing error bars on machine learning performance metrics makes the results easier to interpret and compare. Finally, we discuss how performance varies across geographies and deep learning model architectures, considerations that affect the transferability of these results.
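To make the error-bar practice mentioned above concrete, the following is a minimal sketch (not the session's actual code) of one common approach: a percentile bootstrap over per-tile F1 scores from a held-out test set. The function name, the per-tile scores, and the number of resamples are all hypothetical and chosen only for illustration.

```python
import numpy as np

def bootstrap_ci(per_tile_f1, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean F1 score.

    per_tile_f1: 1-D array of F1 scores, one per held-out image tile.
    Returns (mean, lower, upper) for a (1 - alpha) confidence interval.
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(per_tile_f1, dtype=float)
    # Resample tiles with replacement and recompute the mean each time.
    idx = rng.integers(0, len(scores), size=(n_resamples, len(scores)))
    means = scores[idx].mean(axis=1)
    lower, upper = np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return scores.mean(), lower, upper

# Hypothetical per-tile F1 scores for a model trained at one training-set size.
example_scores = np.array([0.61, 0.58, 0.64, 0.70, 0.55, 0.62, 0.67, 0.59])
mean_f1, lo, hi = bootstrap_ci(example_scores)
print(f"F1 = {mean_f1:.3f}  (95% CI: {lo:.3f} - {hi:.3f})")
```

Repeating this at each training-set size in the sweep yields a learning curve with confidence intervals, which is what allows a plateau in performance to be distinguished from noise in the evaluation.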