Detection and segmentation of objects in overhead imagery is a challenging task. The variable density, random orientation, small size, and instance-to-instance heterogeneity of objects in overhead imagery call for approaches distinct from existing models designed for natural scene datasets. Though new overhead imagery datasets are being developed, they almost universally comprise a single view taken from directly overhead ("at nadir"), failing to address a critical variable: look angle. By contrast, views vary in real-world overhead imagery, particularly in dynamic scenarios such as natural disasters where first looks are often over 40° off-nadir. This represents an important challenge to computer vision methods, as changing view angle adds distortions, alters resolution, and changes lighting. At present, the impact of these perturbations on algorithmic detection and segmentation of objects is untested. To address this problem, we present an open-source Multi-View Overhead Imagery dataset, termed SpaceNet MVOI, with 27 unique looks from a broad range of viewing angles (−32.5° to 54.0°). Each of these images covers the same 665 km² geographic extent and is annotated with 126,747 building footprint labels, enabling direct assessment of the impact of viewpoint perturbation on model performance. We benchmark multiple leading segmentation and object detection models on: (1) building detection, (2) generalization to unseen viewing angles and resolutions, and (3) sensitivity of building footprint extraction to changes in resolution. We find that state-of-the-art segmentation and object detection models struggle to identify buildings in off-nadir imagery and generalize poorly to unseen views, presenting an important benchmark for exploring the broadly relevant challenge of detecting small, heterogeneous target objects in visually dynamic contexts.
Transfer learning (TL) has proven to be a transformative technology for computer vision (CV) and natural language processing (NLP) applications, offering improved generalization, state-of-the-art performance, and faster training time with less labelled data. As a result, TL has been identified as a key research area in the budding field of radio frequency machine learning (RFML), where deployed environments are constantly changing, data is hard to label, and applications are often safety-critical. TL literature and theory show that TL is generally successful when the source and target domains and tasks are similar, but the term similar is not sufficiently defined. Therefore, quantifying dataset similarity is of importance for analyzing and potentially predicting TL performance, and also has further application in RFML dataset design. This work offers a dataset similarity metric, specifically designed for raw RF datasets, based on expert-defined features and χ² tests, and systematically evaluates the proposed metric using synthetic datasets with carefully curated signal-to-noise ratios (SNRs), frequency offsets (FOs), and modulation types. Results show that the proposed dataset similarity metric intuitively quantifies the notion of similar signal sets, so long as the expert features used to construct the metric are well suited to the application.

INDEX TERMS dataset similarity, deep learning (DL), machine learning (ML), radio frequency machine learning (RFML), transferability, transfer learning (TL)
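The core idea of a feature-based χ² dataset comparison can be sketched as follows. This is a minimal illustration, not the authors' implementation: the choice of expert feature (here, a per-signal SNR estimate in dB), the histogram binning, and the two-sample χ² form are all assumptions made for the example. Smaller statistics indicate more similar feature distributions.

```python
import math
import random

def chi2_statistic(hist_a, hist_b):
    """Two-sample chi-squared statistic between two count histograms.

    The scale factors k_a, k_b normalize for unequal sample sizes,
    as in the standard two-sample chi-squared test.
    """
    total_a, total_b = sum(hist_a), sum(hist_b)
    k_a = math.sqrt(total_b / total_a)
    k_b = math.sqrt(total_a / total_b)
    stat = 0.0
    for a, b in zip(hist_a, hist_b):
        if a + b == 0:
            continue  # skip empty bins (no contribution)
        stat += (k_a * a - k_b * b) ** 2 / (a + b)
    return stat

def dataset_similarity(features_a, features_b, bins=20, lo=-5.0, hi=25.0):
    """Histogram one expert-defined feature (e.g. estimated SNR in dB)
    from each dataset, then compare the histograms with chi-squared."""
    def hist(values):
        h = [0] * bins
        width = (hi - lo) / bins
        for v in values:
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            h[idx] += 1
        return h
    return chi2_statistic(hist(features_a), hist(features_b))

# Hypothetical SNR-estimate features drawn from two synthetic datasets:
random.seed(0)
base = [random.gauss(10, 2) for _ in range(1000)]      # ~10 dB SNR
matched = [random.gauss(10, 2) for _ in range(1000)]   # same distribution
shifted = [random.gauss(18, 2) for _ in range(1000)]   # ~18 dB SNR

print(dataset_similarity(base, matched))   # low statistic: similar sets
print(dataset_similarity(base, shifted))   # high statistic: dissimilar sets
```

In practice the metric would aggregate several such expert features (SNR, FO, modulation-dependent statistics); this sketch shows the comparison for a single feature only.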