A consumer photograph, or snapshot, is a medium for conveying to a viewer one's interest in one or more main subjects. A methodology is presented for collecting ground truth data useful for training and evaluating algorithms designed to automatically detect the main subject of a consumer photograph. For a database of 100 images, 16 observers provided polygonal approximations to the image areas that comprise the main subject. Results from all observers are combined to form a truth image that is considered the ideal result of a main subject detector and is analyzed to determine features for main subject detection (MSD). The collected ground truth shows substantial agreement among third-party observers. It also supports conventional wisdom regarding the likely locations of main subjects and the value of "people" detection as a cue for main subject detection. Training data is created from the truth images for an MSD framework involving image segmentation, feature detection, and probabilistic reasoning. A proposed method for generating region-based training data can be used to retrain a reasoning engine as segmentation algorithms improve, without further observer involvement. Although the subject matter of consumer photographs ranges from sweeping landscapes to close portraits, identification of the main subject is a meaningful task.
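The combination of observer polygons into a truth image can be illustrated with a minimal sketch. The abstract does not state the combination rule, so the code below assumes one plausible choice: each pixel receives the fraction of observers whose polygon covers its centre. The names `point_in_polygon` and `truth_image`, and the restriction to one polygon per observer, are assumptions made for illustration only.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def truth_image(width, height, observer_polygons):
    """Per-pixel fraction of observers whose polygon covers the pixel centre.

    Assumption: one polygon per observer, and a simple vote fraction as the
    combination rule (the source does not specify either).
    """
    n_obs = len(observer_polygons)
    truth = [[0.0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            votes = sum(
                point_in_polygon(col + 0.5, row + 0.5, poly)
                for poly in observer_polygons
            )
            truth[row][col] = votes / n_obs
    return truth


# Two hypothetical observers outlining overlapping squares on a 4x4 image.
observers = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],
    [(1, 1), (3, 1), (3, 3), (1, 3)],
]
truth = truth_image(4, 4, observers)
```

Under this assumed rule, values near 1 mark pixels that almost every observer considered part of the main subject, and intermediate values mark areas of disagreement.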
Sky is among the most important subject matter frequently seen in photographic images. We propose a model-based approach consisting of color classification, region extraction, and physics-motivated sky signature validation. First, the color classification is performed by a multilayer backpropagation neural network trained in a bootstrapping fashion to generate a belief map of sky color. Next, the region extraction algorithm automatically determines an appropriate threshold for the sky color belief map and extracts connected components. Finally, the sky signature validation algorithm determines the orientation of a candidate sky region, classifies one-dimensional (1-D) traces within the region based on a physics-motivated model, and computes the sky belief of the region as the percentage of traces that fit the physics-based sky trace model. A small-scale yet rigorous test has been conducted to evaluate the algorithm's performance. With approximately half of the images containing blue sky regions, the detection rate is 96% with a false positive rate of 2% on a per-image basis.
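The region extraction and belief computation steps can be sketched roughly as follows. This is not the authors' implementation: the threshold is passed in by hand rather than determined automatically, 4-connectivity is an assumption, and `fits_sky_model` is a hypothetical callable standing in for the physics-motivated trace classifier, which the abstract does not specify.

```python
from collections import deque


def connected_components(belief, threshold):
    """Return 4-connected regions of pixels whose belief is >= threshold."""
    height, width = len(belief), len(belief[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for r in range(height):
        for c in range(width):
            if seen[r][c] or belief[r][c] < threshold:
                continue
            # Breadth-first flood fill starting from this seed pixel.
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx]
                            and belief[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def region_sky_belief(traces, fits_sky_model):
    """Fraction of 1-D traces accepted by the supplied trace classifier."""
    if not traces:
        return 0.0
    return sum(1 for t in traces if fits_sky_model(t)) / len(traces)


# Toy belief map: one two-pixel region and two isolated pixels above 0.5.
belief_map = [
    [0.9, 0.8, 0.1],
    [0.1, 0.1, 0.1],
    [0.7, 0.1, 0.9],
]
regions = connected_components(belief_map, threshold=0.5)
```

A caller would extract 1-D traces from each candidate region along its estimated orientation and pass them to `region_sky_belief` together with whatever trace model is in use.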