A video recording of a Wireless Capsule Endoscopy (WCE) examination typically contains more than 55,000 frames, which makes manual visual screening by an experienced gastroenterologist highly time-consuming. In this paper, we propose a novel method of epitomized summarization of WCE videos for efficient visualization to a gastroenterologist. For each short sequence of a WCE video, an epitomized frame is generated. New constraints are introduced into the epitome formulation to achieve the visual quality necessary for manual examination, and an EM algorithm for learning the epitome is derived. First, local context weights are introduced to generate the epitomized frame, which preserves the appearance of all the input patches from the frames of the short sequence. Furthermore, by introducing spatial distributions for the semantic interpretation of image patches into our epitome formulation, we show that it also provides a framework for the semantic description of visual features. This yields an organized visual summarization of the WCE video, in which patches at different positions correspond to different semantic information. Our experiments on real WCE videos show that, using epitomized summarization, the number of frames that have to be examined by the gastroenterologist can be reduced to less than one-tenth of the original frames in the video.
This paper presents a novel multi-level approach for bleeding detection in Wireless Capsule Endoscopy (WCE) images. In the low-level processing, each cell of K×K pixels is characterized by an adaptive color histogram that optimizes the information representation for WCE images. A Neural Network (NN) cell-classifier is trained to classify the cells in an image as bleeding or non-bleeding patches. In the intermediate-level processing, a block covering 3×3 cells is formed. The intermediate-level representation of the block is generated from the low-level classifications of its cells, and so captures the local spatial correlations of the cell classifications. An NN block-classifier is then trained to classify the blocks as bleeding or non-bleeding. In the high-level processing, the low-level cell-based and intermediate-level block-based classifications are fused for the final detection. In this way, our approach combines low-level features from pixels with intermediate-level features from local regions to achieve robust bleeding detection. Experiments on real WCE videos have shown that the proposed multi-level classification method is not only accurate in both the detection and the localization of potential bleeding in WCE images but also robust to complex, noisy local features.
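The intermediate-level and high-level steps described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes per-cell bleeding scores are already available from some cell classifier, that 3×3 blocks slide one cell at a time with zero padding at the border, and that fusion is a simple logical AND of the cell and block decisions; the abstract specifies none of these. The names `block_features`, `detect_bleeding`, and `mean_block_classifier` are hypothetical, and the mean-based block classifier is only a stand-in for the trained NN.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def block_features(cell_scores):
    """Gather the 3x3 neighbourhood of cell scores around every cell.

    Assumption: blocks slide one cell at a time and the border is
    zero-padded. Returns an array of shape (rows, cols, 9).
    """
    padded = np.pad(cell_scores, 1, mode="constant", constant_values=0.0)
    windows = sliding_window_view(padded, (3, 3))  # (rows, cols, 3, 3)
    rows, cols = cell_scores.shape
    return windows.reshape(rows, cols, 9)


def detect_bleeding(cell_scores, block_classifier,
                    cell_thresh=0.5, block_thresh=0.5):
    """Fuse cell-level and block-level decisions (assumed rule: logical AND)."""
    # Low-level decision: threshold the per-cell scores.
    cell_labels = cell_scores >= cell_thresh
    # Intermediate-level decision: score each 3x3 block of cell scores.
    block_scores = block_classifier(block_features(cell_scores))
    block_labels = block_scores >= block_thresh
    # High-level fusion: keep a cell only if both levels agree.
    return cell_labels & block_labels


def mean_block_classifier(features):
    """Stand-in for the trained NN block classifier: mean neighbourhood score."""
    return features.mean(axis=-1)


# Toy grid of cell scores: a 3x3 bleeding region plus one isolated noisy cell.
cell_scores = np.zeros((6, 6))
cell_scores[1:4, 1:4] = 1.0
cell_scores[5, 5] = 1.0

final = detect_bleeding(cell_scores, mean_block_classifier, block_thresh=0.3)
print(final.astype(int))
```

In this toy grid the isolated cell is rejected because its neighbourhood offers little support, while the contiguous region survives. This is the kind of robustness to local noise that the spatial correlation step is meant to provide.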