This paper proposes a method for classifying the river state (whether a flood risk exists or not) from river surveillance camera images by combining patch-based processing with a convolutional neural network (CNN). Although a CNN needs a large amount of training data, the number of river surveillance camera images is limited because floods do not occur frequently. In addition, river surveillance camera images contain objects that are irrelevant to the flood risk. Therefore, applying a CNN directly may not work well for river state classification. To overcome this limitation, this paper develops patch-based processing for adapting a CNN to river state classification. By increasing the training data through patch segmentation of each image and selecting the patches that are relevant to the river state, general CNNs can be adapted to river state classification. The proposed patch-based processing and the CNN are developed independently. This yields practical merits: any CNN can be chosen according to each user's purpose, and each component of the whole system can be maintained and improved easily. In the experiments, river state classification is defined as the following two problems on two datasets to verify the effectiveness of the proposed method. First, river images from the public Places dataset are classified into images labeled Muddy and images labeled Clear. Second, images from a river surveillance camera in Nagaoka City, Japan are classified into images captured while the government had announced a heavy rain or flood warning and all other images.
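The following is a minimal sketch of the patch-based pipeline described above: an image is segmented into patches, irrelevant patches are filtered out, each remaining patch is scored by a patch classifier, and the scores are aggregated into an image-level label. The patch size, the relevance test, the voting rule, and the stand-in `predict_patch` function are illustrative assumptions; they are not the authors' implementation, and any CNN could be plugged in as the patch classifier.

```python
# Sketch of patch-based river state classification with a generic CNN.
# The CNN is treated as a black box (predict_patch is a placeholder).
import numpy as np

PATCH = 64  # assumed patch size in pixels

def extract_patches(image, size=PATCH):
    """Cut the image into non-overlapping size x size patches."""
    h, w = image.shape[:2]
    return [image[y:y + size, x:x + size]
            for y in range(0, h - size + 1, size)
            for x in range(0, w - size + 1, size)]

def is_relevant(patch):
    """Toy relevance test: keep patches with enough texture (variance),
    standing in for selecting water-surface regions."""
    return patch.var() > 50.0

def predict_patch(patch):
    """Placeholder for any CNN patch classifier returning P(flood risk)."""
    return float(patch.mean() > 128)  # replace with a real CNN forward pass

def classify_river_state(image, threshold=0.5):
    """Aggregate patch-level predictions into an image-level label by voting."""
    votes = [predict_patch(p) for p in extract_patches(image) if is_relevant(p)]
    if not votes:
        return "unknown"
    return "flood-risk" if np.mean(votes) >= threshold else "normal"

# Example: classify a synthetic 256x256 grayscale frame.
frame = np.random.default_rng(0).uniform(0, 255, (256, 256)).astype(np.float32)
print(classify_river_state(frame))
```

Because the patch extraction/selection stage and the CNN are separate functions, either component can be replaced or retrained without touching the other, which is the practical merit claimed in the abstract.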
In this paper, we propose a method for speech understanding using a corpus. First, the method extracts keywords from an N-best list of speech recognizer outputs. This step employs two measures: a confidence measure of the speech recognizer output and an association probability between words. Next, the method uses dependencies between keywords found in a corpus. Finally, it evaluates the relations (tagged labels) of these dependencies. Our method achieved higher accuracy than a method that uses only the confidence measure in the speech understanding module. The results show the effectiveness of the proposed method.
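Below is a minimal sketch of the keyword-extraction step only: words pooled from an N-best list are scored by combining the recognizer confidence with a corpus-derived association probability, and words above a threshold are kept as keywords. The example utterance, the association table, the mixing weight `alpha`, and the threshold are all illustrative assumptions rather than the paper's formulation.

```python
# Keyword selection from an N-best list by combining recognizer confidence
# with a word-association probability (toy values, not the paper's data).

# Each hypothesis: list of (word, confidence) pairs from the recognizer.
nbest = [
    [("book", 0.92), ("a", 0.40), ("flight", 0.88), ("to", 0.35), ("Tokyo", 0.81)],
    [("look", 0.55), ("a", 0.42), ("flight", 0.85), ("to", 0.33), ("Tokyo", 0.79)],
]

# Assumed corpus-derived association probabilities between word pairs.
assoc = {("book", "flight"): 0.30, ("flight", "Tokyo"): 0.25, ("look", "flight"): 0.02}

def keyword_score(word, conf, others, alpha=0.7):
    """Combine recognizer confidence with the best association to other candidates."""
    best_assoc = max(
        [assoc.get((word, o), assoc.get((o, word), 0.0)) for o in others] or [0.0]
    )
    return alpha * conf + (1 - alpha) * best_assoc

def extract_keywords(nbest, threshold=0.5):
    """Pool words over all hypotheses and keep those scoring above a threshold."""
    pooled = {w: c for hyp in nbest for w, c in hyp}
    return [w for w, c in pooled.items()
            if keyword_score(w, c, [o for o in pooled if o != w]) >= threshold]

print(extract_keywords(nbest))  # -> ['book', 'flight', 'Tokyo']
```

The subsequent steps of the method, matching dependencies between the selected keywords against the corpus and evaluating their tagged relations, would operate on the keyword list returned here.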
A large number of studies have addressed the denoising of digital noisy images. In regression filters, a convolution kernel is determined based on the spatial distance or the photometric distance. In non-local means (NLM) filters, pixel-wise calculation of the distance is replaced with a patch-wise one. Later, NLM filters were made adaptive to the local statistics of an image by introducing prior knowledge in a Bayesian framework. Unlike these existing approaches, we introduce prior knowledge not on the local patch, as NLM filters do, but on the noise bias (NB), which has not been utilized so far. Although the mean of the noise is assumed to be zero before tone mapping (TM), it becomes non-zero after TM due to the non-linearity of TM. Exploiting this fact, we propose a new denoising method for tone-mapped noisy images. In this method, the pixels of the noisy image are classified into several subsets according to the observed pixel value, and the pixel values in each subset are compensated based on the prior knowledge so that the NB of each subset becomes close to zero. Experimental results confirm the effectiveness of the proposed method.
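The sketch below illustrates the core idea of noise-bias compensation under stated assumptions: pixels of the tone-mapped noisy image are binned by their observed value, a per-bin bias is estimated, and each bin is shifted so that its bias becomes close to zero. Here the bias is estimated from a simulated clean/noisy pair standing in for the prior knowledge, and the gamma tone-mapping curve and bin count are assumptions, not the paper's actual prior or parameters.

```python
# Noise-bias (NB) compensation sketch for a tone-mapped noisy image.
import numpy as np

def tone_map(x):
    """Toy non-linear tone-mapping curve (gamma); it turns zero-mean noise into biased noise."""
    return 255.0 * (np.clip(x, 0, 255) / 255.0) ** (1 / 2.2)

def estimate_bias(clean, noisy_tm, bins=16):
    """Prior stand-in: mean deviation of tone-mapped noisy pixels from the
    tone-mapped clean pixels, per bin of the observed pixel value."""
    idx = np.clip((noisy_tm / 256.0 * bins).astype(int), 0, bins - 1)
    ref = tone_map(clean)
    bias = np.zeros(bins)
    for b in range(bins):
        mask = idx == b
        if mask.any():
            bias[b] = (noisy_tm[mask] - ref[mask]).mean()
    return bias

def compensate(noisy_tm, bias, bins=16):
    """Subtract the per-bin bias so each subset's noise bias becomes close to zero."""
    idx = np.clip((noisy_tm / 256.0 * bins).astype(int), 0, bins - 1)
    return noisy_tm - bias[idx]

# Example: simulate a clean image, add zero-mean noise, tone-map, then compensate.
rng = np.random.default_rng(0)
clean = rng.uniform(0, 255, (128, 128))
noisy_tm = tone_map(clean + rng.normal(0, 20, clean.shape))
corrected = compensate(noisy_tm, estimate_bias(clean, noisy_tm))
print(abs((noisy_tm - tone_map(clean)).mean()),   # bias before compensation
      abs((corrected - tone_map(clean)).mean()))  # bias after compensation (near zero)
```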