In this paper we present a two-level method to detect text in natural scene images. In the first level, connected components (CCs) are extracted from the image. Candidate text lines are then extracted, yielding groups of CCs that align in the horizontal or vertical direction. We assume that CCs in these groups have a high probability of being text. To decide which CCs are text, an SVM is trained to make an initial decision, and its output is calibrated to a posterior probability. The posterior probability and the information on whether a CC belongs to a group are then used to divide the CCs into four classes: text, non-text, probable text, and undetermined. In the second level, a conditional random field model makes the final decision. Relationships between CCs are modeled by a graph G(V, E) whose vertices correspond to CCs. The first-level decision influences the second-level decision by assigning different data-term parameters to the four classes of CCs. In this way, the final text/non-text decision uses not only the features of a single CC but also the information on whether that CC is in a group. Experiments show that the method is effective.
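The first-level decision described above can be illustrated with a minimal sketch. The abstract does not state how the SVM output is calibrated or how the four classes are delimited, so everything specific here is an assumption: a Platt-style sigmoid with hypothetical parameters `a` and `b`, hypothetical thresholds `hi` and `lo`, and a hypothetical rule combining the posterior with group membership.

```python
import math


def platt_posterior(svm_score, a=-1.0, b=0.0):
    """Map a raw SVM decision value to a posterior with a sigmoid.

    a and b are hypothetical calibration parameters; in practice they
    would be fitted on held-out data.
    """
    return 1.0 / (1.0 + math.exp(a * svm_score + b))


def first_level_class(posterior, in_group, hi=0.9, lo=0.1):
    """Assign one of four first-level classes to a connected component.

    The thresholds and the rule itself are assumptions made for this
    sketch, not values taken from the paper.
    """
    if posterior >= hi:
        return "text"
    if posterior <= lo and not in_group:
        return "non-text"
    if in_group:
        # Aligned with other CCs, but the SVM alone is not confident.
        return "probable text"
    return "undetermined"


if __name__ == "__main__":
    for score, grouped in [(3.0, False), (-3.0, False), (-3.0, True), (0.0, False)]:
        p = platt_posterior(score)
        print(score, grouped, round(p, 3), first_level_class(p, grouped))
```

In a second level of the kind the abstract describes, each class label would then select a different data-term parameter for the corresponding vertex of the graph.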
Abstract. In recent years, surveillance video systems have become widespread in daily life as an effective means of ensuring public safety. In traditional video coding, the encoding process is complex and computationally expensive, which conflicts with the hardware limitations of monitoring systems. This paper presents a method for designing and training a redundant dictionary based on image regions. Each frame is divided into equal-sized regions made up of equal blocks, and the image regions at the same position are gathered as the training sample library of that region's redundant dictionary. In this way, the volume that the redundant dictionary occupies in the compression result can be reduced. On top of this dictionary, a surveillance video compression coding algorithm based on the regional dictionary is designed and implemented. Experimental results show that the proposed algorithm can effectively improve the compression ratio.

With the growing need for public security and big data analysis, video surveillance systems are increasingly present in daily life [1]. Surveillance systems composed of various cameras are deployed throughout society and serve people's safety. According to a research report by IDC (International Data Corporation), the total amount of global data will reach 40 ZB in 2020, of which surveillance video accounts for 5.8 ZB. China will account for 21% of this, which means that in 2020 China will have 1.2 ZB (1.2 billion TB) of surveillance video data to store, transmit, and analyze. Therefore, given the massive volume of surveillance video and the cost of transmitting and storing it, new research and a major breakthrough in surveillance video encoding and analysis are needed; in particular, encoding technologies with higher compression efficiency must be studied.
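As a rough illustration of the region-based training sample library described in the abstract above, the sketch below groups co-located regions across frames. The function names, the representation of a frame as a list of pixel rows, and the assumption that frame dimensions are exact multiples of the region size are all hypothetical; the further subdivision of each region into blocks and the dictionary training itself are omitted.

```python
def split_into_regions(frame, region_h, region_w):
    """Split a 2-D frame (list of rows) into non-overlapping regions.

    Returns a dict mapping (region_row, region_col) to that region's
    pixels. Assumes the frame size is an exact multiple of the region size.
    """
    height, width = len(frame), len(frame[0])
    regions = {}
    for top in range(0, height, region_h):
        for left in range(0, width, region_w):
            key = (top // region_h, left // region_w)
            regions[key] = [row[left:left + region_w]
                            for row in frame[top:top + region_h]]
    return regions


def build_sample_library(frames, region_h, region_w):
    """Collect, for each region position, the co-located regions of all frames."""
    library = {}
    for frame in frames:
        for key, region in split_into_regions(frame, region_h, region_w).items():
            library.setdefault(key, []).append(region)
    return library


if __name__ == "__main__":
    # Two synthetic 4x4 frames, split into 2x2 regions.
    frames = [[[f * 100 + r * 4 + c for c in range(4)] for r in range(4)]
              for f in range(2)]
    library = build_sample_library(frames, 2, 2)
    for key in sorted(library):
        print(key, library[key])
```

Each entry of `library` would then serve as the training set for one regional dictionary, so the number of dictionaries equals the number of region positions rather than the number of blocks.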
This paper proposes a method for balancing the size of the redundant dictionary against the overall encoding performance of the algorithm. A way to reduce the volume of the redundant dictionary can be found by analysing the compression result. The redundancy that arises when a dictionary is trained for each image block individually can be reduced by improving the original training method. A redundant dictionary trained on a region formed by multiple unit image blocks is applied to all of the image blocks in that region. By exploiting this redundant information, the method reduces the number of redundant dictionaries, further improves training efficiency, and does not cause a significant loss in the encoding efficiency of the compression algorithm. This paper also proposes an algorithm that, combined with a key-frame-based surveillance video compression coding algorithm, builds the common background of the surveillance scene and uses the redundant dictionary to sparsely decompose background image blocks, or image blocks that approximate the background, which are then quantized and compressed accordingly, while foreground image blocks are retained separately. Decoding the fore...
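The sparse decomposition step mentioned above can be sketched with plain matching pursuit. The text does not say which sparse solver is used, so matching pursuit is a stand-in chosen for brevity; the function names, the flattened-block representation, and the unit-norm atoms are assumptions, and the quantization and entropy coding that would follow are omitted.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, atoms, n_nonzero):
    """Greedy sparse decomposition of `signal` over unit-norm `atoms`.

    Returns (coeffs, residual), where coeffs maps an atom index to its
    accumulated weight. At most n_nonzero greedy steps are taken.
    """
    residual = list(signal)
    coeffs = {}
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        scores = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        if abs(scores[best]) < 1e-12:
            break  # residual is numerically fully explained
        coeffs[best] = coeffs.get(best, 0.0) + scores[best]
        residual = [r - scores[best] * a
                    for r, a in zip(residual, atoms[best])]
    return coeffs, residual


if __name__ == "__main__":
    # Hypothetical 3-sample "block" and a small redundant dictionary:
    # three axis-aligned atoms plus one extra unit-norm atom.
    atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.6, 0.8, 0.0]]
    coeffs, residual = matching_pursuit([3.0, 4.0, 0.0], atoms, n_nonzero=2)
    print(coeffs, residual)
```

Because the dictionary is redundant, a block that matches one atom well can be represented by a single coefficient, which is what makes the subsequent quantization and compression compact.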
Pornographic image/video recognition plays a vital role in network information surveillance and management. In this paper, its key techniques, such as skin detection, key frame extraction, and classifier design, are studied in the compressed domain. First, a skin detection method based on data mining in the compressed domain is proposed, which achieves both higher detection accuracy and higher speed. Then, a cascade scheme for pornographic image recognition based on a selective decision tree ensemble is proposed to improve both the speed and the accuracy of recognition. Finally, a key frame extraction solution for pornographic video in the compressed domain and an approach to pornographic video recognition are discussed.