Text normalization is an important component of a Mandarin Text-to-Speech (TTS) system. This paper develops a taxonomy of Non-Standard Words (NSWs) based on a large-scale Chinese corpus and proposes a three-stage text normalization strategy: Finite State Automata (FSA) for initial classification, a Maximum Entropy (ME) classifier and rules for further classification, and general rules for standard word conversion. The three-stage approach achieves a precision of 96.02% in experiments, 5.21% higher than that of a simple rule-based approach and 2.21% higher than that of a simple machine learning method. Experimental results show that the three-stage disambiguation strategy considerably improves text normalization and works well in a real TTS system.
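The three stages can be illustrated with a toy pipeline. This is only a minimal sketch under loud assumptions, not the paper's system: regular expressions stand in for the FSA, a single hand-written context rule stands in for the ME classifier, and the labels (`YEAR`, `DECIMAL`, `INTEGER`, `DIGIT_STRING`, `CARDINAL`) and all function names are hypothetical. Input is assumed to be pre-tokenized, and cardinals are only supported below 10000.

```python
import re

DIGITS = "零一二三四五六七八九"

# Stage 1 (assumption: regexes stand in for the paper's FSA).
COARSE_PATTERNS = [
    ("YEAR", re.compile(r"[0-9]{4}年")),
    ("DECIMAL", re.compile(r"[0-9]+\.[0-9]+")),
    ("INTEGER", re.compile(r"[0-9]+")),
]


def coarse_classify(token):
    """Return a coarse NSW label, or STANDARD for ordinary words."""
    for label, pattern in COARSE_PATTERNS:
        if pattern.fullmatch(token):
            return label
    return "STANDARD"


# Stage 2 (assumption: one context rule stands in for the ME classifier).
def fine_classify(token, label, left_context):
    """Disambiguate INTEGER into a digit string or a cardinal number."""
    if label == "INTEGER":
        if left_context.endswith(("电话", "号码")):
            return "DIGIT_STRING"
        return "CARDINAL"
    return label


# Stage 3: general conversion rules.
def read_digits(text):
    """Read each digit separately, e.g. '2008' -> '二零零八'."""
    return "".join(DIGITS[int(ch)] for ch in text)


def read_cardinal(n):
    """Read an integer 0..9999 as a Mandarin cardinal number."""
    if not 0 <= n < 10000:
        raise ValueError("this sketch only handles 0..9999")
    if n == 0:
        return "零"
    units = ["", "十", "百", "千"]
    digits = str(n)
    out = ""
    pending_zero = False
    for i, ch in enumerate(digits):
        d = int(ch)
        if d == 0:
            pending_zero = True  # emitted only if a nonzero digit follows
            continue
        if pending_zero:
            out += "零"
        pending_zero = False
        out += DIGITS[d] + units[len(digits) - 1 - i]
    # 10..19 are read 十, 十一, ... rather than 一十, 一十一, ...
    if out.startswith("一十"):
        out = out[1:]
    return out


def convert(token, label):
    if label == "YEAR":
        return read_digits(token[:-1]) + "年"
    if label == "DECIMAL":
        whole, frac = token.split(".")
        return read_cardinal(int(whole)) + "点" + read_digits(frac)
    if label == "DIGIT_STRING":
        return read_digits(token)
    if label == "CARDINAL":
        return read_cardinal(int(token))
    return token


def normalize(tokens):
    """Run the three stages over a list of pre-segmented tokens."""
    out = []
    for i, token in enumerate(tokens):
        left = tokens[i - 1] if i > 0 else ""
        label = fine_classify(token, coarse_classify(token), left)
        out.append(convert(token, label))
    return "".join(out)
```

For example, `normalize(["2008年", "有", "110", "人"])` reads the year digit by digit and the count as a cardinal, while `normalize(["电话", "110"])` reads the same token `110` as a digit string because of its left context. That context-dependent choice is the kind of ambiguity the second stage is meant to resolve.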
A major task in ground-based gamma-ray astrophysics analyses is to separate events caused by gamma rays from the overwhelming hadronic cosmic-ray background. In this talk we are interested in improving analyses in the gamma-ray regime below 1 TeV, where gamma/cosmic-ray separation becomes more difficult. Traditionally, in particle sampling arrays the separation has been done by selections on summary variables that capture features distinguishing gamma-ray from cosmic-ray air showers, though the distributions of these variables become more similar at lower energies. The structure of the HAWC observatory, however, makes it natural to interpret the charge deposition collected by the detectors as pixels in an image, which makes it an ideal case for modern deep learning techniques, allowing well-performing classifiers to be built directly from low-level detector information.
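The idea of treating charge deposition as image pixels can be sketched as a simple rasterization step. This is a minimal illustration only, assuming each hit is an `(x, y, charge)` tuple in arbitrary detector coordinates; the function name `hits_to_image`, the grid shape, and the coordinate ranges are hypothetical and do not reflect the actual HAWC layout or the preprocessing used in the talk.

```python
def hits_to_image(hits, x_range, y_range, shape):
    """Sum the charge of (x, y, charge) hits into a rows x cols pixel grid.

    Hits outside the given coordinate ranges are ignored.
    """
    rows, cols = shape
    x_min, x_max = x_range
    y_min, y_max = y_range
    image = [[0.0] * cols for _ in range(rows)]
    for x, y, charge in hits:
        if not (x_min <= x < x_max and y_min <= y < y_max):
            continue
        # Map the position to a pixel index by uniform binning.
        col = int((x - x_min) / (x_max - x_min) * cols)
        row = int((y - y_min) / (y_max - y_min) * rows)
        image[row][col] += charge
    return image


# Hypothetical event: three hits inside the array, one outside.
event = [(0.5, 0.5, 2.0), (0.6, 0.4, 1.0), (3.5, 1.5, 4.0), (10.0, 10.0, 1.0)]
image = hits_to_image(event, x_range=(0.0, 4.0), y_range=(0.0, 2.0), shape=(2, 4))
```

A grid like `image` is the kind of low-level input an image classifier such as a convolutional network could consume, in place of hand-built summary variables.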