We propose a kana-kanji conversion system with input support based on prediction. The system is composed of two parts: prediction of succeeding kanji character strings from typed kana ones, and ordinary kana-kanji conversion. It automatically shows candidates for the kanji character strings that the user intends to input. Our prediction method has the following features: (i) Arbitrary positions in the typed kana character string are regarded as the top of words. (ii) A system dictionary and a user dictionary are used, and each entry in the system dictionary has a certainty factor calculated from the frequency of words in corpora. (iii) Candidates are estimated by certainty factor and usefulness factor, and likely ones with factors greater than thresholds are shown. In our experiments, the proposed system reduced the user's key input operations to 78% of the original ones.
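The three features of the prediction method can be illustrated with a minimal sketch. This is not the authors' implementation: the `Entry` class, the `predict` function, the sample dictionary, the threshold values, and in particular the usefulness formula (the fraction of the reading that the user has not yet typed) are all assumptions made for illustration, since the abstract does not define them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    reading: str      # kana reading of the word
    surface: str      # kanji string shown as a candidate
    certainty: float  # assumed to be derived from corpus frequency


# Hypothetical dictionary contents; the values are invented for this sketch.
DICTIONARY = [
    Entry("とうきょう", "東京", 0.8),
    Entry("とうきょうと", "東京都", 0.4),
    Entry("とうふ", "豆腐", 0.2),
]


def predict(typed, dictionary, certainty_min=0.3, usefulness_min=0.5):
    """Return candidate kanji strings for the word the user is still typing."""
    scored = []
    # Feature (i): every position in the typed kana may be the top of a word.
    for start in range(len(typed)):
        fragment = typed[start:]
        for entry in dictionary:
            # Only entries that the fragment begins but does not complete.
            if not entry.reading.startswith(fragment):
                continue
            if len(entry.reading) <= len(fragment):
                continue
            # Assumed usefulness: share of the reading still left to type.
            usefulness = 1 - len(fragment) / len(entry.reading)
            # Feature (iii): show only candidates above both thresholds.
            if entry.certainty > certainty_min and usefulness > usefulness_min:
                scored.append((entry.certainty * usefulness, entry.surface))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [surface for _, surface in scored]


candidates = predict("きょうはとう", DICTIONARY)
```

With the sample dictionary, only the trailing fragment `とう` begins a dictionary reading, so `candidates` holds `東京` followed by `東京都`, while `豆腐` is dropped because its certainty falls below the assumed threshold. A user dictionary, as in feature (ii), could be handled by passing a merged list of entries.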
A text mining method using domain-dependent dictionaries can classify text data from various viewpoints. The method uses a key concept dictionary, which stores words and phrases that are important for a domain, and a concept relation dictionary, which is a rule set consisting of combinations of key concepts. These knowledge dictionaries are very important and strongly influence the classification results. However, they have to be generated through trial and error, which makes it difficult to apply the method to many tasks. In this paper, we try to learn a concept relation dictionary automatically. The method extracts key concepts from text data using lexical analysis, generates training examples from the concepts and the classes given by a human expert, and applies the examples to a fuzzy inductive learning algorithm, IDF. Numerical experiments based on 10-fold cross validation on more than 1,000 daily business reports show that the method acquires an appropriate rule set.
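The step that turns a report into a training example can be sketched as follows. This is only an illustration under stated assumptions: the dictionary contents, the function names `extract_concepts` and `make_example`, and the class label are hypothetical, and plain substring matching stands in for the lexical analysis mentioned in the abstract. The fuzzy inductive learner itself is not reproduced here.

```python
# Hypothetical key concept dictionary: attribute -> value -> trigger phrases.
KEY_CONCEPTS = {
    "sales": {
        "good": ["sold well", "sold out"],
        "bad": ["did not sell", "unsold"],
    },
    "display": {
        "stressed": ["featured", "front shelf"],
    },
}


def extract_concepts(report):
    """Find key concepts in a report (substring matching, an assumption)."""
    text = report.lower()
    found = {}
    for attribute, values in KEY_CONCEPTS.items():
        for value, phrases in values.items():
            if any(phrase in text for phrase in phrases):
                found.setdefault(attribute, set()).add(value)
    return found


def make_example(report, expert_class):
    """Pair the extracted concepts with the class given by a human expert."""
    found = extract_concepts(report)
    # Every attribute appears in the example; an empty list means "not found".
    features = {attr: sorted(found.get(attr, ())) for attr in KEY_CONCEPTS}
    return features, expert_class


example = make_example("The new tea was featured and sold well.", "best practice")
```

Here `example` pairs the features `{"sales": ["good"], "display": ["stressed"]}` with the expert's class. A collection of such pairs would form the training set from which an inductive learner derives the concept relation rules.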