Encouraged by the feasibility demonstration that a relatively low-cost grid environment can speed up the processing of continuous text streams of financial news in English, we have attempted to replicate our methods for automatic sentiment analysis in two major languages of the world: Arabic and Chinese. We show that our local grammar approach, developed on an archive of texts in English (an Indo-European language), works equally well on the typologically different Chinese (Sino-Tibetan) and Arabic (Semitic) languages.
This paper discusses a method for extracting conceptual hierarchies from arbitrary domain-specific collections of text. These hierarchies can form a basis for a concept-oriented terminology collection, and hence may be used as the basis for developing knowledge-based systems via ontology editors. This reference to ontology is explored in the context of collections of terms. The method presented uses both statistical and linguistic techniques. The result of such an extraction may be useful in information retrieval, knowledge management, or in the discipline of terminology science itself.
This paper explores the use of texts that are related to an image collection, also known as collateral texts, for building thesauri in specialist domains to aid in image retrieval. Corpus linguistic and information extraction methods are used for identifying key terms and conceptual relationships in specialist texts that may be used for query expansion purposes. The specialist domain context imposes certain constraints on the language used in the texts, which makes the texts computationally more tractable. The effectiveness of such an approach is demonstrated through a prototype system that has been developed for the storage and retrieval of images and texts, applied in the forensic science domain.