From a philosophical point of view, the words of a text or a speech do not serve purely informational purposes; they act and react, and they have the power to act on their counterparts. Each word evokes similar or different senses that can influence and interact with the words that follow; it has a vibratory property. It is not the words themselves that have the impact, but the semantic reaction behind them. In this context, we propose a new approach to textual data classification that imitates human altruistic behavior in order to reveal the semantic altruistic stakes of natural language words through statistical, semantic, and distributional analysis. We present the results of a word extraction method that combines a distributional proximity index, a selection coefficient, and a co-occurrence index with respect to the neighborhood.
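To make the idea of a co-occurrence index over a neighborhood more concrete, the following is a minimal sketch that counts how often two words appear within a fixed window of each other. It is not the method described above: the function name `cooccurrence_counts`, the `window` parameter, whitespace tokenization, and the sample sentence are all assumptions introduced for illustration, and the distributional proximity index and selection coefficient are not modeled.

```python
from collections import Counter


def cooccurrence_counts(tokens, window=2):
    """Count unordered word pairs that occur within `window` positions.

    Hypothetical helper: the abstract does not define its co-occurrence
    index, so plain windowed pair counts are used here as a stand-in.
    """
    counts = Counter()
    for i, word in enumerate(tokens):
        # Look only at the following neighbours so each pair of
        # positions is counted exactly once.
        for neighbour in tokens[i + 1 : i + 1 + window]:
            if neighbour != word:
                counts[tuple(sorted((word, neighbour)))] += 1
    return counts


# Assumed toy input, tokenized on whitespace.
tokens = "the bank approved the loan and the bank closed".split()
pairs = cooccurrence_counts(tokens, window=2)
print(pairs.most_common(3))
```

Pairs are stored as sorted tuples so that ("bank", "the") and ("the", "bank") share one counter; a real system would likely normalize these raw counts before combining them with other indices.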
Text indexing aims to take full advantage of textual data to help intelligent programs make relevant decisions. Exploring a large number of textual documents and disclosing the semantic information hidden in unstructured documents, such as texts, requires an effective indexing system. In this paper, we propose a new approach for indexing Arabic texts. Because it is based on semantic proximity and takes into account the contexts contained in each document, we call our method contextual indexing. Several algorithms are used for keyword extraction, each emphasizing a particular criterion; we, however, target the most descriptive keywords for each document. We also propose a new approach for document modeling. We compared the results obtained with our method against those obtained by an indexing system based on a standard statistical method. The experimental results demonstrate the performance of our approach.
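For readers unfamiliar with statistical keyword extraction, the sketch below shows one common baseline, TF-IDF ranking. The abstract does not say which standard statistical method was used for comparison, so TF-IDF is an assumption here, as are the function name `tfidf_keywords`, the `top_k` parameter, the whitespace tokenization (which would not be adequate for real Arabic text), and the toy documents.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Return the `top_k` highest-scoring TF-IDF terms for each document.

    Hypothetical baseline: tf = term count / document length,
    idf = log(number of documents / document frequency).
    """
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenised:
        df.update(set(tokens))

    results = []
    for tokens in tokenised:
        tf = Counter(tokens)
        scores = {
            word: (count / len(tokens)) * math.log(n_docs / df[word])
            for word, count in tf.items()
        }
        # Sort by descending score; break ties alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        results.append(ranked[:top_k])
    return results


# Assumed toy corpus; "banana" occurs in every document, so its idf is 0.
docs = ["apple banana apple", "banana cherry", "banana date date"]
keywords = tfidf_keywords(docs, top_k=1)
print(keywords)
```

A purely frequency-based ranking like this ignores the context in which a term appears, which is the kind of limitation a context-aware indexing method sets out to address.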