Current research in data mining concentrates on developing new techniques for mining high-speed data streams. In most real-world data streams the underlying data generation mechanism changes over time, which introduces concept drift into the data. Networked digital information systems such as mobile devices, streaming services, and remote sensing applications must cope both with the volume of data and with the need to adapt to concept changes in real time. This paper addresses concept drift using real and synthetic data streams and compares ensemble classifiers in the presence of concept drift to assess their performance. Various classifiers were applied to data streams with and without concept drift. The analysis identifies which classifiers perform better for each type of data, whether categorical, numeric, or alphanumeric.
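The comparison of classifiers on streams with and without concept drift can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's setup: the synthetic stream `make_stream`, the toy 1-nearest-neighbour learner, and the window size are hypothetical stand-ins for the real data and the ensemble classifiers the abstract refers to. The sketch uses test-then-train (prequential) evaluation and contrasts a learner that remembers the full history with one that keeps only a recent window.

```python
import random
from collections import deque


def make_stream(n, drift_at, seed=0):
    """Yield (x, y) pairs; the labelling rule flips at index drift_at.

    Hypothetical synthetic stream: before the drift point y = 1 when x > 0.5,
    afterwards the rule is inverted (an abrupt concept drift).
    Set drift_at >= n to get a stream without drift.
    """
    rng = random.Random(seed)
    for i in range(n):
        x = rng.random()
        y = int(x > 0.5) if i < drift_at else int(x <= 0.5)
        yield x, y


def prequential_accuracy(stream, window=None):
    """Test-then-train: predict each example, then learn from it.

    The learner is a toy 1-nearest-neighbour over stored examples.
    window=None keeps the full history; an integer keeps only the
    most recent `window` examples, so old concepts are forgotten.
    """
    memory = deque(maxlen=window)
    correct = total = 0
    for x, y in stream:
        if memory:
            # Label of the stored example closest to x.
            _, pred = min(memory, key=lambda p: abs(p[0] - x))
        else:
            pred = 0  # arbitrary default before anything is learned
        correct += int(pred == y)
        total += 1
        memory.append((x, y))
    return correct / total


if __name__ == "__main__":
    for name, drift_at in [("no drift", 2000), ("drift at 1000", 1000)]:
        full = prequential_accuracy(make_stream(2000, drift_at), window=None)
        recent = prequential_accuracy(make_stream(2000, drift_at), window=100)
        print(f"{name}: full history={full:.3f}, window(100)={recent:.3f}")
```

On the drifting stream the windowed learner should recover after the change, while the full-history learner keeps consulting outdated examples. This is the kind of gap that a with-and-without-drift comparison is meant to expose.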
Information retrieval deals with retrieving, from a large collection, the documents that match a user's information need. Efficient retrieval depends on proper storage of the inverted index, and many techniques have been proposed for reducing its size. Static index pruning is one such technique. This paper investigates a static index pruning approach that removes entire documents from the index based on their importance and their relevance to top-k results. Documents are eliminated on the basis of their individual scores. Experiments were conducted on the FIRE text collection. The results show that, for specific collections, the proposed model gives better precision values when retrieving the top 30 or more documents.
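The idea of pruning whole documents from an inverted index by score can be sketched as follows. The abstract does not give the scoring function, so the one used here (a sum of term-frequency times inverse-document-frequency weights) is purely an assumption, as are the helper names `build_index`, `document_scores`, and `prune_documents` and the `keep_fraction` parameter.

```python
import math
from collections import defaultdict


def build_index(docs):
    """Build a toy inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return dict(index)


def document_scores(index, n_docs):
    """Assumed importance score: sum of tf * idf over a document's terms."""
    scores = defaultdict(float)
    for postings in index.values():
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return dict(scores)


def prune_documents(index, scores, keep_fraction):
    """Keep the highest-scoring documents and drop all postings of the rest."""
    n_keep = max(1, int(len(scores) * keep_fraction))
    kept = set(sorted(scores, key=scores.get, reverse=True)[:n_keep])
    pruned = {}
    for term, postings in index.items():
        remaining = {d: tf for d, tf in postings.items() if d in kept}
        if remaining:  # terms left with no postings disappear from the index
            pruned[term] = remaining
    return pruned, kept


if __name__ == "__main__":
    docs = {
        "d1": "index pruning reduces index size",
        "d2": "the the the",
        "d3": "retrieval of documents from index",
    }
    index = build_index(docs)
    scores = document_scores(index, len(docs))
    pruned, kept = prune_documents(index, scores, keep_fraction=0.7)
    print("kept documents:", sorted(kept))
    print("terms before/after:", len(index), len(pruned))
```

Because the pruning is static, it happens once at indexing time and every later query runs against the smaller index. The trade-off is that a pruned document can never be retrieved, which is why the choice of scoring function matters.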