This paper presents a novel method for efficient spam mail classification using clustering techniques. E-mail spam is one of the major problems of today's internet, causing financial damage to companies and annoying individual users. Among the approaches developed to stop spam, filtering is an important and popular one. The paper proposes a new spam detection technique that applies text clustering to a vector space model representation of the data. With this method, spam and non-spam email can be separated and spam detected efficiently. Clustering is the technique used for data reduction: it divides the data into groups based on pattern similarities such that each group is abstracted by one or more representatives, and so models the data by its clusters. There is a growing emphasis on exploratory analysis of very large datasets to discover useful patterns, known as data mining. Clustering is a type of classification imposed on a finite set of objects. If the objects are characterized as patterns, or points in an n-dimensional metric space, the proximity measure can be the Euclidean distance between a pair of points, or a similarity in the form of the cosine of the angle between the vectors corresponding to the documents. The paper presents an efficient clustering algorithm that incorporates features of the K-means and BIRCH algorithms. Nearest-neighbour and K-nearest-neighbour distances can serve as the basis for classifying test data under supervised learning. The predictive accuracy of the classifier is calculated for the clustering algorithm. Additionally, different evaluation measures are used to analyze the performance of the clustering algorithm in combination with the various classifiers.
The results presented at the end of the paper show the effectiveness of the proposed method. General Terms: Classification, Data reduction, Vector space model, Preprocessing.
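The ideas in this abstract (vector space representation, cosine similarity, abstracting each group by a representative, and nearest-representative classification) can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not specify how K-means and BIRCH are combined, so the sketch assumes one centroid per labelled group as the representative. The function names, the whitespace tokenizer, and the four toy messages are all hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Term-frequency vector of a message (naive whitespace tokenizer, an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Mean vector of a group: the single representative that abstracts it."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {term: count / len(vectors) for term, count in total.items()}


def build_representatives(labelled_docs):
    """Data reduction step: replace each labelled group by its centroid.

    Assumption: one centroid per label stands in for the clustering step,
    which the abstract does not describe in detail.
    """
    groups = {}
    for text, label in labelled_docs:
        groups.setdefault(label, []).append(tf_vector(text))
    return {label: centroid(vectors) for label, vectors in groups.items()}


def classify(text, representatives):
    """Assign the label of the most similar representative (nearest neighbour)."""
    v = tf_vector(text)
    return max(representatives, key=lambda label: cosine(v, representatives[label]))


# Hypothetical toy training data, for illustration only.
TRAIN = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report and meeting notes", "ham"),
]
REPS = build_representatives(TRAIN)
```

Calling `classify("claim your free money", REPS)` compares the new message against two representatives instead of all four training messages, which is the sense in which clustering acts as data reduction here.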
In this research paper, a novel method of improving the clustered distributed indices for efficient text retrieval using threads is presented. In text retrieval, text search refers to
Regression testing is applied to a modified program to ensure that no new errors are introduced in previously tested code. This testing is computationally intensive and time consuming. Hence, Regression Test Selection (RTS) techniques are used to select the test cases necessary for testing the modified program. The test cases selected from the test suite should potentially identify the same errors as running the complete test suite. The existing RTS techniques address only web services and standalone applications. In this paper, we present a technique to select regression test cases for web-based Java applications. The majority of web applications follow Service Oriented Architecture (SOA). The server-side components are often distributed and work independently on different nodes. Testing these components is difficult and time consuming. The distributed and batch nature of such applications leads to major challenges in testing them effectively. We list some of these challenges and present our tool SoRTEA (Selection of Regression Test for Enterprise Application), which addresses a few of them in the selection of regression test cases.
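The general idea of regression test selection can be sketched as follows. The abstract does not describe SoRTEA's selection criterion, so this is a generic modification-based selection under stated assumptions: each test case is mapped to the components it exercises, and a test is selected when it touches at least one changed component. The function name, test names, and component names are hypothetical.

```python
def select_regression_tests(coverage, changed_components):
    """Return the tests that exercise at least one changed component.

    coverage: dict mapping a test name to the set of components it exercises.
    changed_components: iterable of component names modified in the new version.
    """
    changed = set(changed_components)
    return sorted(
        test for test, components in coverage.items()
        if changed & set(components)
    )


# Hypothetical coverage map for a small service-oriented application.
COVERAGE = {
    "test_login": {"AuthService", "SessionStore"},
    "test_checkout": {"CartService", "PaymentService"},
    "test_profile": {"AuthService", "ProfileService"},
}
```

With this map, a change to `AuthService` selects `test_login` and `test_profile` and skips `test_checkout`, so only part of the suite is rerun.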