Anomalies are records that behave differently from, and do not conform to, the remaining records in a dataset. Outlier analysis is the task of finding such anomalies. Detecting outliers efficiently is an important problem in many fields of science, medicine and technology. Many methods are available for detecting anomalies in numerical datasets, but only a limited number are available for categorical datasets. In this work, a novel entropy-based method for detecting outliers in categorical data is proposed. The algorithm identifies anomalies from a score computed for each record, called the BAD score, and has great intuitive appeal. It uses the frequency of each value in the dataset. The Greedy method needs k scans of the dataset to find k outliers, whereas the proposed method needs only one scan and calculates the BAD score of each record directly. It avoids the need to supply k as an input and can find any number of outliers directly from the dataset. The AVF method has lower time complexity than other methods such as Greedy, FPOF and FDOD. Greedy has good accuracy compared with other methods such as AVF, and FPOF and FDOD (which are based on frequency patterns of all combinations of values in each record). Our algorithm shows better accuracy than the AVF and Greedy algorithms, and its time complexity comes close to that of AVF. The algorithm has been applied to the Nursery and Bank datasets taken from the "UCI Machine Learning Repository". In this work, it is proposed to extend the Normal distribution [11] and Fuzzy concept [12] to the BAD score [13]; that is, NAVF combined with Fuzzy AVF is applied to the BAD score. Numerical attributes are excluded from the datasets for our analysis. The experimental results show that the method is efficient for outlier detection in categorical datasets.
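The abstract does not define the BAD score, so the sketch below illustrates only the AVF (Attribute Value Frequency) baseline it is compared against, in its commonly described form: each record is scored by the mean frequency of its attribute values, and the lowest-scoring records are the outlier candidates. The function name `avf_scores` and the toy data are assumptions made for illustration and do not come from the paper.

```python
from collections import Counter


def avf_scores(records):
    """Return one AVF score per record: the mean frequency of its attribute values."""
    if not records:
        return []
    n_attrs = len(records[0])
    # One frequency table per attribute (column), filled in one counting pass.
    counts = [Counter() for _ in range(n_attrs)]
    for rec in records:
        for j, value in enumerate(rec):
            counts[j][value] += 1
    # A second pass turns the counts into a score for each record.
    return [
        sum(counts[j][value] for j, value in enumerate(rec)) / n_attrs
        for rec in records
    ]


# Toy categorical data (hypothetical, not taken from the Nursery or Bank datasets).
data = [
    ("red", "small"),
    ("red", "small"),
    ("red", "large"),
    ("blue", "tiny"),
]
scores = avf_scores(data)

# Records with the lowest scores are the most likely outliers.
ranked = sorted(range(len(data)), key=lambda i: scores[i])
```

Here the record `("blue", "tiny")` holds only values that occur once, so it receives the lowest score and is ranked first. How many of the lowest-ranked records to report is the choice that the k input makes in AVF; the abstract states that the proposed method avoids that input.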
Students with different behaviors who join educational institutions can create various problems in class. To bring them onto the right path, mentors should be able to identify such students. Because these students differ in behavior, teaching faculty should not apply the same teaching approach to all students. These students behave abnormally compared with other students and are treated as peculiar students. Student data is mostly of mixed type. This paper presents how these peculiar students are found using data mining techniques; the techniques used are those for categorical attribute data. For the experiments, data was collected from B.Tech students at different colleges using the ILS questionnaire [1]. We have also investigated the relationship between peculiarity and learning styles.