The purpose of human resources (HR) is to assign the right people to the right job at the right time, train and qualify them, and provide evaluation methods that track their performance and safeguard employees' prospective skills. The resulting data are crucial for decision-makers, but extracting the most useful information from such large volumes of data is difficult. With the advent of data mining, HR employees no longer need to handle vast amounts of data manually. The basic purpose of data mining is to extract information from hidden patterns and trends in data in order to obtain near-optimal results. This study compares the performance of three techniques for predicting employee performance. The dataset undergoes preprocessing that includes data cleaning and data compression using principal component analysis (PCA). After preprocessing, training and classification are performed using Artificial Neural Network, Random Forest, and Decision Tree algorithms. The results show that the Artificial Neural Network performed best at predicting employee performance.
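The PCA compression step mentioned above can be illustrated with a minimal sketch. It uses only the Python standard library and finds the leading principal component by power iteration; the feature names and the five records are hypothetical, not taken from the study's dataset, and the classifiers themselves are not shown.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of rows that are already centered."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iterations=200, seed=0):
    """Approximate the leading eigenvector of cov by power iteration."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in cov]
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered, component):
    """Compress each row to a single score along the component."""
    return [sum(a * b for a, b in zip(row, component)) for row in centered]


# Hypothetical employee features: [training_hours, tasks_completed]
employees = [[10.0, 21.0], [20.0, 39.0], [30.0, 62.0], [40.0, 80.0], [50.0, 101.0]]

centered = center(employees)
cov = covariance(centered)
component = top_component(cov)
scores = project(centered, component)

# Share of the total variance retained by the single component.
total_variance = sum(cov[i][i] for i in range(len(cov)))
score_variance = sum(s * s for s in scores) / (len(scores) - 1)
explained = score_variance / total_variance
print(component, explained)
```

In this toy data the two features are almost perfectly correlated, so one component retains nearly all of the variance. On a real dataset, several components would typically be kept and the compressed features passed on to the classifiers.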
Because data warehouses store and update enormous amounts of data from several sources, some of those sources may contain inaccurate data. Because the vast amount of accessible data is noisy, inefficiently handled, and poorly characterized, and because human data cleaning and labeling is correspondingly insensitive and inefficient, the presentation of the data has become ambiguous and the information difficult to assess. This reveals a gap in the development of better data analysis methods, which motivated the creation of a Python script for automatically cleaning and labeling data. The strategy used in this study had four steps. First, a financial dataset was obtained from Kaggle. Second, a machine learning (ML) approach was created in Python to automate cleaning of the financial dataset; this covers ingesting data, handling incomplete data, handling anomalies, one-hot encoding and label encoding, extracting date and time values, and data normalization. Third, an unsupervised ML method (k-means) was implemented to automate labeling of the financial dataset; this includes the elbow method, k-means clustering, data modeling of "age" versus "arrival," dimensionality reduction, computer vision, and categorizing the dataset using the resulting groupings. Fourth, the cleaned and labeled automated trading dataset was assessed empirically by comparing the cleaned dataset before and after applying PCA. The results show that the developed ML technique not only improved the performance of the audit data used in this study, but also classified the data after cleaning it and removing the noisy portions and incomplete data, as shown by the k-means segmentation result and the grouping by PCA.
Summary: A Python script was developed for automatically cleaning and labeling financial data, and the data were then used for machine learning to automate the process.
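A minimal sketch of part of this workflow follows: mean imputation of missing values, min-max normalization, k-means clustering, and the inertia values an elbow plot would use. It relies only on the Python standard library, and the column names and values are hypothetical rather than drawn from the study's Kaggle dataset. It does not reproduce the authors' script, and the encoding, date extraction, and PCA steps are omitted.

```python
import random


def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


def min_max(column):
    """Rescale a column to the [0, 1] range."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0  # avoid division by zero for constant columns
    return [(v - lo) / span for v in column]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm; returns (labels, centroids, inertia)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]

    def assign():
        return [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                for p in points]

    for _ in range(iterations):
        labels = assign()
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    labels = assign()
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia


# Hypothetical raw columns; None marks a missing value.
amount = [120.0, 130.0, 128.0, 125.0, 900.0, 950.0, 920.0, 910.0]
age = [25.0, 27.0, None, 26.0, 52.0, 55.0, None, 53.0]

# Cleaning: impute missing ages, then normalize both columns.
points = [list(p) for p in zip(min_max(amount), min_max(impute_mean(age)))]

# Elbow method: record the inertia for several candidate values of k.
inertia_by_k = {k: kmeans(points, k)[2] for k in range(1, 5)}

# Labeling: use the cluster index from k = 2 as the automatic label.
labels, centroids, _ = kmeans(points, 2)
print(inertia_by_k)
print(labels)
```

The elbow is read from `inertia_by_k` as the value of k after which adding clusters stops reducing the inertia substantially. Cluster indices are arbitrary, so any downstream use should treat them as group identifiers rather than ordered categories.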