Income tax compliance in the digital economy is still a young field of study. Limited collection of income tax has pushed the Inland Revenue Board of Malaysia (IRBM) to develop a solution that improves tax compliance in the digital economy sector, so that taxpayers either report their income voluntarily or face firm action. The ability to diagnose a taxpayer's compliance helps the IRBM collect income tax effectively and generate revenue for the country. However, extracting the necessary knowledge from a large amount of data is challenging, which creates the need for a predictive model that detects taxpayers' compliance levels. This paper proposes descriptive and predictive analytics models for predicting digital economy income tax compliance in Malaysia. We conduct descriptive analytics to explore the data and extract a summary for initial understanding. In the descriptive model, histograms of the data distribution show that the extracted information gives a clear picture of the factors that influence the classification of digital economy tax compliance. In predictive modeling, single and ensemble approaches are employed to find the best model and the important factors contributing to tax non-compliance among digital economy retailers. Based on validation on the training data with seven single classifier algorithms, three performance improvements are established through ensemble classification, namely the wrapper, boosting, and voting methods, together with two techniques involving grid search and evolutionary parameter optimization. The experimental results show that the ensemble method improves on the accuracy of the single classification models, reaching a highest classification accuracy of 87.94%, which exceeds that of the best single classification model.
The knowledge analysis phase learns meaningful features and hidden knowledge that can be used to categorize the taxpayer contexts that could potentially influence the degree of tax compliance in the digital economy. Overall, this collection of information has the potential to help stakeholders make future decisions on tax compliance in the digital economy.
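The voting idea mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the three threshold rules, the feature vectors, and the labels are toy values and do not come from the paper's data or its seven classifiers. The sketch only shows how a hard majority vote can outperform each of its members when their errors fall on different samples.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# A classifier is any function mapping a feature vector to a label.
Classifier = Callable[[Sequence[float]], str]
Sample = Tuple[Sequence[float], str]


def threshold_rule(index: int, cutoff: float = 0.5) -> Classifier:
    """Hypothetical single classifier: looks at one feature only."""
    def predict(x: Sequence[float]) -> str:
        return "non-compliant" if x[index] > cutoff else "compliant"
    return predict


def majority_vote(classifiers: List[Classifier], x: Sequence[float]) -> str:
    """Hard voting: return the label predicted by most classifiers."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


def accuracy(predict: Classifier, samples: List[Sample]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for x, y in samples if predict(x) == y)
    return correct / len(samples)


# Toy data (assumed): each rule is wrong on a different pair of samples.
samples: List[Sample] = [
    ((0.9, 0.9, 0.1), "non-compliant"),
    ((0.9, 0.1, 0.9), "non-compliant"),
    ((0.1, 0.9, 0.9), "non-compliant"),
    ((0.1, 0.1, 0.9), "compliant"),
    ((0.1, 0.9, 0.1), "compliant"),
    ((0.9, 0.1, 0.1), "compliant"),
]

rules = [threshold_rule(i) for i in range(3)]
single_scores = [accuracy(rule, samples) for rule in rules]
ensemble_score = accuracy(lambda x: majority_vote(rules, x), samples)
```

On this toy data each rule classifies 4 of the 6 samples correctly, while the majority vote classifies all 6 correctly. Real classifiers make correlated errors, so the gain is usually much smaller.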
A data stream is a series of data items generated sequentially over time from different sources. Processing such data is very important in many contemporary applications such as sensor networks, RFID technology, mobile computing, and many more. The huge amount of data generated and its frequent changes over short periods make conventional processing methods insufficient. The Sliding Window Model (SWM) was introduced by Datar et al. to handle this problem. The main challenges are avoiding multiple scans of the whole data set, optimizing memory usage, and processing only the most recent tuples. In uncertain data, the number of possible world instances grows exponentially, which makes it highly difficult to answer top-k queries in the shortest amount of time. Following the rule generation and probability theory of this model, a framework is envisaged that sustains the top-k processing algorithm over the SWM until the candidates expire. Based on the literature review, no existing work tackles the issues that arise from top-k query processing over the possible world instances of uncertain data streams within the SWM. The major issues resulting from these scenarios need to be addressed, especially computational redundancy, which contributes to increased computational cost within the SWM. Therefore, the main objective of this research is to propose top-k query processing methods over uncertain data streams in the SWM that utilize the score and the Possible World (PW) setting. In this study, a novel expiration and object indexing method is introduced to address the computational redundancy issues. We believe the proposed method can reduce computational costs by managing the insertion and exit policy for the right tuple candidates within a specified window frame. This research will contribute to the area of computational query processing.
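To make the setting concrete, here is a minimal sketch of top-k querying over a count-based sliding window of uncertain tuples. It is not the expiration and indexing method proposed in the abstract. It assumes that each tuple exists independently with a given probability, that scores are distinct, and that tuples are ranked by their probability of appearing in the top k across all possible worlds. The class and field names are hypothetical. A small dynamic program replaces the enumeration of the exponentially many possible worlds.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Tuple


@dataclass(frozen=True)
class UncertainTuple:
    tid: str
    score: float
    prob: float  # existence probability, assumed independent of other tuples


class SlidingWindowTopK:
    """Count-based sliding window; the oldest tuple expires on overflow."""

    def __init__(self, window_size: int, k: int) -> None:
        self.k = k
        self.window: Deque[UncertainTuple] = deque(maxlen=window_size)

    def insert(self, t: UncertainTuple) -> None:
        # A deque with maxlen drops the oldest element automatically.
        self.window.append(t)

    def topk_probabilities(self) -> List[Tuple[str, float]]:
        """Probability that each tuple in the window is in the top k.

        A tuple is in the top k of a possible world if it exists and
        fewer than k higher-scored tuples exist in that world.
        """
        ranked = sorted(self.window, key=lambda t: t.score, reverse=True)
        # dist[j] = P(exactly j higher-scored tuples exist), for j < k only.
        dist = [1.0] + [0.0] * (self.k - 1)
        result = []
        for t in ranked:
            result.append((t.tid, t.prob * sum(dist)))
            # Fold t into the distribution; mass reaching k is dropped
            # because it can never contribute to a top-k membership again.
            updated = [0.0] * self.k
            for j in range(self.k):
                updated[j] = dist[j] * (1.0 - t.prob)
                if j > 0:
                    updated[j] += dist[j - 1] * t.prob
            dist = updated
        return result

    def query(self) -> List[Tuple[str, float]]:
        """Return the k tuples with the highest top-k probability."""
        probs = self.topk_probabilities()
        return sorted(probs, key=lambda p: p[1], reverse=True)[: self.k]


# Usage: window of 3 tuples, k = 2; tuple "a" expires when "d" arrives.
sw = SlidingWindowTopK(window_size=3, k=2)
for item in [
    UncertainTuple("a", 10.0, 0.5),
    UncertainTuple("b", 8.0, 1.0),
    UncertainTuple("c", 9.0, 0.4),
    UncertainTuple("d", 7.0, 1.0),
]:
    sw.insert(item)
answer = sw.query()
```

This sketch recomputes the ranking from scratch on every query. That is exactly the kind of redundant computation the abstract aims to reduce, so it should be read as a baseline and not as an efficient solution.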