Credit Risk Models for Financial Fraud Detection

Xia, Huosong; An, Wuyue; Zhang, Zuopeng

doi:10.4018/jdm.321739

Cited by 10 publications

(2 citation statements)

References 65 publications

“…Data imbalance refers to an asymmetrical distribution of data across different classes or categories within a dataset, whereby certain classes are underrepresented compared to others (He & Garcia, 2009). This issue can manifest in diverse contexts, including financial fraud detection (Xia & Zhang, 2023; Al-Shabi, 2019), medical diagnosis (Bridge et al, 2020), sentiment analysis (Al Shamsi & Abdallah, 2022), text analysis (Li et al, 2023), image classification (Tian & Han, 2022), and recommendation systems (Zhang et al, 2019).…”

Section: Contribution to IS Literature (mentioning)

confidence: 99%

Handling Imbalanced Data With Weighted Logistic Regression and Propensity Score Matching methods

Agrawal, Mulgund, Sharman

2024

Journal of Database Management

The adoption of empirical methods for secondary data analysis has witnessed a significant surge in IS research. However, secondary data is often incomplete, skewed, and imbalanced at best. Consequently, there is a growing recognition of the importance of empirical techniques and methodological decisions made to navigate through such issues. However, there is not enough methodological guidance, especially in the form of a worked case study that demonstrates the challenges of imbalanced datasets and offers prescriptive guidance on how to deal with them. Using data on P2P money transfer services, this article presents a running example by analyzing the same dataset using several different methods. It then compares the outcomes of these choices and explicates the rationale behind some decisions such as inclusion and categorization of variables, parameter setting, and model selection. Finally, the article discusses certain regression models such as weighted logistic regression and propensity score matching, and when they should be used.

Handling Imbalanced Data With Weighted Logistic Regression and Propensity Score Matching methods

Agrawal, Mulgund, Sharman

2024

Journal of Database Management

“…In SMOTE, additional minority samples are created along the line segment among the minority samples, although with no indication of any kind to the samples available in the confrontational majority class. SMOTE has been applied in various domains, including finance (Sun et al, 2020), fraud detection (Xia et al, 2023), medical diagnosis (Bokhare et al, 2023; Kamarulzalis et al, 2018), and image classification (Khan & Sheikh, 2023). It has shown promising results in improving the classification accuracy of models in these domains.…”

Section: Synthetic Sampling (mentioning)

confidence: 99%
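
The interpolation step described in the SMOTE statement above can be illustrated with a short sketch. This is not the implementation used by any of the cited papers; the function name `smote_sketch` and the parameters `k` and `seed` are hypothetical, chosen only for illustration. It assumes purely numeric features and at least two minority samples, and, as the statement notes, it never looks at the majority class.

```python
import numpy as np

def smote_sketch(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each lying on the line segment between
    a minority sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)  # cannot have more neighbours than other samples

    # Pairwise Euclidean distances among minority samples only.
    diffs = minority[:, None, :] - minority[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a base sample and one of its neighbours for every new point.
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]

    # Random position along the segment: 0 = base sample, 1 = neighbour.
    gap = rng.random((n_new, 1))
    return minority[base] + gap * (minority[picked] - minority[base])

# Hypothetical toy data: four minority samples on the unit square.
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_sketch(toy_minority, n_new=10, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, it stays inside the region spanned by the minority class. For real work, a maintained implementation such as the `SMOTE` class in the imbalanced-learn package is likely a better choice than this sketch.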