Sylvie Calabretto scite author profile

Machine learning and data mining techniques have been used extensively to detect credit card fraud. However, most studies consider credit card transactions as isolated events and not as a sequence of transactions. In this framework, we model a sequence of credit card transactions from three different perspectives, namely (i) the sequence contains or doesn't contain a fraud, (ii) the sequence is obtained by fixing the cardholder or the payment terminal, and (iii) it is a sequence of spent amounts or of elapsed times between the current and previous transactions. Combinations of the three binary perspectives give eight sets of sequences from the (training) set of transactions. Each one of these sets is modelled with a Hidden Markov Model (HMM). Each HMM associates a likelihood to a transaction given its sequence of previous transactions. These likelihoods are used as additional features in a Random Forest classifier for fraud detection. Our multiple perspectives HMM-based approach offers automated feature engineering to model temporal correlations so as to improve the effectiveness of the classification task, and allows for an increase in the detection of fraudulent transactions when combined with the state-of-the-art expert-based feature engineering strategy for credit card fraud detection. In extension to previous works, we show that this approach goes beyond e-commerce transactions and provides robust feature engineering over different datasets, hyperparameters and classifiers. Moreover, we compare strategies to deal with structural missing values.
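
The split into eight sets of sequences can be sketched as follows. This is a minimal illustration, not the authors' code: the field names (`card_id`, `terminal_id`, `time`, `amount`, `is_fraud`), the function `build_sequence_sets`, and the toy transactions are all assumptions. The first transaction of each sequence has no elapsed time (a structural missing value); this sketch simply drops it, which is only one of several possible strategies.

```python
from collections import defaultdict
from itertools import product

# Hypothetical names for the three binary perspectives.
ENTITY_KEYS = {"cardholder": "card_id", "terminal": "terminal_id"}
SIGNALS = ("amount", "elapsed_time")
LABELS = ("no_fraud", "fraud")


def build_sequence_sets(transactions):
    """Split transactions into 2 x 2 x 2 = 8 sets of sequences."""
    sets = {key: [] for key in product(LABELS, ENTITY_KEYS, SIGNALS)}
    for entity, id_field in ENTITY_KEYS.items():
        # Group the time-ordered transactions by the fixed entity.
        groups = defaultdict(list)
        for tx in sorted(transactions, key=lambda t: t["time"]):
            groups[tx[id_field]].append(tx)
        for txs in groups.values():
            # A sequence is labelled "fraud" if it contains at least one fraud.
            label = "fraud" if any(t["is_fraud"] for t in txs) else "no_fraud"
            amounts = [t["amount"] for t in txs]
            # Elapsed time is undefined for the first transaction; it is dropped here.
            gaps = [b["time"] - a["time"] for a, b in zip(txs, txs[1:])]
            sets[(label, entity, "amount")].append(amounts)
            sets[(label, entity, "elapsed_time")].append(gaps)
    return sets


# Toy data, invented for illustration only.
toy = [
    {"card_id": "c1", "terminal_id": "t1", "time": 0, "amount": 10.0, "is_fraud": False},
    {"card_id": "c1", "terminal_id": "t2", "time": 5, "amount": 20.0, "is_fraud": False},
    {"card_id": "c2", "terminal_id": "t1", "time": 7, "amount": 99.0, "is_fraud": True},
]
sequence_sets = build_sequence_sets(toy)
```

Each of the eight keys in `sequence_sets` would then supply the training sequences for one HMM.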

Multiple perspectives HMM-based feature engineering for credit card fraud detection

Lucas, Portier, Laporte, et al. 2019

Machine learning and data mining techniques have been used extensively to detect credit card fraud. However, most studies consider credit card transactions as isolated events and not as a sequence of transactions. In this article, we model a sequence of credit card transactions from three different perspectives, namely (i) Does the sequence contain a fraud? (ii) Is the sequence obtained by fixing the card-holder or the payment terminal? (iii) Is it a sequence of spent amounts or of elapsed times between the current and previous transactions? Combinations of the three binary perspectives give eight sets of sequences from the (training) set of transactions. Each one of these sets is modelled with a Hidden Markov Model (HMM). Each HMM associates a likelihood to a transaction given its sequence of previous transactions. These likelihoods are used as additional features in a Random Forest classifier for fraud detection. This multiple perspectives HMM-based approach enables automatic feature engineering that models the sequential properties of the dataset with respect to the classification task. This strategy allows for a 15% increase in the precision-recall AUC compared to the state-of-the-art feature engineering strategy for credit card fraud detection.
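
The likelihood step can be illustrated with the scaled forward algorithm for a discrete HMM. This is a minimal sketch under stated assumptions, not the authors' implementation: the two-state parameters, the binning of amounts into symbols 0 (low) and 1 (high), and the names `forward_log_likelihood` and `hmm_features` are hypothetical. In the approach described above there would be eight trained models instead of the single toy model used here.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM."""
    if not obs:
        return 0.0
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by the emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        # Rescale to avoid underflow on long sequences and accumulate the log.
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik


def hmm_features(window, models):
    """One log-likelihood per model, to be appended to the classifier's features."""
    return {
        name: forward_log_likelihood(window, *params)
        for name, params in models.items()
    }


# Toy parameters, invented for illustration (not trained on any data).
START = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.4, 0.6]]
EMIT = [[0.9, 0.1], [0.2, 0.8]]  # columns: symbol 0 = low amount, 1 = high amount

features = hmm_features([0, 1], {"toy_model": (START, TRANS, EMIT)})
```

The resulting values would be concatenated with the other transaction features before training the Random Forest.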

Enhancing Recommendation Diversity using Determinantal Point Processes on Knowledge Graphs

Gan, Nurbakova, Laporte, et al. 2020

Top-N recommendations are widely applied in various real-life domains and keep attracting intense attention from researchers and industry due to available multi-type information, new advances in AI models and a deeper understanding of user satisfaction. While accuracy has been the prevailing issue of the recommendation problem for the last decades, other facets of the problem, namely diversity and explainability, have received much less attention. In this paper, we focus on enhancing the diversity of top-N recommendation while managing the trade-off between accuracy and diversity. To this end, we propose an effective framework, DivKG, leveraging knowledge graph embedding and determinantal point processes (DPP). First, we capture different kinds of relations among users, items and additional entities through a knowledge graph structure. Then, we represent both entities and relations as k-dimensional vectors by optimizing a margin-based loss with all kinds of historical interactions. We use these representations to construct kernel matrices of DPP in order to make top-N diversified predictions. We evaluate our framework on MovieLens datasets coupled with the IMDb dataset. Our empirical results show substantial improvement over the state-of-the-art regarding both accuracy and diversity metrics.
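
How a DPP kernel built from embeddings can favour diverse items is sketched below. The abstract does not specify DivKG's kernel construction or inference procedure, so everything here is an assumption: the kernel is the common quality-times-similarity form `L[i][j] = q_i * cos(e_i, e_j) * q_j`, selection uses a simple greedy approximation of the most likely subset, and the embeddings, quality scores and function names are invented for illustration.

```python
import math


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))


def build_kernel(quality, embeddings):
    """Assumed kernel: L[i][j] = q_i * cos(e_i, e_j) * q_j."""
    n = len(quality)
    return [
        [quality[i] * cosine(embeddings[i], embeddings[j]) * quality[j] for j in range(n)]
        for i in range(n)
    ]


def greedy_dpp(kernel, top_n):
    """Greedily add the item that maximizes the determinant of the selected submatrix."""
    selected = []
    candidates = set(range(len(kernel)))
    while len(selected) < top_n and candidates:
        best, best_det = None, 0.0
        for i in sorted(candidates):
            idx = selected + [i]
            d = det([[kernel[r][c] for c in idx] for r in idx])
            if d > best_det:
                best, best_det = i, d
        if best is None:  # no remaining item adds volume
            break
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy items: 0 and 1 are near-duplicates, 2 is different but lower quality.
embeddings = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
quality = [1.0, 0.9, 0.8]
picked = greedy_dpp(build_kernel(quality, embeddings), top_n=2)
```

Ranking by quality alone would return items 0 and 1; the determinant penalizes their similarity, so the greedy selection returns items 0 and 2 instead.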
