2023
DOI: 10.3390/app13158766

HBTBD: A Heterogeneous Bitcoin Transaction Behavior Dataset for Anti-Money Laundering

Jialin Song,
Yijun Gu

Abstract: In this paper, we predict money laundering in Bitcoin transactions by leveraging a deep learning framework and incorporating more characteristics of Bitcoin transactions. We produced a dataset containing 46,045 Bitcoin transaction entities and 319,311 Bitcoin wallet addresses associated with them. We aggregated this information to form a heterogeneous graph dataset and propose three metapath representations around transaction entities, which enrich the characteristics of Bitcoin transactions. Then, we designed…

Cited by 4 publications
(4 citation statements)
References 22 publications
“…This reflects the popularity of machine learning in financial data analysis and its ability to Tree is one of the most frequently used algorithms with four appearances, demonstrating its effectiveness in understanding and classifying financial transaction patterns (Alkhalili et al, 2021; Kanamori et al, 2022; Masrom et al, 2023; Ruiz & Angelis, 2022). Furthermore, Graph Convolutional Networks (GCN) and Graph Attention Networks (GAT) appeared three times each, underlining the importance of graph network analysis in suspicious activity detection (Naveed et al, 2023a; Pocher et al, 2022; Song & Gu, 2023). Random Forest, Support Vector Machine, and Gradient Boosted Tree each appeared three times, showing their prevalence and effectiveness in dealing with complex financial data (Alkhalili et al, 2021; Alotibi et al, 2022; Labanca et al, 2022; Masrom et al, 2023; Ruiz & Angelis, 2022; Pocher et al, 2022; Ruchay et al, 2023; Zhang & Trubey, 2019).…”
Section: Results (mentioning)
confidence: 93%
“…The algorithm selection depends on its potential effectiveness in identifying illegal activities and detecting money laundering within financial data (Alotibi et al, 2022). This relates to the complexity of the transaction patterns of each financial institution (Song & Gu, 2023). AI has the capacity to analyze data on a large scale and recognize patterns and suspicious activities.…”
Section: Results (mentioning)
confidence: 99%
“…It would be beneficial to repeat this study on multiple realistic AML databases with various model configurations, to check the performance. Additionally, the inclusion and combination of ideas from other implementations, such as edge-attributes (Johannessen & Jullum, 2023) or variants of GNN transformers (Song & Gu, 2023) could lead to better prediction scores on such datasets.…”
Section: Discussion (mentioning)
confidence: 99%
“…Secondly, unlike previous research (Johannessen & Jullum, 2023; Song & Gu, 2023) that relied on different assumptions, such as edge-feature utilisation, and extends other GNN architectures, the presented approach focuses on developing a heterogeneous GIN that can effectively handle graph-structured data. Moreover, this study employed the publicly available, already graph-enhanced FinCEN files investigation data, which is not subject to legal restrictions or synthetic data limitations.…”
Section: Introduction (mentioning)
confidence: 99%