Significance: Fraud detection is one of the most prominent applications of the Newcomb–Benford law for significant digits. However, no general theory can exactly anticipate whether this law provides a valid model for genuine, that is, nonfraudulent, empirical observations, whose generating process cannot be known with certainty. Our first aim is therefore to establish conditions for the validity of the Newcomb–Benford law in the field of international trade data, where fraud typically involves huge amounts of money and constitutes a major threat to national budgets. We also provide approximations to the distribution of test statistics when the Newcomb–Benford law does not hold, thus opening the door to the development of statistical procedures with good inferential properties and wide applicability.
The Newcomb-Benford law for digit sequences has recently attracted interest in anti-fraud analysis. However, most of its applications rely either on diagnostic checks of the data or on informal decision rules. We suggest a new way of testing the Newcomb-Benford law that turns out to be particularly attractive for the detection of fraud in customs data collected from international trade. Our approach has two major advantages. The first is that we control the rate of false rejections at each stage of the procedure, as required in anti-fraud applications. The second is that our testing procedure leads to exact significance levels and does not rely on large-sample approximations. Another contribution of our work is the derivation of a simple expression for the digit distribution when the Newcomb-Benford law is violated, and a bound for a chi-squared type of distance between the actual digit distribution and the Newcomb-Benford one.
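As a rough illustration of what a chi-squared type of distance between an observed digit distribution and the Newcomb-Benford one might look like, the following Python sketch restricts attention to the first significant digit, for which the law gives P(D1 = d) = log10(1 + 1/d). The function names (`benford_first_digit_probs`, `first_digit`, `chi_squared_distance`) are hypothetical and chosen for this example; the abstract does not specify the exact distance or the number of digits used, so this is an assumption and not the authors' formulation.

```python
import math
from collections import Counter


def benford_first_digit_probs():
    """Newcomb-Benford first-digit probabilities: P(D1 = d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """First significant digit of a nonzero real number (illustrative helper)."""
    # Scientific notation always starts with the leading significant digit,
    # e.g. 314.0 -> '3.1400000000000000e+02'.
    return int(f"{abs(x):.16e}"[0])


def chi_squared_distance(values):
    """Sum over d of (observed frequency - Benford probability)^2 / Benford probability.

    Assumption: zeros carry no significant digit and are skipped.
    """
    probs = benford_first_digit_probs()
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    return sum((counts.get(d, 0) / n - p) ** 2 / p for d, p in probs.items())
```

Multiplying this distance by the sample size gives the familiar Pearson statistic; the distance itself is zero only when the observed first-digit frequencies match the Newcomb-Benford probabilities exactly.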
Benford's law defines a probability distribution for patterns of significant digits in real numbers. When the law is expected to hold for genuine observations, deviation from it can be taken as evidence of possible data manipulation. We derive results on a transform of the significand function that provide motivation for new tests of conformance to Benford's law exploiting its sum-invariance characterization. We also study the connection between sum invariance of the first digit and the corresponding marginal probability distribution. We approximate the exact distribution of the new test statistics through a computationally efficient Monte Carlo algorithm. We investigate the power of our tests under different alternatives and we point out relevant situations in which they are clearly preferable to the available procedures. Finally, we show the application potential of our approach in the context of fraud detection in international trade.
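The idea of approximating the exact null distribution of a test statistic by Monte Carlo can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the ordinary Pearson chi-squared statistic on first digits as a stand-in, not the sum-invariance statistics proposed in the abstract above, and the names `pearson_statistic` and `monte_carlo_p_value`, as well as the defaults `n_sim` and `seed`, are hypothetical.

```python
import math
import random

# Benford first-digit probabilities for d = 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def pearson_statistic(digits):
    """Pearson chi-squared statistic of first digits against Benford's law."""
    n = len(digits)
    counts = [0] * 9
    for d in digits:
        counts[d - 1] += 1
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, BENFORD))


def monte_carlo_p_value(observed_digits, n_sim=2000, seed=0):
    """Estimate the p-value by simulating the statistic under Benford's law.

    Assumption: observations are independent, so first digits can be drawn
    directly from the Benford first-digit distribution.
    """
    rng = random.Random(seed)
    n = len(observed_digits)
    t_obs = pearson_statistic(observed_digits)
    exceed = 0
    for _ in range(n_sim):
        simulated = rng.choices(range(1, 10), weights=BENFORD, k=n)
        if pearson_statistic(simulated) >= t_obs:
            exceed += 1
    # Adding 1 to numerator and denominator counts the observed sample itself
    # and keeps the estimate strictly positive.
    return (exceed + 1) / (n_sim + 1)
```

Because the reference distribution is simulated at the actual sample size, the resulting p-value does not depend on a large-sample chi-squared approximation; any other statistic could be substituted for `pearson_statistic` without changing the rest of the sketch.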