Multivariate statistical process control charts are often used in process monitoring to detect out-of-control anomalies. However, multivariate control charts based on conventional statistical distance measures, such as the one used in Hotelling's T² control chart, do not scale to large amounts of complex process data, e.g. data with a large number of variables and a high sampling rate. In our previous work, we developed a multivariate statistical process monitoring procedure based on a more scalable chi-square distance measure and tested it for detecting out-of-control anomalies (intrusions) in a computer process using computer audit data. The test results showed that the scalable chi-square procedure performs comparably to Hotelling's T² control chart. To establish the chi-square procedure as a generic, viable multivariate statistical process monitoring procedure, we are conducting a series of further studies. These studies go beyond the procedure's scalability and its demonstrated performance in computer intrusion detection to examine its detection power and limitations for processes with various kinds of data and various types of out-of-control anomalies. This paper reports on one of these studies, which investigates the effectiveness of the scalable chi-square procedure in detecting out-of-control anomalies in processes with uncorrelated data variables, each of which is normally distributed. The results indicate that the chi-square procedure is at least as effective as Hotelling's T² control chart for monitoring processes with uncorrelated data variables.
Standard multivariate statistical process control (SPC) techniques, such as Hotelling's T², cannot easily handle large-scale, complex process data and often fail to detect out-of-control anomalies in such data. We develop a computationally efficient and scalable Chi-Square (χ²) Distance Monitoring (CSDM) procedure for detecting out-of-control anomalies in large-scale, complex process data, and test its performance on various kinds of process data involving uncorrelated, correlated, auto-correlated, normally distributed, and non-normally distributed data variables. Based on the advantages and disadvantages of the CSDM procedure relative to Hotelling's T² for these kinds of process data, we design a hybrid SPC method that incorporates the CSDM procedure for monitoring large-scale, complex process data.
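To make the scalability contrast concrete, the following is a minimal sketch of the two distance measures, not the authors' implementation. The abstracts do not state the formulas, so both are assumptions here: Hotelling's T² is taken in its usual form (x − x̄)ᵀ S⁻¹ (x − x̄), and the chi-square distance is assumed to be Σᵢ (xᵢ − x̄ᵢ)² / x̄ᵢ, which needs only the in-control mean vector and therefore requires positive means. All function names are hypothetical.

```python
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length observations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def covariance_matrix(rows, mean):
    """Sample covariance matrix (n - 1 denominator); O(p^2) storage."""
    n, p = len(rows), len(mean)
    cov = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - mean[j] for j in range(p)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j]
    return [[c / (n - 1) for c in row] for row in cov]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def hotelling_t2(x, mean, cov):
    """Assumed standard form: (x - mean)^T cov^{-1} (x - mean)."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    y = solve(cov, d)  # O(p^3) work per call in this naive version
    return sum(di * yi for di, yi in zip(d, y))


def chi_square_distance(x, mean):
    """Assumed form: sum_i (x_i - mean_i)^2 / mean_i; O(p), means must be > 0."""
    return sum((xi - mi) ** 2 / mi for xi, mi in zip(x, mean))


# Illustrative in-control data: 5 uncorrelated normal variables (made-up parameters).
rng = random.Random(0)
in_control = [[rng.gauss(10.0, 1.0) for _ in range(5)] for _ in range(500)]
mu = mean_vector(in_control)
S = covariance_matrix(in_control, mu)

shifted = [m + 3.0 for m in mu]  # a hypothetical mean-shift anomaly
print("T2 :", hotelling_t2(shifted, mu, S))
print("X2 :", chi_square_distance(shifted, mu))
```

In this sketch the chi-square score never touches the covariance matrix, which is the source of the scalability advantage the abstracts describe; how control limits are set for either statistic is left out because the abstracts do not specify it.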