Log analysis is an important technique that engineers use to troubleshoot faults in large-scale service-oriented systems. In this study, we propose a novel semi-supervised log-based anomaly detection approach, LogDP, which uses the dependency relationships among log events and the proximity among log sequences to detect anomalies in massive unlabeled log data. LogDP divides log events into dependent and independent events, then learns the normal patterns of dependent events from their dependency relationships and the normal patterns of independent events from proximity. Events that violate any normal pattern are identified as anomalies. By combining dependency and proximity, LogDP achieves high detection accuracy. Extensive experiments on real-world datasets show that LogDP outperforms six state-of-the-art methods.
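To make the dependency/proximity split concrete, the following is a minimal sketch and not the LogDP implementation. It assumes each log sequence has already been turned into a vector of event counts, and it substitutes deliberately simple stand-ins for the learning steps: Pearson correlation to decide whether an event is dependent, a one-variable least-squares line as the dependency pattern, and distance from the training mean as the proximity pattern. The names `fit`, `is_anomalous`, `corr_threshold`, and `slack`, as well as the toy data, are hypothetical.

```python
from math import sqrt


def mean(xs):
    return sum(xs) / len(xs)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either column is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def fit(train, corr_threshold=0.8, slack=0.5):
    """Learn normal patterns from count vectors of normal sequences (assumed input)."""
    n_events = len(train[0])
    cols = [[row[j] for row in train] for j in range(n_events)]
    model = {"dependent": {}, "independent": {}}
    for j in range(n_events):
        # Find the other event whose counts correlate most strongly with event j.
        best_k, best_r = None, 0.0
        for k in range(n_events):
            if k == j:
                continue
            r = pearson(cols[j], cols[k])
            if abs(r) > abs(best_r):
                best_k, best_r = k, r
        if best_k is not None and abs(best_r) >= corr_threshold:
            # Dependent event: predict its count from its partner with a fitted line.
            xs, ys = cols[best_k], cols[j]
            mx, my = mean(xs), mean(ys)
            vx = sum((x - mx) ** 2 for x in xs)
            slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / vx
            intercept = my - slope * mx
            worst = max(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))
            model["dependent"][j] = (best_k, slope, intercept, worst + slack)
        else:
            # Independent event: stay close to the mean count seen in training.
            mu = mean(cols[j])
            worst = max(abs(y - mu) for y in cols[j])
            model["independent"][j] = (mu, worst + slack)
    return model


def is_anomalous(model, row):
    """Flag a sequence that violates any learned normal pattern."""
    for j, (k, slope, intercept, tol) in model["dependent"].items():
        if abs(row[j] - (slope * row[k] + intercept)) > tol:
            return True
    for j, (mu, tol) in model["independent"].items():
        if abs(row[j] - mu) > tol:
            return True
    return False


# Toy data (hypothetical): columns are counts of "open", "close", "heartbeat".
train = [[1, 1, 5], [2, 2, 5], [3, 3, 4], [4, 4, 5], [2, 2, 6], [3, 3, 5]]
model = fit(train)
```

In this toy data the "open" and "close" counts always match, so both are treated as dependent, while "heartbeat" is treated as independent. A sequence such as `[3, 1, 5]` breaks the dependency pattern and `[2, 2, 12]` breaks the proximity pattern, whereas `[10, 10, 5]` is accepted even though its counts are larger than any seen in training, because the open/close relationship still holds.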