In this paper, we propose applying the online-learning clustering algorithm SOStream to fault detection in industrial process operation. To do so, we adapted the original algorithm to prevent fault data from being drawn into clusters of normal data. To evaluate the performance of SOStream, we performed a case study using the Tennessee Eastman Process industrial plant simulator. The analysis considered multivariable monitoring scenarios and obtained results above 90% in accuracy and recall. Thus, the analyzed approach performed well in fault detection. In addition, SOStream has several characteristics that are attractive for real-time fault detection: it is unsupervised, learns online, is data-driven, and is recursive, so it does not require storing previous samples.
Resumo: In this article, we propose using the evolving online clustering algorithm SOStream for fault detection in industrial processes. To this end, the proposed methodology includes an adaptation of the original algorithm that prevents fault data from being drawn into clusters of normal data. To validate the methodology, a case study was carried out using the Tennessee Eastman Process industrial plant simulator. The analysis was performed on a continuous sequence of data, called a data stream, in a multivariable setting, obtaining results above 90% in precision and recall. Because this evolving approach is data-driven and unsupervised, it makes fault detection simpler and computationally cheaper than model-based techniques; it is also recursive, so it does not need to store the history of previous samples.
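The recursive, storage-free behaviour described above can be illustrated with a minimal sketch. This is not the SOStream algorithm itself: the class `OnlineFaultDetector`, the fixed `radius`, and the fault-free `warmup` phase are hypothetical simplifications introduced only for illustration. It shows two ideas from the abstract: centroids are updated recursively from each new sample, and samples flagged as faults are not allowed to move the clusters of normal data.

```python
import math


class OnlineFaultDetector:
    """Illustrative recursive detector; NOT the SOStream algorithm itself."""

    def __init__(self, radius, warmup):
        self.radius = radius    # hypothetical fixed cluster radius
        self.warmup = warmup    # initial samples assumed to be fault-free
        self.centroids = []     # each entry is [centroid, sample_count]
        self.seen = 0

    def _nearest(self, x):
        # Linear search for the closest centroid (Euclidean distance).
        best, best_d = None, math.inf
        for entry in self.centroids:
            d = math.dist(entry[0], x)
            if d < best_d:
                best, best_d = entry, d
        return best, best_d

    def update(self, x):
        """Process one sample; return True if it is flagged as a fault."""
        self.seen += 1
        nearest, d = self._nearest(x)
        if nearest is not None and d <= self.radius:
            # Recursive mean update: only the centroid and a counter are
            # kept, so previous samples never need to be stored.
            nearest[1] += 1
            n = nearest[1]
            nearest[0] = [m + (xi - m) / n for m, xi in zip(nearest[0], x)]
            return False
        if self.seen <= self.warmup:
            # During warm-up, an unmatched sample seeds a new normal cluster.
            self.centroids.append([list(x), 1])
            return False
        # After warm-up, an unmatched sample is flagged as a fault and is
        # NOT used to update or create any cluster of normal data.
        return True
```

For example, after feeding three nearby samples during warm-up, a distant sample such as `[5.0, 5.0]` is flagged and leaves the single normal centroid unchanged.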
This paper proposes a new evolving algorithm for data stream clustering, named Macro SOStream, which learns entirely online and is based on self-organizing density. Macro SOStream builds on the SOStream algorithm but incorporates macroclusters composed of microclusters. While microclusters have spherical shapes, macroclusters can have arbitrary shapes. Moreover, Macro SOStream includes a macrocluster merge functionality specially designed to improve its performance under data drift. To validate our proposal, we compare the performance of Macro SOStream with that of the SOStream and DenStream algorithms on four synthetic datasets, using the Adjusted Rand Index (ARI) as the performance metric. Furthermore, we carry out an exhaustive analysis of how hyperparameter setup influences these algorithms' performance. As a result, Macro SOStream performs well mainly under data drift and when non-spherical clusters are required.
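To see why a union of spherical microclusters can form an arbitrarily shaped macrocluster, consider the following minimal sketch. The grouping rule used here (two microclusters belong to the same macrocluster when their spheres overlap) and the function name `group_microclusters` are assumptions made for illustration; the abstract does not specify how Macro SOStream actually links microclusters.

```python
import math


def group_microclusters(micro):
    """Label each (center, radius) microcluster with a macrocluster id.

    Assumed rule (not taken from the paper): microclusters whose spheres
    overlap are placed in the same macrocluster.
    """
    parent = list(range(len(micro)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(micro)):
        for j in range(i + 1, len(micro)):
            (ci, ri), (cj, rj) = micro[i], micro[j]
            if math.dist(ci, cj) <= ri + rj:  # spheres touch or overlap
                parent[find(i)] = find(j)

    # Map each root to a compact label 0, 1, 2, ... in order of appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(micro))]


# Three overlapping spheres in a row form one elongated macrocluster;
# the distant sphere forms its own.
micro = [((0.0, 0.0), 1.0), ((1.5, 0.0), 1.0), ((3.0, 0.0), 1.0), ((10.0, 10.0), 1.0)]
labels = group_microclusters(micro)  # [0, 0, 0, 1]
```

Note that the first and third spheres do not overlap each other directly; they end up in the same macrocluster only through the chain of overlaps, which is what allows non-spherical shapes.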