Sequential pattern mining algorithms extract frequently occurring sequences from ordered transactional datasets such as market basket data. Little research has applied big data processing techniques to locating frequent sequences in large-scale datasets, and optimized sequential pattern mining algorithms that run on ordered one-dimensional sequences are still needed. Studies of sequential pattern search on such data are also scarce, as the literature centers on multi-dimensional data sequences. Existing approaches that deal with ordered one-dimensional datasets suffer from scalability issues because the amount of data to be analyzed is enormous. This research investigates the big data processing techniques used to find frequent sequences in large-scale datasets. It also proposes a scalable sequential pattern mining algorithm, Sequential Pattern Acquisition by Reducing Search Space (SPARSS), designed for distributed data processing systems that efficiently handle large datasets containing sequential one-element data. It introduces a prototype implementation of SPARSS and reports its memory and time requirements, obtained from experimental studies on a real-world dataset. The results confirm our expectations and demonstrate SPARSS's superior scalability and run-time efficiency compared to other distributed algorithms.