Big data analytics has achieved tremendous success in mining valuable information in various fields. However, its potential to solve complex problems in hardware security has not been adequately tapped. This paper presents a non-invasive approach to identifying the state registers of a finite state machine (FSM) in an integrated circuit. The state registers of the FSM are mined from the scan-dump data by exploiting the strongly connected property of the FSM and the chronological correlation of its state codes. The sequence of data scanned out of each scan register is partitioned into non-overlapping strings of high weighted frequency by a string-matching algorithm. A coherency measure between each pair of registers is defined and computed from the partitioned strings. The dimension of the coherency matrix is first reduced by pruning registers of low influence through regression analysis. The remaining registers are then clustered by their coherency values to minimize the within-cluster variance. The proposed scheme is applied to several IP cores from OpenCores. The experimental results show that our scheme correctly identifies the FSM state registers in most designs with a high hit rate.
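To make the final clustering step concrete, the following is a minimal sketch of grouping registers by minimizing within-cluster variance. It assumes a plain Lloyd-style k-means iteration, which is one common way to pursue that objective; the abstract does not say which algorithm is actually used. The 5x5 `coherency` matrix is made-up toy data, and the names `sq_dist` and `kmeans` are hypothetical helpers introduced only for illustration.

```python
from typing import List, Optional, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows: Sequence[Sequence[float]], k: int, iters: int = 50) -> List[int]:
    """Lloyd-style k-means with deterministic farthest-first initialization.

    Each row is treated as the feature vector of one register.
    Returns one cluster label per row.
    """
    # Farthest-first seeding: start from row 0, then repeatedly add the row
    # that is farthest from all centroids chosen so far.
    centroids = [list(rows[0])]
    while len(centroids) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        centroids.append(list(far))

    labels: Optional[List[int]] = None
    for _ in range(iters):
        # Assignment step: each row goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(r, centroids[c])) for r in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so the iteration has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels if labels is not None else []


# Toy, invented coherency matrix for five registers (not data from the paper).
# Registers 0-2 are mutually coherent; registers 3-4 are not coherent with them.
coherency = [
    [1.0, 0.9, 0.8, 0.1, 0.2],
    [0.9, 1.0, 0.9, 0.2, 0.1],
    [0.8, 0.9, 1.0, 0.1, 0.1],
    [0.1, 0.2, 0.1, 1.0, 0.3],
    [0.2, 0.1, 0.1, 0.3, 1.0],
]

labels = kmeans(coherency, k=2)
print(labels)
```

In this sketch each register is represented by its row of the coherency matrix, so registers that are coherent with the same peers land in the same cluster. Deciding which resulting cluster holds the FSM state registers is a separate step that this sketch does not attempt.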