We develop a similarity measure for medical event sequences (MESs) and empirically evaluate it using U.S. Medicare claims data. Existing similarity measures do not exploit the unique characteristics of MESs and have never been evaluated on real MESs. Our similarity measure, the Optimal Temporal Common Subsequence for Medical Event Sequences (OTCS-MES), provides a matching component that integrates three important elements of MESs: event prevalence, event duplication, and hierarchical coding. The OTCS-MES also uses normalization to mitigate the impact of the heavy positive skew of matching events and the compact distribution of event prevalence. We empirically evaluate the OTCS-MES against two other measures specifically designed for MESs: the original OTCS and Artemis, a measure incorporating event alignment. Our evaluation uses two substantial data sets of Medicare claims containing inpatient and outpatient sequences with different medical event coding. We find only a small overlap in nearest neighbors among the three similarity measures, demonstrating the superior design of the OTCS-MES with its emphasis on unique aspects of MESs. The evaluation also provides evidence about the impact of component weights, neighborhood size, and sequence length.
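To make the nearest-neighbor comparison concrete, the following is a minimal Python sketch of how the overlap between the neighborhoods returned by two similarity measures might be computed. It does not implement the OTCS-MES, the original OTCS, or Artemis. The two measures shown (`jaccard_codes` and `lcs_ratio`) are simple stand-ins, the Jaccard ratio used to score overlap is an assumption rather than the statistic used in the evaluation, and all function names and the toy sequences are hypothetical.

```python
from typing import Callable, Dict, List, Set

# A medical event sequence is modeled here as an ordered list of event codes.
EventSequence = List[str]
Similarity = Callable[[EventSequence, EventSequence], float]


def jaccard_codes(x: EventSequence, y: EventSequence) -> float:
    """Stand-in measure: overlap of the code sets, ignoring event order."""
    sx, sy = set(x), set(y)
    if not sx and not sy:
        return 1.0
    return len(sx & sy) / len(sx | sy)


def lcs_ratio(x: EventSequence, y: EventSequence) -> float:
    """Stand-in measure: longest common subsequence length over the longer length."""
    if not x or not y:
        return 0.0
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            curr.append(prev[j - 1] + 1 if a == b else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1] / max(len(x), len(y))


def nearest_neighbors(
    query_id: str,
    sequences: Dict[str, EventSequence],
    similarity: Similarity,
    k: int,
) -> Set[str]:
    """Return the ids of the k sequences most similar to the query sequence."""
    scored = [
        (similarity(sequences[query_id], seq), other_id)
        for other_id, seq in sequences.items()
        if other_id != query_id
    ]
    # Highest similarity first; ties are broken by id so results are deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return {other_id for _, other_id in scored[:k]}


def neighbor_overlap(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap between two neighborhoods (an assumed overlap statistic)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Toy sequences with made-up event codes, for illustration only.
toy_sequences: Dict[str, EventSequence] = {
    "p1": ["A", "B", "C"],
    "p2": ["C", "B", "A"],
    "p3": ["A", "B", "D"],
    "p4": ["X", "Y", "Z"],
}

nn_set_based = nearest_neighbors("p1", toy_sequences, jaccard_codes, k=1)
nn_order_based = nearest_neighbors("p1", toy_sequences, lcs_ratio, k=1)
overlap_k1 = neighbor_overlap(nn_set_based, nn_order_based)
```

In this toy setting the order-insensitive measure and the order-sensitive measure select different nearest neighbors for `p1` at a neighborhood size of 1, so the overlap is zero. Enlarging the neighborhood to 2 makes the two neighborhoods coincide, which illustrates why neighborhood size is worth varying in such a comparison.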