Epithelial-mesenchymal transition (EMT) is a change in cell shape and mobility that supports development and cancer metastasis. Multiple intermediate EMT states reflecting hybrid epithelial and mesenchymal phenotypes have been observed in various physiological and pathological conditions. Previous theoretical models explaining the intermediate EMT states rely on multiple regulatory loops involving transcriptional feedback. These models produce only three or four attractors for a given set of rate constants, which is incompatible with the experimentally observed non-genetic heterogeneity that reflects a continuum-like EMT spectrum. EMT is regulated by many microRNAs that typically bind transcripts of EMT-related genes via multiple binding sites. It has been unclear whether post-transcriptional regulation associated with the microRNA binding sites alone can stabilize intermediate EMT states. Here, we modeled this post-transcriptional regulation with elementary reaction networks and found that cooperative RNA degradation via multiple microRNA binding sites can generate four-attractor systems without transcriptional feedback. We identified many specific, experimentally supported instances of network structures predicted to permit intermediate EMT states. Furthermore, transcriptional feedback and the newly identified intermediate-enabling circuits can be combined to produce even more intermediate EMT states in both modular and emergent manners. Finally, multisite-mediated cooperative RNA degradation can broaden the distribution of gene expression across the EMT spectrum and support the phenotypic continuum without the need for higher noise. Our work reveals a previously unknown possible role of cooperative RNA degradation and microRNAs in EMT, providing a theoretical framework that can help bridge the gap between mechanistic models and single-cell experiments.