The advent of high-throughput sequencing (HTS) has revolutionized the way epigenetic research is conducted. With fully sequenced genomes often available, millions of small RNA (sRNA) reads are mapped to regions of interest and the results are scrutinized for clues about epigenetic mechanisms. However, this approach requires careful consideration with regard to experimental design, particularly when investigating repetitive parts of genomes such as transposable elements (TEs), and all the more so when those genomes are large, as is often the case in plants. Here, to shed light on the challenges of mapping sRNAs to TEs, we focus on the 2,300 Mb maize genome, of which >85% is derived from TEs. We compare various methodological strategies that are commonly employed in TE studies. These include the choice of reference dataset, the normalization of multiply mapping sRNAs, and the selection among different types of sRNA metrics. We further examine how these choices influence the relationship between sRNAs and the critical feature of TE age, and we explore and contrast their effect on low-copy regions (exons) and on other popular HTS data (RNA-seq). Finally, based on our analysis, we share a series of take-home messages intended to guide TE epigenetic studies specifically, although our conclusions may also apply to any work that involves mapping and analyzing HTS data.