Small open reading frames (small ORFs/sORFs/smORFs) are potentially coding sequences shorter than 100 codons that have historically been treated as junk DNA by gene prediction software and annotation screens. The advent of next-generation sequencing, however, has enabled deeper investigation of junk DNA regions and their transcription products, and smORFs have emerged as a new focus of interest in systems biology. Several smORF-encoded peptides were recently reported in noncanonical mRNAs as new players in numerous biological contexts, yet their relevance is still overlooked in coding potential analysis. Hence, this review proposes a smORF classification based on transcriptional features and discusses the most promising approaches to investigate smORFs according to their different characteristics. First, smORFs are divided into nonexpressed (intergenic) and expressed (genic) smORFs. Second, genic smORFs are classified as smORFs located in noncoding RNAs (ncRNAs) or in canonical mRNAs. Finally, smORFs in ncRNAs are further subdivided into sequences located in small or long RNAs, whereas smORFs located in canonical mRNAs are subdivided into several specific classes depending on their localization along the gene. We hope that this review provides new insights into large-scale annotations and reinforces the role of smORFs as essential components of a hidden coding DNA world.
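The three-level scheme described above can be read as a small decision tree. The following Python sketch is only an illustration under stated assumptions: the `SmorfRecord` fields, the `classify_smorf` function, and the 200-nt cutoff between small and long RNAs are hypothetical and are not taken from the review. The mRNA subclasses are left unspecified because the abstract does not enumerate them.

```python
from dataclasses import dataclass
from typing import Optional

# From the abstract: smORFs are shorter than 100 codons.
MAX_SMORF_CODONS = 100
# Assumption: 200 nt is a commonly used cutoff between small and long
# ncRNAs; the abstract itself states no threshold.
LONG_RNA_MIN_NT = 200


@dataclass
class SmorfRecord:
    """Hypothetical minimal description of one candidate smORF."""
    codons: int
    # None means no transcript evidence; otherwise "ncRNA" or "mRNA".
    transcript_type: Optional[str]
    transcript_length_nt: int = 0


def classify_smorf(rec: SmorfRecord) -> str:
    """Walk the three levels of the classification described in the text."""
    if rec.codons >= MAX_SMORF_CODONS:
        raise ValueError("sequence is not shorter than 100 codons")

    # Level 1: nonexpressed (intergenic) versus expressed (genic).
    if rec.transcript_type is None:
        return "nonexpressed (intergenic)"

    # Level 2: the host transcript is an ncRNA or a canonical mRNA.
    if rec.transcript_type == "ncRNA":
        # Level 3 for ncRNAs: small versus long host RNA.
        size = "long" if rec.transcript_length_nt >= LONG_RNA_MIN_NT else "small"
        return f"expressed (genic) / ncRNA / {size} RNA"
    if rec.transcript_type == "mRNA":
        # Level 3 for mRNAs depends on the position along the gene;
        # the specific classes are not listed in the abstract.
        return "expressed (genic) / canonical mRNA"

    raise ValueError(f"unknown transcript type: {rec.transcript_type!r}")


if __name__ == "__main__":
    print(classify_smorf(SmorfRecord(codons=30, transcript_type="ncRNA",
                                     transcript_length_nt=1500)))
```

A real pipeline would derive `transcript_type` from annotation and expression evidence rather than receive it as input; the sketch only shows how the categories nest.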
The hypersaline lagoon system of Araruama (HLSA) is one of the largest in the world and one of the most important sources of evaporative salt in Brazil. The biogeochemical characteristics of this lagoon system have led it to be considered a Precambrian relic. The HLSA also harbors extensive microbial mats, but the taxonomic and metabolic attributes of these mats are poorly understood. Our high-throughput metagenomic analyses demonstrated that the HLSA microbial mats are dominated by Proteobacteria, Cyanobacteria, and Bacteroidetes. Among Proteobacteria, Deltaproteobacteria comprise approximately 40% of the total population and include sulfate-reducing bacteria such as Desulfobacterales, Desulfuromonadales, and Desulfovibrionales. Other phylogenetically diverse anoxygenic phototrophic bacteria, differing in the composition and function of their reaction centers, were also detected in the HLSA microbial mat metagenomes. The presence of photolithoautotrophs, sulfate reducers, sulfide oxidizers, and aerobic heterotrophs suggests the existence of numerous cooperative niches that are coupled and regulated by microbial interactions. We suggest that the HLSA microbial mats hold the microorganisms and the necessary machinery (a genomic repertoire to sustain metabolic pathways) to promote favorable conditions (i.e., create an alkaline pH microenvironment) for microbially mediated calcium carbonate precipitation. The metagenome-assembled genomes obtained (Ca. Thiohalocapsa araruaensis HLSAbin6 sp. nov. and Ca. Araruabacter turfae HLSAbin9 gen. nov. sp. nov.) support the relevance of sulfur metabolism and are enriched in genes involved in osmoadaptive networks, hinting at possible strategies to withstand osmotic stress. Metabolic versatility, that is, the ability of bacterial populations to use multiple nutrient sources and osmolytes, seems to be a relevant attribute for surviving under such stressful conditions.