Studies on hereditary fixation of the tame-behavior phenotype during animal domestication remain relevant because they have both basic and applied significance. Using gray rats (Rattus norvegicus) bred for either enhanced or reduced defensive response to humans as a model, we applied high-throughput RNA sequencing, for the first time, to investigate differential gene expression in tissue samples from the tegmental region of the midbrain of 2-month-old rats showing either tame or aggressive behavior. A total of 42 differentially expressed genes (DEGs; adjusted p-value < 0.01 and fold-change > 2) were identified, with 20 upregulated and 22 downregulated genes in the tissue samples from tame rats compared with aggressive rats. Among them, three genes encoding transcription factors (TFs) were detected: Ascl3 was upregulated, whereas Fos and Fosb were downregulated in tissue samples from the brains of tame rats. Other DEGs were annotated as associated with extracellular matrix components, transporter proteins, the neurotransmitter system, signaling molecules, and immune system proteins. We believe that these DEGs encode proteins that constitute a multifactorial system determining the behavior for which the rats have been artificially selected. We demonstrated that several structural subtypes of E-box motifs (known binding sites for many developmental TFs of the bHLH class, including the ASCL subfamily) are enriched in the promoters of the DEGs downregulated in the tissue samples of tame rats. Because ASCL3 may act as a repressor of target genes of other developmental TFs of the bHLH class, we hypothesize that the expression of the TF gene Ascl3 in tame rats indicates longer neurogenesis (as compared to aggressive rats), which is a sign of neoteny and domestication. Thus, our domestication model points to a new function of TF ASCL3: it may play the most important role in behavioral changes in animals.
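The DEG criteria stated above (adjusted p-value < 0.01 and fold-change > 2) can be illustrated with a minimal sketch. The function name `select_degs`, the input layout, and the gene labels and values are hypothetical placeholders, not data from the study; the fold-change cutoff is assumed to apply in both directions on a log2 scale.

```python
import math

def select_degs(results, padj_max=0.01, min_fold=2.0):
    """Split genes into up- and downregulated DEGs.

    results: iterable of (gene, log2_fold_change, adjusted_p) tuples,
    where the fold change is tame relative to aggressive (assumed layout).
    """
    min_log2 = math.log2(min_fold)  # fold-change > 2 means |log2FC| > 1
    up, down = [], []
    for gene, log2fc, padj in results:
        # Skip genes that fail either the significance or the effect-size cutoff.
        if padj >= padj_max or abs(log2fc) <= min_log2:
            continue
        (up if log2fc > 0 else down).append(gene)
    return up, down

# Made-up values, for illustration only.
toy_results = [
    ("gene_a", 1.8, 0.001),    # passes both cutoffs, upregulated
    ("gene_b", -1.5, 0.002),   # passes both cutoffs, downregulated
    ("gene_c", 0.4, 0.0001),   # significant, but fold change too small
    ("gene_d", 2.5, 0.2),      # large fold change, but not significant
]
up, down = select_degs(toy_results)
```

In this toy input only `gene_a` and `gene_b` satisfy both cutoffs, which shows why the two criteria are applied jointly.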
The most popular model for searching ChIP-seq data for transcription factor binding sites (TFBSs) is the position weight matrix (PWM). However, this model does not take into account dependencies between nucleotide occurrences in different site positions. Two recently proposed models, BaMM and InMoDe, do account for such dependencies. However, their application has usually been limited to comparing their recognition accuracy with that of PWMs, and no analysis of the co-prediction and relative positioning of hits of different models in peaks has yet been performed. To close this gap, we propose a pipeline called MultiDeNA. The pipeline includes the stages of model training, assessment of recognition accuracy, scanning of ChIP-seq peaks, and classification of peaks based on the scan results. We applied our pipeline to 22 ChIP-seq datasets of TF FOXA2 and considered the PWM, dinucleotide PWM (diPWM), BaMM and InMoDe models. The combination of these four models increased the fraction of recognized peaks by 26.3% compared with the PWM model alone. The BaMM model provided the main contribution to the recognition of sites. Although the major fraction of predicted peaks contained TFBSs of different models at coinciding positions, the median fractions of peaks containing predictions of only a single model were 1.08, 0.49, 4.15 and 1.73% for PWM, diPWM, BaMM and InMoDe, respectively. Thus, FOXA2 binding sites were not fully described by any single model, which indicates their heterogeneity. We assume that the BaMM model is the most successful in describing the structure of the FOXA2 binding site in the ChIP-seq datasets under study.
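The position-independence assumption of the PWM model mentioned above can be made concrete with a small sketch. This is not the MultiDeNA implementation; the helper names (`build_pwm`, `score`, `best_hit`), the pseudocount, the uniform background, and the toy sites are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned sites."""
    pwm = []
    for pos in range(len(sites[0])):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # Log-odds of the observed frequency against a uniform background.
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm

def score(pwm, window):
    # Each position contributes independently: the score is a plain sum,
    # so no dependency between positions can be expressed.
    return sum(column[base] for column, base in zip(pwm, window))

def best_hit(pwm, seq):
    """Return (score, start) of the highest-scoring window in seq."""
    width = len(pwm)
    return max((score(pwm, seq[i:i + width]), i)
               for i in range(len(seq) - width + 1))

# Made-up aligned sites and sequence, for illustration only.
toy_sites = ["TGTTTAC", "TGTTTGC", "TATTTAC", "TGTTTAC"]
pwm = build_pwm(toy_sites)
top_score, top_start = best_hit(pwm, "CCGATGTTTACGGA")
```

Because the score is a sum over columns, a PWM cannot reward or penalize particular combinations of nucleotides at different positions, which is the limitation that dependency-aware models are meant to address.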
The position weight matrix (PWM) is the traditional motif model for representing transcription factor (TF) binding sites. It assumes that positions contribute independently to TF binding affinity, although this assumption does not fit the data perfectly. This explains why PWM hits are missing in a substantial fraction of ChIP-seq peaks. To study various modes of direct binding of plant TFs, we compiled a benchmark collection of 111 ChIP-seq datasets for Arabidopsis thaliana and applied the traditional PWM and two alternative motif models, BaMM and SiteGA, which account for dependencies between positions. Varying the stringency of the recognition thresholds suggested that the hits of the PWM, BaMM, and SiteGA models are associated with sites of high/medium, any, and low affinity, respectively. At the medium recognition threshold, about 60% of ChIP-seq peaks contain PWM hits consisting of conserved core consensuses, while BaMM and SiteGA provide hits for an additional 15% of peaks in which a weaker core consensus is compensated through intra-motif dependencies. The presence of these dependencies in the motifs of the alternative models, and their absence in those of the traditional model, was confirmed by the dependency logo DepLogo, which visualizes the position-wise partitioning of the alignments of predicted sites. We exemplify the detailed analysis of ChIP-seq profiles for the plant TFs CCA1, MYC2, and SEP3. Gene ontology (GO) enrichment analysis revealed that, among the three motif models, SiteGA had the highest portion of genes with significantly enriched GO terms among all predicted genes. We showed that both alternative motif models extend the set of sites predicted by the traditional PWM to a greater degree for TFs MYC2/SEP3, which have condition/tissue-specific functions, than for TF CCA1, which has housekeeping functions.
Overall, the combined application of standard and alternative motif models is beneficial for detecting various modes of direct TF-DNA interactions in the largest possible portion of ChIP-seq loci.
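The benefit of combining models, as described above, comes down to set arithmetic over the peaks each model recognizes. The sketch below is one way to compute the combined and model-exclusive fractions; the function name `summarize_peaks`, the input layout, and the peak identifiers are hypothetical and do not reproduce the reported figures.

```python
def summarize_peaks(hits_by_model, n_peaks):
    """Report per-model, model-exclusive, and combined peak fractions.

    hits_by_model: dict mapping a model name to the set of peak ids
    in which that model has at least one hit (assumed layout).
    """
    recognized = set().union(*hits_by_model.values())
    summary = {"combined": len(recognized) / n_peaks}
    for model, peaks in hits_by_model.items():
        # Peaks recognized by any of the other models.
        others = set().union(
            *(p for m, p in hits_by_model.items() if m != model)
        )
        summary[model] = {
            "fraction": len(peaks) / n_peaks,
            "exclusive": len(peaks - others) / n_peaks,
        }
    return summary

# Made-up peak ids for a dataset of 10 peaks, for illustration only.
toy_hits = {
    "PWM": {0, 1, 2, 3, 4, 5},
    "BaMM": {0, 1, 2, 3, 6, 7},
    "SiteGA": {2, 8},
}
summary = summarize_peaks(toy_hits, n_peaks=10)
```

In this toy case the PWM alone covers 6 of 10 peaks, while the union of all three models covers 9 of 10, with each model contributing some peaks that no other model recognizes.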