One of the important modes of pre-mRNA post-transcriptional modification is alternative splicing. Alternative splicing allows creation of many distinct mature mRNA transcripts from a single gene by utilizing different splice sites. In plants like Arabidopsis thaliana, the most common type of alternative splicing is intron retention. Many studies in the past focus on positional distribution of retained introns (RIs) among different genic regions and their expression regulations, while little systematic classification of RIs from constitutively spliced introns (CSIs) has been conducted using machine learning approaches. We used random forest and support vector machine (SVM) with radial basis kernel function (RBF) to differentiate these two types of introns in Arabidopsis. By comparing coordinates of introns of all annotated mRNAs from TAIR10, we obtained our high-quality experimental data. To distinguish RIs from CSIs, We investigated the unique characteristics of RIs in comparison with CSIs and finally extracted 37 quantitative features: local and global nucleotide sequence features of introns, frequent motifs, the signal strength of splice sites, and the similarity between sequences of introns and their flanking regions. We demonstrated that our proposed feature extraction approach was more accurate in effectively classifying RIs from CSIs in comparison with other four approaches. The optimal penalty parameter C and the RBF kernel parameter in SVM were set based on particle swarm optimization algorithm (PSOSVM). Our classification performance showed F-Measure of 80.8% (random forest) and 77.4% (PSOSVM). Not only the basic sequence features and positional distribution characteristics of RIs were obtained, but also putative regulatory motifs in intron splicing were predicted based on our feature extraction approach. Clearly, our study will facilitate a better understanding of underlying mechanisms involved in intron retention.
Sheep detection and segmentation will play a crucial role in promoting the implementation of precision livestock farming in the future. In sheep farms, the characteristics of sheep that have the tendency to congregate and irregular contours cause difficulties for computer vision tasks, such as individual identification, behavior recognition, and weight estimation of sheep. Sheep instance segmentation is one of the methods that can mitigate the difficulties associated with locating and extracting different individuals from the same category. To improve the accuracy of extracting individual sheep locations and contours in the case of multiple sheep overlap, this paper proposed two-stage sheep instance segmentation SheepInst based on the Mask R-CNN framework, more specifically, RefineMask. Firstly, an improved backbone network ConvNeXt-E was proposed to extract sheep features. Secondly, we improved the structure of the two-stage object detector Dynamic R-CNN to precisely locate highly overlapping sheep. Finally, we enhanced the segmentation network of RefineMask by adding spatial attention modules to accurately segment irregular contours of sheep. SheepInst achieves 89.1%, 91.3%, and 79.5% in box AP, mask AP, and boundary AP metric on the test set, respectively. The extensive experiments show that SheepInst is more suitable for sheep instance segmentation and has excellent performance.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.