Bioluminescent proteins (BLPs) are a class of proteins that are widely distributed in many living organisms, with various mechanisms of light emission including bioluminescence and chemiluminescence from luminous organisms. Bioluminescence is commonly used in analytical methods for studying cellular processes, such as gene expression analysis, drug discovery, cellular imaging, and toxicity determination. However, identifying bioluminescent proteins is challenging because they share little sequence similarity. In this paper, we briefly reviewed the development of computational identification of BLPs and then proposed a novel framework for identifying BLPs based on the eXtreme gradient boosting (XGBoost) algorithm and sequence-derived features. To train the models, we collected BLP data from bacteria, eukaryotes, and archaea. Then, to obtain more effective prediction models, we examined the performance of different feature extraction methods and their combinations, as well as of different classification algorithms. Finally, based on the optimal model, a novel predictor named iBLP was constructed to identify BLPs. The robustness of iBLP was demonstrated by experiments on training and independent datasets. Comparison with other published methods further demonstrated that the proposed method is powerful and could provide good performance for BLP identification. The webserver and software package for BLP identification are freely available at http://lin-group.cn/server/iBLP.
The sites at which genomic DNA replication initiates are defined as origins of replication (ORIs); they regulate the onset of DNA replication and play significant roles in the replication process. The study of ORIs is essential for understanding the cell-division cycle and the regulation of gene expression. Accurate identification of ORIs by computational methods will provide important clues for DNA replication research and drug development. In this paper, the first integrated predictor, named iORI-Euk, was built to identify ORIs in multiple eukaryotes and multiple cell types. ORI data for seven eukaryotes (Homo sapiens, Mus musculus, Drosophila melanogaster, Arabidopsis thaliana, Pichia pastoris, Schizosaccharomyces pombe and Kluyveromyces lactis) were collected from public databases to construct benchmark datasets. Subsequently, three feature extraction strategies, namely k-mer, binary encoding, and a combination of k-mer and binary encoding, were used to formulate DNA sequence samples. We also compared the performance of different classification algorithms. The best results were obtained with a support vector machine in both the 5-fold cross-validation test and the independent dataset test. Based on the optimal model, an online web server called iORI-Euk (http://lin-group.cn/server/iORI-Euk/) was established for novel ORI identification.
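The three feature extraction strategies named above can be illustrated with a minimal Python sketch. It is not the iORI-Euk implementation: the function names, the default k = 2, the normalization of k-mer counts to frequencies, and the handling of ambiguous bases are all assumptions made for illustration.

```python
from itertools import product

BASES = "ACGT"

def kmer_frequencies(seq, k=2):
    """Normalized k-mer frequencies, in lexicographic k-mer order.

    Assumption: counts are divided by the number of valid windows;
    the original work may use raw counts or another normalization.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product(BASES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows with ambiguous bases such as N
            counts[word] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]

def binary_encoding(seq):
    """One-hot (binary) encoding: four bits per nucleotide.

    Assumption: bases outside A/C/G/T are encoded as all zeros.
    """
    one_hot = {
        "A": [1, 0, 0, 0],
        "C": [0, 1, 0, 0],
        "G": [0, 0, 1, 0],
        "T": [0, 0, 0, 1],
    }
    vec = []
    for base in seq.upper():
        vec.extend(one_hot.get(base, [0, 0, 0, 0]))
    return vec

def combined_features(seq, k=2):
    """Concatenate the k-mer and binary feature vectors."""
    return kmer_frequencies(seq, k) + binary_encoding(seq)

if __name__ == "__main__":
    sample = "ACGTAC"  # hypothetical sequence, not taken from the benchmark data
    print(len(kmer_frequencies(sample)), len(binary_encoding(sample)))
```

The k-mer vector has a fixed length of 4^k regardless of sequence length, whereas the binary vector grows with the sequence (4 values per base), so the binary and combined encodings assume that all samples have the same length.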
The rapid spread of SARS-CoV-2 infection around the globe has caused a massive
health and socioeconomic crisis. Identification of phosphorylation sites is an
important step for understanding the molecular mechanisms of SARS-CoV-2
infection and the changes within host cell pathways. In this study, we
present DeepIPs, the first deep-learning architecture designed specifically to
identify phosphorylation sites in host cells infected with SARS-CoV-2. DeepIPs
combines the most popular word embedding method with a convolutional neural
network-long short-term memory (CNN-LSTM) architecture to make the final
prediction. The independent test demonstrates that DeepIPs improves prediction
performance compared with existing tools for general phosphorylation site
prediction.
Based on the proposed model, a web server called DeepIPs was established and is
freely accessible at http://lin-group.cn/server/DeepIPs. The source code of
DeepIPs is freely available at the repository
https://github.com/linDing-group/DeepIPs.
The global pandemic of coronavirus disease 2019 (COVID-19), caused by severe
acute respiratory syndrome coronavirus 2, has led to a dramatic loss of human
life worldwide. Despite many efforts, the development of effective drugs and
vaccines for this novel virus will take considerable time. Artificial
intelligence (AI) and machine learning (ML) offer promising solutions that could
accelerate the discovery and optimization of new antivirals. Motivated by this,
in this paper, we present an extensive survey on the application of AI and ML
for combating COVID-19 based on the rapidly emerging literature. In particular,
we point out the challenges and future directions associated with
state-of-the-art solutions to effectively control the COVID-19 pandemic. We hope
that this review provides researchers with new insights into the ways AI and ML
fight and have fought the COVID-19 outbreak.