Background
It is of great importance for researchers to publish research results in high-quality journals. However, it is often challenging to choose the most suitable publication venue, given the exponential growth of journals and conferences. Although recommender systems have achieved success in promoting movies, music, and products, very few studies have explored recommendation of publication venues, especially for biomedical research. No recommender system exists that can specifically recommend journals in PubMed, the largest collection of biomedical literature.
Objective
We aimed to propose a publication recommender system, named Pubmender, to suggest suitable PubMed journals based on a paper’s abstract.
Methods
In Pubmender, pretrained word2vec was first used to construct the start-up feature space. Subsequently, a deep convolutional neural network was constructed to achieve a high-level representation of abstracts, and a fully connected softmax model was adopted to recommend the best journals.
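The pipeline described above (word embeddings, then a convolutional layer, then a fully connected softmax layer) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the vocabulary, journal names, layer sizes, and the single convolution layer with max-over-time pooling are all assumptions chosen for illustration, and the weights are random rather than pretrained or learned.

```python
import math
import random

random.seed(0)

EMBED_DIM = 8      # assumed toy size; real word2vec vectors are much larger
NUM_FILTERS = 4    # assumed number of convolution filters
WINDOW = 3         # assumed filter width, in tokens
JOURNALS = ["Journal A", "Journal B", "Journal C"]  # hypothetical labels


def rand_vec(n):
    """Return n small random weights."""
    return [random.uniform(-0.5, 0.5) for _ in range(n)]


# Stand-in for a pretrained word2vec lookup table (random here, not trained).
VOCAB = ["gene", "protein", "cancer", "cell", "trial", "patient"]
EMBEDDINGS = {word: rand_vec(EMBED_DIM) for word in VOCAB}
UNKNOWN = [0.0] * EMBED_DIM

# Randomly initialised parameters; a real system would learn these.
FILTERS = [rand_vec(WINDOW * EMBED_DIM) for _ in range(NUM_FILTERS)]
FC_WEIGHTS = [rand_vec(NUM_FILTERS) for _ in JOURNALS]
FC_BIAS = [0.0] * len(JOURNALS)


def embed(tokens):
    """Map each token to its vector; unknown tokens map to zeros."""
    return [EMBEDDINGS.get(token, UNKNOWN) for token in tokens]


def conv_max_pool(matrix):
    """1D convolution over token windows, ReLU, then max-over-time pooling."""
    # Pad short inputs so that at least one window exists.
    while len(matrix) < WINDOW:
        matrix = matrix + [UNKNOWN]
    pooled = []
    for filt in FILTERS:
        best = 0.0  # starting at 0 applies the ReLU floor
        for start in range(len(matrix) - WINDOW + 1):
            window = [x for row in matrix[start:start + WINDOW] for x in row]
            activation = sum(w * x for w, x in zip(filt, window))
            best = max(best, activation)
        pooled.append(best)
    return pooled


def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def recommend(abstract, k=2):
    """Return the k journals with the highest softmax probability."""
    tokens = abstract.lower().split()
    features = conv_max_pool(embed(tokens))
    logits = [
        sum(w * f for w, f in zip(row, features)) + bias
        for row, bias in zip(FC_WEIGHTS, FC_BIAS)
    ]
    probs = softmax(logits)
    ranked = sorted(zip(JOURNALS, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


recommendations = recommend("cancer cell gene trial")
```

Because the weights are random, the ranking itself carries no meaning; the sketch only shows how an abstract flows from token vectors to a ranked journal list.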
Results
We collected 880,165 papers from 1130 journals in PubMed Central and extracted abstracts from these papers as an empirical dataset. We compared our system with other recommendation models, such as the Cavnar-Trenkle method on the Microsoft Academic Search (MAS) engine and a collaborative filtering–based recommender system for the digital library of the Association for Computing Machinery (ACM) and CiteSeer. We found the accuracy of our system for the top 10 recommendations to be 87.0%, 22.9%, and 196.0% higher than that of MAS, ACM, and CiteSeer, respectively. In addition, we compared our system with Journal Finder and Journal Suggester, which are tools of Elsevier and Springer, respectively, that help authors find suitable journals in their series. The results revealed that the accuracy of our system was 329% higher than that of Journal Finder and 406% higher than that of Journal Suggester for the top 10 recommendations. Our web service is freely available at https://www.keaml.cn:8081/.
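The abstract does not spell out how top-10 accuracy or the "higher than" percentages were computed. A minimal sketch under two assumptions: accuracy for the top k recommendations counts a paper as a hit when its true journal appears among the first k suggestions, and the reported percentages are relative improvements over the baseline (which seems likely, since several exceed 100%). The function names and toy data below are hypothetical.

```python
def accuracy_at_k(ranked_lists, true_labels, k=10):
    """Fraction of papers whose true journal is among the top-k recommendations."""
    if not true_labels:
        raise ValueError("at least one example is required")
    hits = sum(
        1
        for ranked, truth in zip(ranked_lists, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


def relative_improvement_percent(ours, baseline):
    """Percentage by which `ours` exceeds `baseline`, relative to `baseline`."""
    return (ours - baseline) / baseline * 100.0


# Toy data: ranked journal lists for four papers and their true journals.
ranked = [
    ["J1", "J2", "J3"],
    ["J2", "J3", "J1"],
    ["J3", "J1", "J2"],
    ["J1", "J3", "J2"],
]
truth = ["J2", "J1", "J3", "J2"]

acc_top1 = accuracy_at_k(ranked, truth, k=1)  # 1 hit out of 4
acc_top2 = accuracy_at_k(ranked, truth, k=2)  # 2 hits out of 4
gain = relative_improvement_percent(acc_top2, acc_top1)
```

Under the relative reading, a system with accuracy 0.50 is "100% higher" than a baseline with accuracy 0.25, which is how figures above 100% can arise.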
Conclusions
Our deep learning–based recommender system can suggest an appropriate journal list to help biomedical scientists and clinicians choose suitable venues for their papers.
G2PDeep is an open-access web server that provides a deep-learning framework for quantitative phenotype prediction and discovery of genomic markers. It uses zygosity or single nucleotide polymorphism (SNP) information from plants and animals as input to predict a quantitative phenotype of interest and the genomic markers associated with that phenotype. It provides a one-stop-shop platform for researchers to create deep-learning models through an interactive web interface and train these models with uploaded data, using high-performance computing resources at the back end. G2PDeep also provides a series of informative interfaces to monitor the training process and compare performance among the trained models. The trained models can then be deployed automatically. The quantitative phenotype and genomic markers are predicted using a user-selected trained model, and the results are visualized. Our state-of-the-art model has been benchmarked by other researchers and has demonstrated competitive performance in quantitative phenotype prediction. In addition, the server integrates the soybean nested association mapping (SoyNAM) dataset with five phenotypes: grain yield, height, moisture, oil, and protein. A publicly available dataset for seed protein and oil content has also been integrated into the server. The G2PDeep server is publicly available at http://g2pdeep.org. The Python-based deep-learning model is available at https://github.com/shuaizengMU/G2PDeep_model.