This study examines whether general trends exist across subject fields in the factors affecting the number of citations of articles, focusing especially on factors that are not directly related to the quality or content of articles (extrinsic factors). For this purpose, original articles published in the same year were sampled from six selected subject fields (condensed matter physics, inorganic and nuclear chemistry, electrical and electronic engineering, biochemistry and molecular biology, physiology, and gastroenterology), with n = 230-240 for each field. The citation counts received by the articles in relatively long citation windows (6 and 11 years after publication) were then predicted by negative binomial multiple regression (NBMR) analysis for each field. Various article features relating to author collaboration, cited references, visibility, authors' achievements (measured by past publications and citedness), and publishing journals were considered as the explanatory variables of NBMR. Some generality across the fields was found with regard to the selected predicting factors and the degree of significance of these predictors. The Price index was the strongest predictor of citations, followed by the number of references. The effects of the number of authors and of the authors' achievement measures were rather weak.
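As an illustration of the strongest predictor named above, the following is a minimal sketch of how a Price index could be computed for a single article. It assumes the common definition (the share of an article's references published no more than five years before the citing article); the abstract does not state the exact window or boundary handling used in the study, so the `window` parameter, the function name `price_index`, and the sample years are all hypothetical.

```python
def price_index(citing_year, reference_years, window=5):
    """Share of references published within `window` years of the citing article.

    Assumption: a reference counts as recent when its age in years is
    between 0 and `window`, inclusive. The study may use another cutoff.
    """
    if not reference_years:
        return None  # undefined for an article with no dated references
    recent = sum(1 for year in reference_years if 0 <= citing_year - year <= window)
    return recent / len(reference_years)


# Hypothetical reference list for an article published in 2000.
sample_refs = [1999, 1998, 1995, 1990, 1985, 2000, 1997, 1975]
print(price_index(2000, sample_refs))
```

In a regression such as the NBMR described above, a value like this would be one explanatory variable per article, alongside the number of references and the other features.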
This paper proposes a methodology that discriminates articles by the target authors ("true" articles) from those by other homonymous authors ("false" articles). Author name searches for 2,595 "source" authors in six subject fields retrieved about 629,000 articles. To extract the true articles from this large set of retrieved articles, which included many false ones, two filtering stages were applied. At the first stage, a retrieved article was eliminated as false if either its affiliation addresses had little similarity to those of its source article or there was no citation relationship between the journal of the retrieved article and that of its source article. At the second stage, a sample of retrieved articles was subjected to manual judgment, and discrimination functions based on logistic regression were defined from the judgment results. These discrimination functions achieved recall and precision of about 95% each and accuracy (correct answer ratio) of 90-95%. The existence of common coauthor(s), address similarity, title word similarity, and interjournal citation relationships between the retrieved and source articles were found to be effective discrimination predictors. Whether or not the source author was from a specific country was also one of the important predictors. Furthermore, it was shown that a retrieved article is almost certainly true if it was cited by, or cocited with, its source article. The proposed method would be effective when dealing with a large number of articles whose subject fields and affiliation addresses vary widely.
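The second stage and its evaluation can be sketched roughly as follows. This is not the paper's fitted model: the predictor names are taken from the abstract, but the weights, the intercept, the 0.5 threshold, and the toy judgments are placeholder assumptions chosen only to make the example runnable. The metric definitions (recall, precision, accuracy) are the standard ones.

```python
import math

# Placeholder coefficients, NOT values from the paper.
HYPOTHETICAL_WEIGHTS = {
    "common_coauthor": 1.0,
    "address_similarity": 1.0,
    "title_word_similarity": 1.0,
    "interjournal_citation": 1.0,
}
HYPOTHETICAL_INTERCEPT = -2.0


def is_true_article(features, threshold=0.5):
    """Logistic discrimination: True if the estimated probability exceeds threshold."""
    z = HYPOTHETICAL_INTERCEPT + sum(
        HYPOTHETICAL_WEIGHTS[name] * features[name] for name in HYPOTHETICAL_WEIGHTS
    )
    probability = 1.0 / (1.0 + math.exp(-z))
    return probability > threshold


def evaluate(judged, predicted):
    """Compare manual judgments with predictions (both lists of booleans)."""
    pairs = list(zip(judged, predicted))
    tp = sum(1 for j, p in pairs if j and p)          # true articles kept
    fp = sum(1 for j, p in pairs if not j and p)      # false articles kept
    fn = sum(1 for j, p in pairs if j and not p)      # true articles lost
    tn = sum(1 for j, p in pairs if not j and not p)  # false articles removed
    return {
        "recall": tp / (tp + fn) if tp + fn else None,
        "precision": tp / (tp + fp) if tp + fp else None,
        "accuracy": (tp + tn) / len(pairs) if pairs else None,
    }


# Toy manual judgments and predictions, invented for illustration.
judged = [True, True, True, True, False, False, True, False]
predicted = [True, True, True, False, False, True, True, False]
print(evaluate(judged, predicted))
```

Recall measures how many true articles survive the filter, while precision measures how many surviving articles are actually true; the abstract reports both because a name-disambiguation filter can trade one for the other.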
A bibliometric approach was used to survey the state of the art of research in the field of chemical information and computer sciences (CICS). By examining the CA database for the articles abstracted under the subsection "Chemical information, documentation, and data processing", the Journal of Chemical Information and Computer Sciences (JCICS) was identified as the top journal in this subsection for the last 30 years. Based on this result, the CA subsections and controlled index terms assigned to JCICS articles were analyzed to identify trends in subjects and topics in the CICS field during the last two decades. These analyses revealed that the subjects of research in CICS have diversified from traditional information science and computer applications in chemistry toward "molecular information sciences". The SCISEARCH database was used to examine the interdependency between JCICS and other key journals, as well as the international nature of JCICS in its publications and citedness.
The Carcinogenicity Reliability Database (CRDB) was constructed by collecting experimental carcinogenicity data on about 1,500 chemicals from six sources, including the IARC and NTP databases, and then ranking their reliabilities into six unified categories. A wide variety of 911 organic chemicals were selected from the database for QSAR modeling, and 1,504 different molecular descriptors were calculated with the Dragon software from their modeled 3D molecular structures. Positive (carcinogenic) and negative (non-carcinogenic) chemicals containing various substructures were counted using atom and functional group count descriptors, and the statistical significance of the ratios of positives to negatives was tested for those substructures. Among substructures that biomedical studies have identified as responsible for carcinogenicity, very few were judged to be strongly related to carcinogenicity. To develop QSAR models that predict the carcinogenicities of a wide variety of chemicals with a satisfactory level of performance, the relationship between the carcinogenicity data with improved reliability and a subset of significant descriptors selected from the 1,504 Dragon descriptors was analyzed with a support vector machine (SVM) method: the classification function (SVC) for weighted data in the LIBSVM program was used to classify chemicals into two carcinogenic categories (positive or negative), with weights set according to the reliabilities of the carcinogenicity data. The quality and stability of the models were tested by a dual cross-validation procedure. As a first step, a single SVM model was developed for all 911 chemicals using 250 selected descriptors, achieving an overall accuracy (the proportion of correct positive and negative predictions) of about 70%.
To improve the accuracy of the final model, the 911 chemicals were classified into 20 mutually overlapping subgroups according to the substructures they contain, a specific SVM model was optimized for each subgroup, and the predicted carcinogenicity of each of the 911 chemicals was determined by majority vote over the outputs of the corresponding SVM models. The model based on grouping the chemicals into 20 substructure-defined subgroups predicts the carcinogenicities of a wide variety of chemicals with a satisfactory overall accuracy of approximately 80%.
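The majority-vote step over overlapping subgroups can be sketched as below. This is an illustrative outline only: the trained per-subgroup SVM models are replaced by simple stand-in functions, the substructure names and the descriptor `x` are hypothetical, and returning `None` on a tie or when no subgroup applies is an assumption, since the abstract does not say how such cases were resolved.

```python
def predict_by_majority(chemical, subgroup_models):
    """Combine the outputs of all subgroup models that apply to a chemical.

    subgroup_models: list of (belongs_to, classify) pairs, where
    belongs_to(chemical) -> bool tests subgroup membership and
    classify(chemical) -> bool returns True for "positive" (carcinogenic).
    """
    votes = [
        classify(chemical)
        for belongs_to, classify in subgroup_models
        if belongs_to(chemical)
    ]
    positives = sum(votes)
    negatives = len(votes) - positives
    if positives == negatives:
        return None  # no applicable model or a tie: handling is an assumption
    return positives > negatives


def has(substructure):
    """Build a membership test for a named substructure."""
    return lambda chemical: substructure in chemical["substructures"]


# Stand-ins for trained SVM models; thresholds are invented for illustration.
models = [
    (has("nitro"), lambda c: c["x"] > 0.5),
    (has("aromatic_amine"), lambda c: c["x"] > 0.6),
    (has("halogen"), lambda c: c["x"] > 0.9),
    (has("epoxide"), lambda c: c["x"] > 0.1),
]

chemical = {"substructures": {"nitro", "aromatic_amine", "halogen"}, "x": 0.7}
print(predict_by_majority(chemical, models))
```

Because the subgroups overlap, a chemical containing several substructures receives one vote from each matching model, which is what lets the ensemble differ from a single global model.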