Background: Protein-protein interaction (PPI) extraction from published scientific articles is a key issue in biological research because of its importance in understanding biological processes. Despite considerable recent advances in automatic PPI extraction from articles, there remains a need to improve the performance of existing methods.
Results: Our feature-based method combines the strengths of diverse features, such as lexical and word context features derived from sentences, syntactic features derived from parse trees, and features using existing patterns, to extract PPIs automatically from articles. We assemble related features into four groups and define a contribution level (CL) for each group. Our method consists of two steps. First, we divide the training set into subsets based on the structure of the sentence and the existence of significant keywords (SKs) and apply the sentence patterns given in advance to each subset. Second, we automatically perform feature selection based on the CL values of the four groups and the k-nearest neighbor algorithm (k-NN) through three approaches: (1) focusing on the group with the best contribution level (BEST1G); (2) unoptimized combination of three groups with the best contribution levels (U3G); (3) optimized combination of two groups with the best contribution levels (O2G).
Conclusions: Our method outperforms other state-of-the-art PPI extraction systems in terms of F-score on the HPRD50 corpus and achieves promising results that are comparable with these systems on other corpora. Further, on all corpora our method obtains a better F-score than k-NN alone without exploiting the CLs of the feature groups.
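The group-based feature selection described above can be illustrated with a minimal sketch. The abstract does not say how the contribution level is computed, so this sketch assumes that the CL of a group is the F-score of a k-NN classifier restricted to that group's features on a held-out set. The group names, column indices, toy data, and all function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import dist

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

def f_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def project(X, cols):
    """Keep only the feature columns that belong to one group."""
    return [[row[c] for c in cols] for row in X]

def contribution_levels(groups, train_X, train_y, dev_X, dev_y, k=3):
    """ASSUMPTION: CL of a group = held-out F-score of k-NN using only that group."""
    levels = {}
    for name, cols in groups.items():
        tr, dv = project(train_X, cols), project(dev_X, cols)
        preds = [knn_predict(tr, train_y, x, k) for x in dv]
        levels[name] = f_score(dev_y, preds)
    return levels

def best_groups(levels, n=1):
    """Names of the n groups with the highest contribution level."""
    return sorted(levels, key=levels.get, reverse=True)[:n]

# Hypothetical toy data: one column per feature group, label 1 = interaction.
groups = {"lexical": [0], "context": [1], "syntactic": [2], "pattern": [3]}
train_X = [[9, 1, 5, 3], [8, 7, 4, 9], [7, 4, 6, 1],
           [1, 2, 5, 8], [2, 8, 4, 2], [3, 5, 6, 6]]
train_y = [1, 1, 1, 0, 0, 0]
dev_X = [[10, 6, 5, 7], [0, 3, 5, 4], [7, 9, 5, 5], [3, 1, 5, 2]]
dev_y = [1, 0, 1, 0]

levels = contribution_levels(groups, train_X, train_y, dev_X, dev_y)
print(best_groups(levels, n=1))  # BEST1G-style choice: the single best group
```

Under the same assumption, a U3G-style selection would take `best_groups(levels, n=3)` and concatenate the columns of those groups before running k-NN. How the paper optimizes the two-group combination (O2G) is not stated in the abstract, so it is not sketched here.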
For the automatic extraction of protein-protein interaction information from scientific articles, a machine learning approach is useful. A classifier is trained on data represented with several features to decide whether a protein pair in a sentence interacts. Keywords that are directly related to interaction, such as “bind” or “interact”, play an important role in training classifiers. We call such a keyword a dominant keyword, since it affects the capability of the classifier. Although it is important to identify the dominant keywords, whether a keyword is dominant depends on the context in which it occurs. Therefore, we propose a method for predicting whether a keyword is dominant for each instance. In this method, a keyword that yields imbalanced classification results is initially assumed to be a dominant keyword. Classifiers are then trained separately on the instances with and without the assumed dominant keywords. The validity of the assumed dominant keyword is evaluated based on the classification results of the generated classifiers, and the assumption is updated according to the evaluation result. Repeating this process increases the prediction accuracy of the dominant keyword. Our experimental results using five corpora show the effectiveness of our proposed method with dominant keyword prediction.
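The first two steps of this procedure (the initial assumption and the split into two training sets) can be sketched as follows. The abstract does not define "imbalanced", so the sketch assumes a keyword is tentatively dominant when the instances containing it skew toward the positive (interaction) class. The thresholds, function names, and toy instances are hypothetical, and the evaluation and update loop is omitted because the abstract does not specify it.

```python
from collections import defaultdict

def assumed_dominant_keywords(instances, min_skew=0.8, min_count=2):
    """ASSUMPTION: a keyword is tentatively dominant if at least `min_skew`
    of the instances containing it are positive (label 1)."""
    counts = defaultdict(lambda: [0, 0])  # keyword -> [negatives, positives]
    for keywords, label in instances:
        for kw in set(keywords):
            counts[kw][label] += 1
    dominant = set()
    for kw, (neg, pos) in counts.items():
        total = neg + pos
        if total >= min_count and pos / total >= min_skew:
            dominant.add(kw)
    return dominant

def split_by_dominant(instances, dominant):
    """Partition instances by whether they contain any assumed dominant keyword,
    so that a separate classifier can be trained on each part."""
    with_kw = [inst for inst in instances if dominant & set(inst[0])]
    without_kw = [inst for inst in instances if not dominant & set(inst[0])]
    return with_kw, without_kw

# Hypothetical toy instances: (keywords in the sentence, label), 1 = interaction.
instances = [
    (["bind"], 1),
    (["bind", "with"], 1),
    (["interact"], 1),
    (["interact", "with"], 1),
    (["with"], 0),
    (["express"], 0),
    (["express", "with"], 0),
]

dominant = assumed_dominant_keywords(instances)
with_kw, without_kw = split_by_dominant(instances, dominant)
print(sorted(dominant), len(with_kw), len(without_kw))
```

In the described method, the two subsets would each train a classifier, and the classification results would then be used to revise which keywords are treated as dominant before the next iteration.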
Plastic pollution is a matter of deep concern that requires an urgent and international response, involving stakeholders at all levels. The rapid increase of single-use plastic and medical waste, especially in the context of COVID-19, has sharply worsened the plastic pollution crisis on a global scale. To identify an efficient plastic waste management (PWM) system to tackle this major environmental problem, this study adopted importance-performance analysis and used logistic regression to identify key factors affecting citizens’ participation in PWM strategies in Vietnam. The results indicate that while the importance of all PWM solutions was considered to be high, their performance was rated at a low level, implying a sizable gap between the perceived importance and performance of eleven solutions for PWM. The findings also show that solutions such as “offering zero-waste lifestyle seminars to citizens”, “having community engagement”, “using eco-friendly products”, and “imposing a ban on single-use plastics” are useful for the development of an effective environmental policy. Furthermore, it was found that the following characteristics have a significant influence on citizens’ participation in PWM solutions: (1) gender, (2) education level, (3) residential area, (4) employment status, and (5) citizens’ awareness and behavior towards plastic reduction. This study is expected to provide theoretical and empirical evidence for policymakers and authorities who are in charge of promulgating the necessary mechanisms and policies to promote the socialization of PWM.