The three-dimensional structure of a protein underlies the biophysical and biochemical functions it carries out in a cell. Approaches to protein structure/fold prediction typically extract features from the amino acid sequence and then apply machine learning to the resulting classification problem. Protein contact maps are two-dimensional representations of the contacts among the amino acid residues in the folded protein structure. This paper highlights the need for a systematic study of these contact networks. Mining contact maps to derive features that carry fold information offers a new mechanism for fold discovery from the protein sequence via the contact maps. These ideas are explored in the structural class of all-alpha proteins to identify structural elements. A simple and computationally inexpensive algorithm based on a triangle subdivision method is proposed to extract additional features from the contact map. The method successfully characterizes the off-diagonal interactions in the contact map for predicting specific ‘folds’. The decision tree classification results show great promise for developing a new and simple tool for the challenging problem of fold prediction. © 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011 1 362–368 DOI: 10.1002/widm.35 This article is categorized under: Algorithmic Development > Biological Data Mining; Technologies > Classification; Technologies > Machine Learning
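As an illustration of what a contact map is, the following minimal sketch builds one from per-residue coordinates and lists the contacts that lie away from the main diagonal. It is not the paper's triangle subdivision method, which the abstract does not specify. The function names, the 8 Å distance cutoff, and the minimum sequence separation of 4 are assumptions chosen for the example.

```python
import math

def contact_map(coords, threshold=8.0):
    """Build a symmetric 0/1 contact matrix from residue coordinates.

    coords: list of (x, y, z) tuples, one per residue (hypothetical input).
    threshold: distance cutoff in angstroms (assumed value, not from the paper).
    The diagonal is left at 0, so self-contacts are not recorded.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                cmap[i][j] = cmap[j][i] = 1
    return cmap

def off_diagonal_contacts(cmap, min_separation=4):
    """Return contacts (i, j) whose sequence separation j - i is at least min_separation."""
    n = len(cmap)
    return [(i, j)
            for i in range(n)
            for j in range(i + min_separation, n)
            if cmap[i][j]]

# Toy hairpin: eight residues spaced 3.8 angstroms apart, folding back on themselves.
hairpin = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0),
           (11.4, 3.8, 0.0), (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
print(off_diagonal_contacts(contact_map(hairpin)))
```

Contacts near the diagonal mostly reflect neighbours along the chain, while off-diagonal contacts record residues that are far apart in sequence but close in space, which is the kind of interaction the abstract says the method characterizes.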
The problem of choosing a team for a given project or task with minimum communication cost is known as the team formation problem, for which many algorithms have been proposed in the literature. The skill-centric algorithms in the literature start by searching for suitable experts for each skill. These algorithms are very slow because the search requires shortest path calculations. We propose that, by considering the topology of the underlying social network, the algorithms can be made more efficient. We contribute two algorithms for team formation in this paper, namely TPLRandom and TPLClosest, which exploit the power law of the degree distribution of the social network to form a team. The proposed algorithms are based on the idea generally adopted when a team is formed for real-world challenges: a team leader is identified first, who then chooses team members possessing the skills required for the task. The algorithms choose high-degree nodes from the heavy tail of the degree distribution to act as leaders. The leaders form teams from their own neighbourhoods, and the team with the lowest communication cost is chosen as the best team. This is an entirely novel approach to the team formation problem. We show that these high-degree experts and their neighbours cover a large number of the skills required for the task, reducing the expensive computations and thus yielding a fast and scalable algorithm. The experimentation is carried out on the well-known DBLP data set. We also build a much larger benchmark data set from DBLP for experimentation. Our algorithms TPLClosest and TPLRandom provide teams with significantly lower communication costs. They also surpass conventional algorithms such as MinLD and MinSD in terms of execution time.
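The leader-first idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the published TPLRandom or TPLClosest procedure: the function name `form_team`, the weighted adjacency representation, the `num_leaders` parameter, and the cost definition (sum of leader-to-member edge weights) are all hypothetical choices made for the example.

```python
def form_team(graph, skills, task, num_leaders=2):
    """Leader-first team formation sketch.

    graph: {node: {neighbour: edge_weight}} (assumed representation).
    skills: {node: set of skills held by that node}.
    task: set of skills the team must cover.
    Returns (team, cost) for the cheapest feasible team, or None if no
    candidate leader can cover the task from its own neighbourhood.
    """
    # Candidate leaders: the highest-degree nodes, i.e. the heavy tail
    # of the degree distribution.
    leaders = sorted(graph, key=lambda n: len(graph[n]), reverse=True)[:num_leaders]
    best = None
    for leader in leaders:
        team, cost, feasible = {leader}, 0.0, True
        for skill in sorted(task):
            # Skip skills that someone already on the team holds.
            if any(skill in skills.get(m, set()) for m in team):
                continue
            # Direct neighbours of the leader holding this skill; no
            # shortest-path search is needed.
            holders = [(w, n) for n, w in graph[leader].items()
                       if skill in skills.get(n, set())]
            if not holders:
                feasible = False
                break
            weight, member = min(holders)  # "closest" neighbour (assumed rule)
            team.add(member)
            cost += weight
        if feasible and (best is None or cost < best[1]):
            best = (team, cost)
    return best

# Hypothetical toy network with three skills.
graph = {
    "a": {"b": 1.0, "c": 2.0, "d": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"a": 2.0, "b": 1.0, "e": 1.0},
    "d": {"a": 1.0},
    "e": {"c": 1.0},
}
skills = {"a": {"ml"}, "b": {"db"}, "c": {"db", "viz"}, "d": {"viz"}, "e": {"ml"}}
print(form_team(graph, skills, {"ml", "db", "viz"}))
```

Because each leader only inspects its own neighbourhood, the per-leader work is proportional to the leader's degree times the number of required skills, which is where the claimed saving over shortest-path-based search would come from.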
Given a graph G = (V, E), the Graph Burning problem is to find a sequence of nodes from V, called a burning sequence, that burns the whole graph. This is a discrete-step process: in each step, an unburned vertex is selected as an agent to spread fire to its neighbors and is marked as a burnt node. A node that is burnt spreads the fire to its neighbors at the next step. The goal is to find a burning sequence of minimum length. The Graph Burning problem is NP-hard for general graphs and even for binary trees. A few approximation results are known, including a 3-approximation algorithm for general graphs and a 2-approximation algorithm for trees. In this paper, we propose an approximation algorithm for trees that produces a burning sequence of length at most 1.75b(T) + 1, where b(T) is the length of the optimal burning sequence, also called the burning number of the tree T. In other words, we achieve an approximation factor of (1.75b(T) + 1)/b(T), that is, 1.75 + 1/b(T).
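The burning process described above can be simulated directly. The sketch below checks whether a given sequence burns a graph and finds the burning number of a small graph by exhaustive search; it does not implement the paper's approximation algorithm. The function names and the adjacency-dictionary representation are assumptions, and the checker does not enforce that each selected vertex is still unburned when it is chosen.

```python
from itertools import permutations

def burns_graph(adj, sequence):
    """Return True if igniting the vertices of `sequence`, one per step,
    leaves every vertex burnt after the last step.

    adj: {vertex: iterable of neighbouring vertices} (assumed representation).
    """
    burnt = set()
    for source in sequence:
        # Vertices burnt in earlier steps spread the fire to their neighbours.
        burnt |= {v for u in burnt for v in adj[u]}
        # The next vertex of the sequence is then ignited.
        burnt.add(source)
    return len(burnt) == len(adj)

def burning_number(adj):
    """Exhaustive search for the shortest burning sequence.

    Exponential time, so only suitable for very small graphs.
    """
    vertices = list(adj)
    for k in range(1, len(vertices) + 1):
        if any(burns_graph(adj, seq) for seq in permutations(vertices, k)):
            return k
    return 0  # empty graph

# Path on four vertices: 0 - 1 - 2 - 3.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(burns_graph(path4, (1, 3)), burning_number(path4))
```

For a tree T, a sequence returned by an algorithm with the stated guarantee would pass `burns_graph` and have length at most 1.75b(T) + 1, where b(T) is the value `burning_number` computes.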