Motivation: While profile hidden Markov models (HMMs) are successful and powerful methods to recognize homologous proteins, they can break down when homology becomes too distant due to lack of sufficient training data. We show that we can improve the performance of HMMs in this domain by using a simple simulated model of evolution to create an augmented training set.
Results: We show, in two different remote protein homolog tasks, that HMMs whose training is augmented with simulated evolution outperform HMMs trained only on real data. We find that a mutation rate between 15% and 20% performs best for recognizing G-protein coupled receptor proteins in different classes, and for recognizing SCOP super-family proteins from different families.
Contacts: anoop.kumar@tufts.edu; lenore.cowen@tufts.edu
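The augmentation idea can be illustrated with a minimal sketch. The abstract does not specify the evolution model, so everything here is an assumption made for illustration: independent point substitutions drawn uniformly from the 20 standard amino acids, no insertions or deletions, and the hypothetical names `mutate` and `augment`. The default rate of 0.15 is taken from the lower end of the 15% to 20% range reported above.

```python
import random

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(sequence, rate, rng):
    """Return a copy of `sequence` in which each residue is replaced,
    independently with probability `rate`, by a different amino acid.

    Assumption: uniform substitution; the original work may use a
    different (e.g. substitution-matrix based) model.
    """
    mutated = []
    for residue in sequence:
        if rng.random() < rate:
            # Pick uniformly among the residues that differ from the original.
            choices = [aa for aa in AMINO_ACIDS if aa != residue]
            mutated.append(rng.choice(choices))
        else:
            mutated.append(residue)
    return "".join(mutated)


def augment(sequences, rate=0.15, copies=2, seed=0):
    """Return the real sequences followed by `copies` simulated
    descendants of each one, for use as an enlarged HMM training set."""
    rng = random.Random(seed)
    augmented = list(sequences)
    for seq in sequences:
        for _ in range(copies):
            augmented.append(mutate(seq, rate, rng))
    return augmented
```

The resulting list would then be passed to whatever profile HMM training tool is in use; that step is outside the scope of this sketch.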
In recent years, Twitter has become one of the most important modes for social networking and disseminating content on a variety of topics. It has developed into a popular medium for political discourse and social organization during elections. There has been a growing body of literature demonstrating the ability to predict the outcome of elections from Twitter data. This work aims to test the predictive power of Twitter in inferring the winning candidate and vote percentages of the candidates in an election. Our prediction is based on the number of times the name of a candidate is mentioned in tweets prior to elections. We develop new methods to augment the counts by counting not only the presence of candidates' official names but also their aliases and commonly appearing names. In addition, we devise a technique to include relevant tweets and filter out irrelevant ones based on a predefined set of keywords. Our approach is successful in predicting the winner of all three presidential elections held in Latin America during the months of
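The counting approach described in this abstract (alias-aware mention counts, restricted to tweets that match a predefined keyword set) can be sketched roughly as follows. This is an illustrative sketch only: the function names `count_mentions` and `vote_shares`, the choice to count each tweet at most once per candidate, and the conversion of counts to percentages by simple normalization are all assumptions, not details taken from the original work.

```python
import re


def count_mentions(tweets, aliases, keywords):
    """Count, per candidate, the tweets that mention any of their aliases.

    tweets:   iterable of tweet texts
    aliases:  dict mapping a candidate's official name to a list of
              names to match (official name, aliases, common names)
    keywords: tweets containing none of these are treated as irrelevant
    """
    counts = {candidate: 0 for candidate in aliases}
    for tweet in tweets:
        text = tweet.lower()
        # Relevance filter: keep only tweets containing a keyword.
        if not any(keyword.lower() in text for keyword in keywords):
            continue
        for candidate, names in aliases.items():
            # Whole-word, case-insensitive match on any alias.
            # Assumption: a tweet counts once per candidate.
            if any(
                re.search(r"\b" + re.escape(name.lower()) + r"\b", text)
                for name in names
            ):
                counts[candidate] += 1
    return counts


def vote_shares(counts):
    """Convert mention counts into predicted vote percentages
    (assumption: share of mentions is used directly as share of votes)."""
    total = sum(counts.values())
    if total == 0:
        return {candidate: 0.0 for candidate in counts}
    return {candidate: 100.0 * n / total for candidate, n in counts.items()}
```

The predicted winner under this sketch is simply the candidate with the largest count, for example `max(counts, key=counts.get)`.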