Medical predictive modeling is challenging because patients are heterogeneous. To build effective medical predictive models, we need to account for this heterogeneity during modeling and allow patients to have their own personalized models instead of using a one-size-fits-all model. However, building a personalized model for each patient is computationally expensive, and the resulting over-parametrization makes the model susceptible to overfitting. To address these challenges, we propose a novel approach called FactORized MUlti-task LeArning model (Formula), which learns the personalized model of each patient via a sparse multi-task learning method. The personalized models are assumed to share a low-rank representation, known as the base models. Formula is designed to simultaneously learn the base models as well as the personalized model of each patient, where the latter is a linear combination of the base models. We have performed extensive experiments to evaluate the proposed approach on a real medical data set. The proposed approach delivered superior predictive performance, while the personalized models offered many useful medical insights.
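The low-rank structure described in this abstract can be illustrated with a minimal NumPy sketch. It shows only how a personalized model is composed as a linear combination of shared base models and how much that reduces the parameter count. It does not reproduce Formula's objective, sparsity penalty, or optimization, none of which are given here. The names `base_models`, `mixing`, `personalized_weights`, and `predict`, and all the sizes, are assumptions made for illustration.

```python
import numpy as np


def personalized_weights(base_models, mixing):
    """Return one weight vector per patient, stored as columns.

    base_models: (n_features, k) array, one hypothetical base model per column.
    mixing:      (k, n_patients) array, per-patient combination coefficients.
    """
    return base_models @ mixing


def predict(base_models, mixing, x, patient):
    """Linear score for feature vector x under one patient's personalized model."""
    w = base_models @ mixing[:, patient]
    return float(x @ w)


# Illustrative sizes only (assumed, not taken from the paper).
rng = np.random.default_rng(0)
n_features, k, n_patients = 20, 3, 100

B = rng.normal(size=(n_features, k))   # shared base models
C = rng.normal(size=(k, n_patients))   # per-patient mixing coefficients
W = personalized_weights(B, C)         # all personalized models, rank at most k

x = rng.normal(size=n_features)        # one feature vector
score = predict(B, C, x, patient=7)

# Parameters needed without and with the low-rank factorization.
full_params = n_features * n_patients
factored_params = n_features * k + k * n_patients
print(full_params, factored_params)    # prints "2000 360"
```

Under these assumed sizes, each patient contributes only `k` coefficients instead of a full weight vector, which is the sense in which sharing base models limits over-parametrization.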
Collaborative filtering has been widely used in modern recommender systems to provide accurate recommendations by leveraging historical interactions between users and items. Cold-start items and users pose a major challenge for recommender systems based on collaborative filtering, because no such interaction information is available for them. The factorization machine is a powerful tool designed to tackle the cold-start problem by learning a bilinear ranking model that utilizes content information about users and items and exploits the interactions among these content features. While a factorization machine uses all possible interactions between all content features to make recommendations, many of the features and their interactions are not predictive, and incorporating them in the model degrades the generalization performance of the recommender system. In this paper, we propose an efficient Sparse Factorization Machine (SFM) that simultaneously identifies relevant user and item content features, models interactions between these relevant features, and learns a bilinear model using only these synergistic interactions. We have carried out extensive empirical studies on both synthetic and real-world datasets and compared our method to state-of-the-art baselines, including the Factorization Machine. Experimental results show that SFM can greatly outperform these baselines.
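For readers unfamiliar with the baseline, the following is a minimal sketch of the standard second-order factorization machine scoring function, in which every pair of features interacts through the inner product of their latent vectors. It is not the SFM proposed in this abstract, whose feature-selection mechanism is not described here. The function names and the random test inputs are assumptions for illustration.

```python
import numpy as np


def fm_score(x, w0, w, V):
    """Second-order factorization machine score, computed in O(n * k).

    x:  (n,) feature vector (e.g. concatenated user and item content features).
    w0: scalar bias.
    w:  (n,) linear weights.
    V:  (n, k) latent factors; row i is the latent vector of feature i.
    """
    linear = w0 + x @ w
    xv = x @ V                     # sum_i V[i, f] * x[i], for each factor f
    x2v2 = (x ** 2) @ (V ** 2)     # sum_i V[i, f]^2 * x[i]^2, for each factor f
    pairwise = 0.5 * np.sum(xv ** 2 - x2v2)
    return float(linear + pairwise)


def fm_score_naive(x, w0, w, V):
    """Same score with the pairwise sum over i < j written out explicitly."""
    score = w0 + x @ w
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            score += (V[i] @ V[j]) * x[i] * x[j]
    return float(score)


# Hypothetical inputs for a quick consistency check.
rng = np.random.default_rng(0)
n, k = 8, 3
x = rng.normal(size=n)
w0 = 0.5
w = rng.normal(size=n)
V = rng.normal(size=(n, k))

fast = fm_score(x, w0, w, V)
slow = fm_score_naive(x, w0, w, V)
```

The two functions compute the same quantity. The first uses the usual algebraic rearrangement of the pairwise sum, which avoids the explicit loop over all feature pairs.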
In climate and environmental sciences, vast amounts of spatio-temporal data have been generated at varying spatial resolutions from satellite observations and computer models. Integrating such diverse sources of data has proven useful for building prediction models, as the multi-scale data may capture different aspects of the Earth system. In this paper, we present a novel framework called MUSCAT for predictive modeling of multi-scale, spatio-temporal data. MUSCAT performs a joint decomposition of multiple tensors from different spatial scales, taking into account the relationships between the variables. The latent factors derived from the joint tensor decomposition are used to train spatial and temporal prediction models at different scales for each location. The outputs from this ensemble of spatial and temporal models are aggregated to generate future predictions. An incremental learning algorithm is also proposed to handle the massive size of the tensors. Experimental results on real-world data from the United States Historical Climatology Network (USHCN) showed that MUSCAT outperformed competing methods at more than 70% of the locations.