Tree boosting, which combines weak learners (typically decision trees) to generate a strong learner, is a highly effective and widely used machine learning method. However, developing a high-performance tree boosting model is a time-consuming process that requires numerous trial-and-error experiments. To tackle this issue, we have developed a visual diagnosis tool, BOOSTVis, to help experts quickly analyze and diagnose the training process of tree boosting. In particular, we have designed a temporal confusion matrix visualization and combined it with a t-SNE projection and a tree visualization. These visualization components work together to provide a comprehensive overview of a tree boosting model and enable effective diagnosis of an unsatisfactory training process. Two case studies conducted on the Otto Group Product Classification Challenge dataset demonstrate that BOOSTVis can provide informative feedback and guidance to improve the understanding and diagnosis of tree boosting algorithms.
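To make the idea of a temporal confusion matrix concrete, here is a minimal sketch that computes one confusion matrix per boosting iteration. It is not BOOSTVis code: the function names and the assumption that class predictions are available after every iteration are illustrative only.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Return an n_classes x n_classes matrix; rows are true labels,
    columns are predicted labels."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def temporal_confusion(y_true: Sequence[int],
                       preds_per_iter: Sequence[Sequence[int]],
                       n_classes: int) -> List[List[List[int]]]:
    """Hypothetical helper: one confusion matrix per boosting iteration.

    preds_per_iter[k] is assumed to hold the class predictions of the
    ensemble after iteration k (an assumption, not taken from the paper).
    """
    return [confusion_matrix(y_true, preds, n_classes)
            for preds in preds_per_iter]
```

Reading the resulting list in order shows how the counts in each cell change as trees are added, which is the kind of signal a temporal view is meant to expose.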
In many applications, ideas that are described by a set of words often flow between different groups. To help users analyze this flow, we present a method for modeling flow behaviors that aims to identify the lead-lag relationships between word clusters of different user groups. In particular, an improved Bayesian conditional cointegration based on dynamic time warping is employed to learn links between words in different groups. A tensor-based technique is developed to cluster these linked words into different clusters (ideas) and track the flow of ideas. The main feature of the tensor representation is that we introduce two additional dimensions to represent both time and lead-lag relationships. Experiments on both synthetic and real datasets show that our method is more effective than methods based on traditional clustering techniques and achieves better accuracy. A case study was conducted to demonstrate the usefulness of our method in helping users understand the flow of ideas between different user groups on social media.
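The abstract above builds on dynamic time warping (DTW). As a rough illustration of that building block only, the sketch below implements the textbook DTW distance between two numeric series. It does not implement the Bayesian conditional cointegration or the tensor clustering, and the function name and the absolute-difference cost are assumptions.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping distance with |x - y| as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # step in a only
                                     cost[i][j - 1],      # step in b only
                                     cost[i - 1][j - 1])  # step in both
    return cost[n][m]
```

Because the alignment may stretch either series, a series and a delayed copy of it can still have a small DTW distance, which is why DTW is a natural starting point when one group's word usage lags another's.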
Analysis of the spatial variations in river networks and the related influencing factors is crucial for the management and protection of basins. To gain insight into the spatial variations and influencing factors of river networks between large basins, three river basins from north to south in China (Songhua River Basin, Yellow River Basin and Pearl River Basin) were selected for investigation in this study. First, river networks were extracted from a digital elevation model in ArcGIS for each of the three basins at six drainage accumulation thresholds. The optimal networks were determined by fitting the relationship between the accumulation threshold and the resulting drainage density. Then, we used two indicators, drainage density and water surface ratio, to characterize the spatial variations across the three basins. Finally, Pearson’s correlation coefficients were calculated between those two indicators and natural/human influencing factors. The results showed that drainage density and water surface ratio decreased from north to south in China and were negatively correlated with natural/human influencing factors. Drainage density was more influenced by natural factors than by human factors, while the opposite was true for water surface ratio. These findings may provide some basis for the management and protection of the river network.
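The two quantitative steps named in the abstract above can be sketched as follows. This assumes the common definition of drainage density as total channel length divided by basin area, which the abstract does not spell out, and the function names are hypothetical.

```python
import math
from typing import Sequence


def drainage_density(total_channel_length_km: float,
                     basin_area_km2: float) -> float:
    """Drainage density in km per km^2 (assumed definition:
    total channel length divided by basin area)."""
    if basin_area_km2 <= 0:
        raise ValueError("basin area must be positive")
    return total_channel_length_km / basin_area_km2


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)
```

A negative value of `pearson_r` between an indicator and an influencing factor corresponds to the negative correlation reported in the abstract.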