A number of feature selection metrics have been explored in text categorization, among which information gain (IG), chi-square (CHI), correlation coefficient (CC) and odds ratio (OR) are considered the most effective. CC and OR are one-sided metrics, while IG and CHI are two-sided. Feature selection using one-sided metrics selects only the features most indicative of membership, while feature selection using two-sided metrics implicitly combines the features most indicative of membership (i.e., positive features) and of nonmembership (i.e., negative features) by ignoring the signs of the features. The former never considers the negative features, which are quite valuable, while the latter cannot ensure the optimal combination of the two kinds of features, especially on imbalanced data. In this work, we investigate the usefulness of explicitly controlling that combination within a proposed feature selection framework. Using multinomial naïve Bayes and regularized logistic regression as classifiers, our experiments show both the great potential and the actual merits of explicitly combining positive and negative features in a nearly optimal fashion according to the imbalanced data.
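The distinction between one-sided and two-sided metrics can be illustrated with a minimal Python sketch. It assumes the standard 2x2 contingency-table definitions of CC and CHI (under which CHI is the square of CC, so the sign is lost), and it uses a hypothetical `positive_fraction` parameter to stand in for the explicit control of the positive/negative mix. The function names and the selection rule are illustrative assumptions, not the framework proposed in the paper.

```python
import math

def correlation_coefficient(a, b, c, d):
    """One-sided CC score for a term and a category.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return math.sqrt(n) * (a * d - c * b) / math.sqrt(denom)

def chi_square(a, b, c, d):
    """Two-sided CHI score: the square of CC, so the sign is discarded."""
    return correlation_coefficient(a, b, c, d) ** 2

def select_features(scores, k, positive_fraction):
    """Pick k features with an explicit positive/negative split.

    scores: dict mapping feature -> signed score (e.g. CC).
    positive_fraction: hypothetical knob; the share of the k slots
    given to positive features, the rest going to negative features.
    """
    n_pos = round(k * positive_fraction)
    n_neg = k - n_pos
    ranked = sorted(scores, key=scores.get, reverse=True)
    positives = [f for f in ranked if scores[f] > 0][:n_pos]
    negatives = [f for f in reversed(ranked) if scores[f] < 0][:n_neg]
    return positives, negatives
```

Setting `positive_fraction=1.0` reduces this to plain one-sided selection, while ranking by `chi_square` alone would mix the two kinds of features without any control over their proportion.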
Recently Non-negative Matrix Factorization (NMF)
Clustering on multi-type relational data has attracted more and more attention in recent years due to its high impact on various important applications, such as Web mining, e-commerce and bioinformatics. However, research on general multi-type relational data clustering is still limited and preliminary. The contribution of the paper is three-fold. First, we propose a general model, the collective factorization on related matrices, for multi-type relational data clustering. The model is applicable to relational data with various structures. Second, under this model, we derive a novel algorithm, the spectral relational clustering, to cluster multi-type interrelated data objects simultaneously. The algorithm iteratively embeds each type of data object into a low-dimensional space and benefits from the interactions among the hidden structures of the different types of data objects. Extensive experiments demonstrate the promise and effectiveness of the proposed algorithm. Third, we show that the existing spectral clustering algorithms can be considered special cases of the proposed model and algorithm. This demonstrates the theoretical generality of the proposed model and algorithm.
We compare three methods for computing invariant Lyapunov exponents (LEs) in general relativity. They involve the geodesic deviation vector technique (M1), the two-nearby-orbits method with projection operations and with coordinate time as an independent variable (M2), and the two-nearby-orbits method without projection operations and with proper time as an independent variable (M3). An analysis indicates that M1 and M3 do not need any projection operation. In general, the values of LEs from the three methods are almost the same. As an advantage, M3 is simpler to use than M2. In addition, we propose to construct the invariant fast Lyapunov indicator (FLI) with two nearby trajectories and give its algorithm in order to quickly distinguish chaos from order. Taking a static axisymmetric spacetime as a physical model, we apply the invariant FLIs to explore the global dynamics of phase space of the system, where regions of chaos and order are clearly identified.
PACS numbers: 95.10.Fh, 95.30.Sf
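The two-nearby-orbits idea can be sketched in Python on a toy system. This is not the relativistic setting of the paper: the logistic map is swapped in as a stand-in, and it involves no geodesics, proper time or projection operations. The function names, the initial separation `d0` and the step counts are illustrative assumptions. The sketch follows a reference orbit and a shadow orbit, accumulates the logarithmic stretching of their separation, and rescales the shadow orbit back to distance `d0` after every step.

```python
import math

def logistic(x, r):
    """One step of the logistic map, used here only as a toy system."""
    return r * x * (1.0 - x)

def lyapunov_two_orbits(r, x0=0.3, d0=1e-8, n_steps=20000, n_transient=1000):
    """Estimate the Lyapunov exponent from two nearby orbits."""
    x = x0
    # Discard a transient so the reference orbit settles on the attractor.
    for _ in range(n_transient):
        x = logistic(x, r)
    y = x + d0  # shadow orbit starts at distance d0
    log_sum = 0.0
    counted = 0
    for _ in range(n_steps):
        x = logistic(x, r)
        y = logistic(y, r)
        d = abs(y - x)
        if d == 0.0:
            # The orbits coincided numerically; re-seed and skip this step.
            y = x + d0
            continue
        log_sum += math.log(d / d0)
        counted += 1
        # Renormalize: pull the shadow orbit back to distance d0,
        # keeping the direction of the current separation.
        y = x + d0 * (y - x) / d
    return log_sum / counted

if __name__ == "__main__":
    print("r = 4.0 (chaotic):", lyapunov_two_orbits(4.0))
    print("r = 3.2 (periodic):", lyapunov_two_orbits(3.2))
```

For `r = 4.0` the estimate should be positive and close to the known value ln 2 for this map, while for `r = 3.2` (a stable period-2 cycle) it should be negative. The sign of the exponent is what separates chaos from order in this toy case.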