Text classification (or text categorization) is a popular machine learning task that consists of assigning categories to documents. In this paper, we compare state-of-the-art classifiers and state-of-the-art feature weighting methods, which are classic tools in text categorization. We extend previous studies by evaluating numerous term weighting schemes with state-of-the-art classification methods. We aim to provide a complete survey of text classification for fair benchmark comparisons.
In text classification, terms are given weights using a Term Weighting Scheme (TWS) in order to improve classification performance. Multi-label classification tasks are generally simplified into several single-label binary tasks. Thus, the term distribution is considered only in terms of positive and negative categories. In this paper, we propose a new TWS based on the information gain measure for multi-label classification tasks. This TWS tries to overcome this shortcoming without affecting the complexity of the problem. We compare our proposed TWS with eight well-known TWSs on two popular problems using five learning algorithms. In our experimental results, the proposed method outperforms the other methods, especially with regard to the macro-averaging measure.
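As a point of reference for the binary simplification described above, the following is a minimal sketch of the textbook information gain of a term for one positive/negative category, combined with raw term frequency. It is not the TWS proposed in the paper, whose formula is not given here; the function names and the `tf * IG` combination are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(probabilities):
    """Shannon entropy in bits, skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def information_gain(a, b, c, d):
    """Information gain of a term for one binary (positive/negative) category.

    a: positive documents containing the term
    b: positive documents without the term
    c: negative documents containing the term
    d: negative documents without the term
    """
    n = a + b + c + d
    if n == 0:
        return 0.0
    prior = entropy([(a + b) / n, (c + d) / n])
    conditional = 0.0
    for pos, neg in ((a, c), (b, d)):  # term present, then term absent
        size = pos + neg
        if size:
            conditional += (size / n) * entropy([pos / size, neg / size])
    return prior - conditional


def tf_ig_vector(tokens, ig_by_term):
    """Weight each term of one document by raw term frequency times IG.

    The tf * IG combination is an illustrative assumption, not the paper's scheme.
    """
    counts = Counter(tokens)
    return {term: tf * ig_by_term.get(term, 0.0) for term, tf in counts.items()}
```

A term that appears in every positive document and in no negative document gets the maximum gain for a balanced category, while a term spread evenly over both classes gets a gain of zero.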
Monte-Carlo Tree Search (MCTS) is a popular technique for playing multi-player games. In this paper, we propose a new method to bias the playout policy of MCTS. The idea is to prune the decisions that seem "bad" (according to the previous iterations of the algorithm) before computing each playout. Thus, the method evaluates the estimated "good" moves more precisely. We tested our improvement on the game of Havannah and compared it to several classic improvements. Our method outperforms the classic version of MCTS (with the RAVE improvement) and the different MCTS playout policies that we experimented with.
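The pruning idea can be sketched roughly as follows. This is only one possible reading of the abstract, not the authors' algorithm: the per-move `(wins, visits)` statistics, the `min_visits` and `threshold` parameters, and the fallback rule are all assumptions introduced for illustration.

```python
import random


def choose_playout_move(legal_moves, stats, rng, min_visits=10, threshold=0.3):
    """Pick a playout move uniformly among moves that do not look "bad".

    stats maps a move to (wins, visits) gathered in earlier iterations.
    min_visits and threshold are illustrative values, not taken from the paper.
    """
    kept = []
    for move in legal_moves:
        wins, visits = stats.get(move, (0, 0))
        # Prune only moves with enough evidence and a low observed win rate.
        if visits > 0 and visits >= min_visits and wins / visits < threshold:
            continue
        kept.append(move)
    # Never prune everything: fall back to the full move list.
    return rng.choice(kept or list(legal_moves))
```

Moves that have not been tried often enough are kept, so the filter only removes decisions for which earlier iterations provide some evidence.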
A Term-Weighting Scheme (TWS) is an important step in text classification. It determines how documents are represented in the Vector Space Model (VSM). Even though state-of-the-art TWSs perform well, many recent works propose new approaches and new TWSs that improve performance. Furthermore, it is still difficult to tell which TWS is well suited to a specific problem. In this paper, we are interested in automatically generating new TWSs with the help of evolutionary algorithms, and especially genetic programming (GP). GP evolves and combines different statistical information and generates a new TWS based on the performance of the learning method. We evaluate the generated TWSs on three well-known benchmarks. Our study shows that even early generated formulas are quite competitive with state-of-the-art TWSs and in some cases even outperform them.
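To make the idea of a generated TWS concrete, a candidate formula can be represented as an expression tree over term statistics, which is the kind of structure GP typically evolves. The sketch below only shows how such a tree might be evaluated; the function set, the statistic names (`tf`, `df`, `N`), and the protected operators are hypothetical choices, since the abstract does not list the ones actually used.

```python
import math

# Hypothetical building blocks; the source does not list the actual sets.
FUNCTIONS = {
    "add": lambda x, y: x + y,
    "mul": lambda x, y: x * y,
    "div": lambda x, y: x / y if y != 0 else 1.0,  # protected division
    "log": lambda x: math.log(1.0 + abs(x)),       # protected logarithm
}


def evaluate(tree, stats):
    """Evaluate a candidate formula on the statistics of one (term, document) pair.

    A tree is a number, a statistic name (str), or a tuple
    (function_name, child, ...).
    """
    if isinstance(tree, (int, float)):
        return float(tree)
    if isinstance(tree, str):
        return float(stats[tree])
    name, *children = tree
    return FUNCTIONS[name](*(evaluate(child, stats) for child in children))


# tf * log(1 + N / df): a tf-idf-like formula expressed as a tree.
TF_IDF_LIKE = ("mul", "tf", ("log", ("div", "N", "df")))

weight = evaluate(TF_IDF_LIKE, {"tf": 3, "N": 1000, "df": 10})
```

In a GP loop, each such tree would be scored by training the learning method on documents weighted with it, and the best trees would be recombined and mutated; that loop is omitted here.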
Accurate demand forecasting has always been essential for retailers to survive in the highly competitive, volatile modern market. However, anticipating product demand is an extremely difficult task in the context of short product life cycles in which consumer demand is influenced by many heterogeneous variables. During the COVID-19 pandemic in particular, with all its related new constraints, the fashion industry has seen a huge decline in sales, which makes it difficult for existing sales forecasting methods to accurately predict new product sales. This paper proposes an original sales forecasting framework capable of considering the effect of the COVID-19 crisis on sales. The proposed framework combines clustering, classification, and regression. The main goals of this framework are (1) to predict a sales pattern for each item based on its attributes and (2) to correct it by modelling the impact of the crisis on sales. We evaluate our proposed framework using a real-world dataset from a French fashion retailer with omnichannel sales. Although online sales were still possible during the lockdown period, consumer purchases were significantly impacted by this crisis. Experimental analysis shows that our methodology learns the impact of the crisis on consumer behavior from online sales and then adapts the sales forecasts already obtained.