Data plays a key role in the design of expert and intelligent systems; data preprocessing is therefore a critical step in producing high-quality data and building accurate machine learning models. Over the past decades, increasing attention has been paid to the issue of class imbalance, which is now a research hotspot in a variety of fields. Although resampling methods, which either under-sample the majority class or over-sample the minority class, stand among the most powerful techniques for addressing this problem, their strengths and weaknesses have typically been discussed based only on the class imbalance ratio. However, several questions remain open and need further exploration. For instance, the subtle differences in performance between over- and under-sampling algorithms are still poorly understood, and we hypothesize that they could be better explained by analyzing the inner structure of the data sets. Consequently, this paper investigates and illustrates the effects of resampling methods on the inner structure of a data set by exploiting local neighborhood information, identifying the sample types in both classes, and analyzing their distribution in each resampled set. Experimental results indicate that the resampling methods that pro
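The idea of identifying sample types from local neighborhood information can be illustrated with a minimal sketch. It assumes one common convention from the imbalanced-learning literature: with k = 5 nearest neighbors, a sample is "safe" if 4 or 5 neighbors share its class, "borderline" if 2 or 3 do, "rare" if 1 does, and an "outlier" if none do. The abstract does not state which categories, thresholds, or distance measure the authors use, so the function name `sample_types`, the thresholds, and the toy data are all assumptions.

```python
import math
from collections import Counter


def sample_types(points, labels, k=5):
    """Label each sample by how many of its k nearest neighbors share its class."""
    types = []
    for i, (p, y) in enumerate(zip(points, labels)):
        # Brute-force Euclidean distances to every other sample (fine for a sketch).
        others = [
            (math.dist(p, q), labels[j])
            for j, q in enumerate(points)
            if j != i
        ]
        others.sort(key=lambda pair: pair[0])
        same = sum(1 for _, lab in others[:k] if lab == y)
        # Assumed thresholds for k = 5; not taken from the abstract.
        if same >= 4:
            types.append("safe")
        elif same >= 2:
            types.append("borderline")
        elif same == 1:
            types.append("rare")
        else:
            types.append("outlier")
    return types


# Hypothetical toy data: six majority samples (class 0), one minority sample (class 1).
points = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0), (2, 1), (1, 0.5)]
labels = [0, 0, 0, 0, 0, 0, 1]

types = sample_types(points, labels)
# Distribution of sample types per class, e.g. to compare before and after resampling.
distribution = Counter(zip(labels, types))
print(distribution)
```

Running the same function on a data set before and after resampling would give the per-class distribution of sample types that this kind of analysis compares.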
Project portfolio selection is one of the most important problems faced by any organization. The decision process involves multiple conflicting criteria and has commonly been addressed with a two-phase procedure: the first phase identifies the efficient solution set, and the second supports the decision maker in selecting a single portfolio from that set. Several recent studies, however, show the advantages gained by optimizing towards a region of interest (according to the decision maker's preferences) instead of approximating the complete Pareto set. These works have nevertheless not addressed synergism and its variants, such as cannibalization and redundancy. In this paper we introduce a new approach called Non-Outranked Ant Colony Optimization, which optimizes interdependent project portfolios with a priori articulation of decision-maker preferences based on an outranking model. Several experimental tests show the advantages of our proposal over the two-phase approach, providing reasonable evidence of its potential for solving real-world large-scale problems with many objectives.
In this paper, we develop and apply a genetic algorithm to solve surgery scheduling cases in a Mexican public hospital. One of the most challenging issues here is processing containers with heterogeneous capacity. Many scheduling problems do not share this restriction; for this reason, we developed and implemented a strategy for processing heterogeneous containers in the genetic algorithm. The resulting algorithm was named “genetic algorithm for scheduling optimization” (GAfSO). GAfSO was tested with real data from a local hospital that assigns different operational times to its operating rooms throughout the week. The computational complexity of GAfSO is also analyzed. Results show that GAfSO can assign the corresponding capacity to the operating rooms while optimizing their use.
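To make the notion of heterogeneous containers concrete, the following is a minimal sketch of a genetic algorithm in which each (operating room, day) slot has its own capacity in minutes. It is not GAfSO: the abstract does not describe the encoding, operators, or fitness function, so the permutation encoding, first-fit decoder, order crossover, swap mutation, and all data below are assumptions chosen for illustration.

```python
import random

# Hypothetical data: minutes available per (room, day) slot, and surgery durations.
CAPACITY = {("OR1", "Mon"): 480, ("OR1", "Tue"): 300, ("OR2", "Mon"): 240}
DURATION = [120, 90, 200, 60, 150, 240, 45, 180]


def decode(order):
    """First-fit decoding: put each surgery in the first slot with enough time left."""
    remaining = dict(CAPACITY)
    schedule = {slot: [] for slot in CAPACITY}
    for s in order:
        for slot in remaining:
            if DURATION[s] <= remaining[slot]:
                remaining[slot] -= DURATION[s]
                schedule[slot].append(s)
                break  # surgeries that fit nowhere stay unscheduled
    return schedule


def fitness(order):
    """Total scheduled minutes; higher means better use of the rooms."""
    return sum(DURATION[s] for block in decode(order).values() for s in block)


def order_crossover(a, b, rng):
    """Keep a slice of parent a in place; fill the rest in parent b's order."""
    i, j = sorted(rng.sample(range(len(a)), 2))
    middle = a[i:j]
    rest = [g for g in b if g not in middle]
    return rest[:i] + middle + rest[i:]


def swap_mutation(order, rng, rate=0.2):
    """With probability `rate`, swap two positions in a copy of the order."""
    order = order[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
    return order


def run_ga(generations=50, pop_size=20, seed=0):
    rng = random.Random(seed)
    n = len(DURATION)
    pop = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]  # keep the better half unchanged
        children = [
            swap_mutation(order_crossover(*rng.sample(elite, 2), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=fitness)


best = run_ga()
best_schedule = decode(best)
print(best_schedule, fitness(best))
```

Because the decoder reads each slot's capacity from `CAPACITY`, changing the time assigned to a room on a given day only changes the data, not the genetic operators. This is one simple way to handle containers of unequal size; the published strategy may differ.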