Interval-valued data (IVD) is a kind of data in which each feature is an interval. The midpoint and boundary methods are the two most commonly used representations of IVD. However, because each adopts only the midpoints or only the endpoints, the structural information it captures (such as location and size) may be incomplete, which can lead to poor data-processing results. To better depict the structural information of IVD, a unified representation frame (URF) for IVD is proposed. It takes into account not only the size and location information but also the relationship between them. The frame can also express the midpoint and boundary methods in a unified way. In addition, symmetrical uncertainty (SU) is adopted to quantitatively measure the relationship between features and classes, and irrelevant features are eliminated based on SU. The proposed URF_SU is applied with several traditional classifiers, such as LIBSVM, CART, and KNN. The experimental results on synthetic and real-world datasets demonstrate that the proposed approach is more effective than other IVD representation methods in classification tasks.

INDEX TERMS Interval-valued data, unified representation frame, symmetrical uncertainty, feature selection.
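As a rough illustration of the feature-selection step, symmetrical uncertainty between a feature X and the class Y is commonly defined as SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), where H is Shannon entropy and I is mutual information. The sketch below is not the paper's implementation: it assumes the features have already been discretized, and the helper names (`midpoint_radius`, `select_features`) and the `threshold` parameter are hypothetical. The URF itself is not reproduced here, since the abstract does not define it; `midpoint_radius` only shows the ordinary location/size view of a single interval.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); ranges from 0 to 1."""
    h_x = entropy(x)
    h_y = entropy(y)
    if h_x + h_y == 0:
        # Both variables are constant, so there is nothing to measure.
        return 0.0
    h_xy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy   # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


def midpoint_radius(lower, upper):
    """Location (midpoint) and size (half-width) of one interval.

    Hypothetical helper: this is the plain location/size view of an
    interval, not the paper's unified representation frame.
    """
    return (lower + upper) / 2.0, (upper - lower) / 2.0


def select_features(columns, labels, threshold):
    """Keep the features whose SU with the class labels exceeds threshold.

    `columns` maps a feature name to its list of discretized values.
    The threshold is an assumption; the abstract does not state one.
    """
    return [
        name
        for name, col in columns.items()
        if symmetrical_uncertainty(col, labels) > threshold
    ]
```

For example, with labels `[0, 0, 1, 1]`, a feature identical to the labels has an SU of 1 and is kept, while a feature such as `[0, 1, 0, 1]`, which is statistically independent of the labels, has an SU of 0 and is dropped for any non-negative threshold.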