The k-nearest neighbor (k-NN) classifier is one of the most widely used methods of classification due to several interesting features, including good generalization and easy implementation. Although simple, it is usually able to match and even outperform more sophisticated and complex methods. One of the problems with this approach is fixing the appropriate value of k. Although a good value might be obtained using cross validation, it is unlikely that the same value could be optimal for the whole space spanned by the training set. It is evident that different regions of the feature space would require different values of k due to the different distributions of prototypes. The situation of a query instance in the center of a class is very different from the situation of a query instance near the boundary between two classes. In this brief, we present a simple yet powerful approach to setting a local value of k. We associate a potentially different k to every prototype and obtain the best value of k by optimizing a criterion consisting of the local and global effects of the different k values in the neighborhood of the prototype. The proposed method has a fast training stage and the same complexity as the standard k-NN approach at the testing stage. The experiments show that this simple approach can significantly outperform the standard k-NN rule for both standard and class-imbalanced problems in a large set of different problems.
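The idea of one k per prototype can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the criterion, so the code below assumes a weighted sum of leave-one-out accuracy in a small neighborhood of the prototype (local effect) and over the whole training set (global effect). It also assumes that a query borrows the k of its nearest prototype. The names `fit_local_k`, `predict`, `region`, and `weight` are hypothetical.

```python
import math
from collections import Counter


def sorted_neighbors(X, q, skip=None):
    """Indices of X ordered by Euclidean distance to q, optionally skipping one index."""
    idx = [j for j in range(len(X)) if j != skip]
    return sorted(idx, key=lambda j: math.dist(X[j], q))


def vote(y, idx, k):
    """Majority label among the first k indices; ties go to the label seen first."""
    return Counter(y[j] for j in idx[:k]).most_common(1)[0][0]


def fit_local_k(X, y, k_values=(1, 3, 5), region=5, weight=0.5):
    """Assign one k to every prototype (assumed criterion, see the text above)."""
    n = len(X)
    # Neighbors of each prototype, excluding the prototype itself (leave-one-out).
    nbrs = [sorted_neighbors(X, X[i], skip=i) for i in range(n)]
    # hit[k][i] is 1 when prototype i is classified correctly with that k.
    hit = {k: [int(vote(y, nbrs[i], k) == y[i]) for i in range(n)] for k in k_values}
    global_acc = {k: sum(hit[k]) / n for k in k_values}

    local_k = []
    for i in range(n):
        zone = [i] + nbrs[i][:region]  # the prototype and its closest neighbors

        def score(k):
            local_acc = sum(hit[k][j] for j in zone) / len(zone)
            return weight * local_acc + (1 - weight) * global_acc[k]

        # max() keeps the first maximum, so ties favor the earliest k in k_values.
        local_k.append(max(k_values, key=score))
    return local_k


def predict(X, y, local_k, q):
    """Classify q with the k stored at its nearest prototype (an assumption)."""
    idx = sorted_neighbors(X, q)
    return vote(y, idx, local_k[idx[0]])


if __name__ == "__main__":
    X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    ks = fit_local_k(X, y, k_values=(1, 3, 5), region=3)
    print(ks, predict(X, y, ks, (0.2, 0.1)))
```

In this sketch the test stage performs one neighbor search per query, as in standard k-NN; the brute-force training loop is written for readability rather than speed.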
The social impact of new technologies has produced major changes in all areas of society, giving rise to the concept of a smart city supported by electronic infrastructure, telecommunications, and information technology. This paper presents a review of Bluetooth Low Energy (BLE), Near Field Communication (NFC) and Visible Light Communication (VLC) and their use and influence within different areas of the development of the smart city. The document also presents a review of Big Data solutions for the management of information and the extraction of knowledge in an environment where things are connected by an “Internet of Things” (IoT) network. Lastly, we present how these technologies can be combined to benefit the development of the smart city.
The large size of chemical databases and the high computational cost of the atom-by-atom comparison of molecular structures needed to calculate the similarity between two chemical compounds call for new clustering models. The aim of such models is to reduce the time needed to retrieve from a database the set of molecules that fall within a given range of similarity to a query molecule. In this paper we use the information about the cycles present in the structure of molecules as an approach to the classification of chemical databases. The proposed clustering method is based on representing the topological structure of the molecules stored in chemical databases through their corresponding cycle graphs. This method behaves better than other methods described in the literature that also use information about the cyclicity of the molecules.
In this paper we propose a new algorithm for subgraph isomorphism based on the representation of molecular structures as colored graphs and the representation of these graphs as vectors in n-dimensional spaces. The presented process that obtains all maximum common substructures is based on the solution of a constraint satisfaction problem defined as the common m-dimensional space (m ≤ n) in which the vectors representing the matched graphs can be defined.