Abstract. This work aims to minimize the cost of answering snapshot multi-predicate queries in high-communication-cost networks. High-communication-cost (HCC) networks are a family of networks in which communicating data is very demanding in resources; for example, in wireless sensor networks, transmitting data drains the battery of the sensors involved. We address the important class of multi-predicate queries in horizontally or vertically distributed databases. We show that minimizing the communication cost for multi-predicate queries is NP-hard, and we propose a dynamic programming algorithm that computes the optimal solution for small problem instances. We also propose a low-complexity, approximate heuristic algorithm that solves larger problem instances efficiently and can run on nodes with low computational power (e.g., sensors). Finally, we present a variant of the Fermat point problem in which distances between points are minimal paths in a weighted graph, and propose a solution. An extensive experimental evaluation compares the proposed algorithms to the best known technique for evaluating queries in wireless sensor networks and shows improvements of 10% up to 95%. The low-complexity heuristic algorithm is also shown to be scalable and robust to different query characteristics and network sizes.
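The graph variant of the Fermat point problem mentioned above can be illustrated with a brute-force sketch. The abstract does not describe the authors' solution, so the following is only an assumption-laden baseline: it assumes an undirected graph with non-negative edge weights, runs Dijkstra from each terminal, and picks the node with the smallest total shortest-path distance. The names `dijkstra`, `graph_fermat_point`, and the adjacency-list layout are hypothetical.

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances from `source`.

    `graph` maps each node to a list of (neighbor, weight) pairs;
    weights are assumed to be non-negative.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def graph_fermat_point(graph, terminals):
    """Return (node, total) minimizing the sum of shortest-path
    distances to all terminals. Assumes an undirected graph, so the
    distance from a terminal equals the distance to it."""
    per_terminal = [dijkstra(graph, t) for t in terminals]
    best_node, best_total = None, float("inf")
    for node in graph:
        total = sum(d.get(node, float("inf")) for d in per_terminal)
        if total < best_total:
            best_node, best_total = node, total
    return best_node, best_total

# Illustrative star-shaped network (made-up weights).
graph = {
    "a": [("m", 1.0)],
    "b": [("m", 2.0)],
    "c": [("m", 3.0)],
    "m": [("a", 1.0), ("b", 2.0), ("c", 3.0)],
}
node, total = graph_fermat_point(graph, ["a", "b", "c"])
```

This baseline needs one shortest-path computation per terminal over the whole graph, which may be too costly on low-power nodes; it is meant only to make the problem statement concrete.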
Molecular similarity is an important tool in protein and drug design for analyzing the quantitative relationships between the physicochemical properties of two molecules. We present a family of similarity measures that exploit the ability of the wavelet transform to analyze the spectral components of physicochemical properties, offering a sensitive way to measure similarities between biological molecules. To investigate how effective wavelet-based similarity measures are compared with conventional measures, we defined several patterns that involve scalar or topological changes in the distribution of electrostatic properties. The wavelet-based measures were more successful in discriminating these patterns than the current state-of-the-art similarity measures. We also demonstrate the validity of wavelet-based similarity measures through the hierarchical clustering of two protein datasets consisting of families of homologous domains and alanine scan mutants. This type of similarity analysis is useful for protein structure-function studies and protein design.
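To make the idea of comparing properties scale by scale concrete, here is a minimal sketch. It is not the measure defined in the paper: the abstract names neither the wavelet nor the similarity formula, so this example assumes a one-dimensional property profile whose length is a power of two, a Haar wavelet, and a cosine similarity computed separately at each scale. The function names are hypothetical.

```python
import math

def haar_levels(signal):
    """Haar decomposition of a sequence whose length is a power of two.

    Returns a list of coefficient lists:
    [approximation, coarsest details, ..., finest details].
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = list(signal)
    levels = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        levels.insert(0, detail)  # coarser levels go to the front
    levels.insert(0, approx)
    return levels

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0

def per_scale_similarity(p, q):
    """Cosine similarity of two equal-length profiles at each Haar scale."""
    if len(p) != len(q):
        raise ValueError("profiles must have the same length")
    return [cosine(a, b) for a, b in zip(haar_levels(p), haar_levels(q))]

# Made-up profiles: q is p with its sign pattern flipped at the finest scale only.
p = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0]
q = [2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0]
sims = per_scale_similarity(p, q)
```

Reporting one value per scale, rather than a single number over the raw profile, is what lets such a measure separate coarse changes from fine-grained ones; how the scales are weighted and combined is a design choice this sketch leaves open.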
Abstract. This work shows how concepts from electromagnetic field theory can be used efficiently in clustering with constraints. The proposed framework transforms vector data into a fully connected graph, or works directly on given graph data. User constraints are represented by electromagnetic fields that affect the weights of the graph's edges. A clustering algorithm is then applied to the adjusted graph, using k-distinct shortest paths as the distance measure. Our framework provides better accuracy than MPCK-Means, SS-KernelKMeans and Kmeans+Diagonal Metric even when very few constraints are used, significantly improves clustering performance on some datasets that other methods fail to partition successfully, and can cluster both vector and graph datasets. All these advantages are demonstrated through a thorough experimental evaluation.