Sequencing by hybridization (SBH), which determines the order in which nucleotides occur on a DNA string, remains under discussion for enhancements through computational intelligence even though next-generation DNA sequencing has come into existence. Over the last decade, many works on graph theory-based DNA sequencing have appeared in the literature. This paper proposes an SBH method that integrates a hypergraph with a genetic algorithm (HGGA) to obtain a DNA sequence from its spectrum. The elements of the spectrum and their relations are represented as a hypergraph, and the unimodular property is applied to ensure that the relations between l-mers are compatible. The hypergraph representation and the unimodular property are coupled with a genetic algorithm customized with a novel selection and crossover operator, which reduces computational complexity and accelerates convergence. Once the primary strand has been determined, an anti-homomorphism is invoked to find the reverse complement of the sequence. The proposed algorithm is applied to GenBank BioServer datasets, and the results demonstrate its efficiency. HGGA is a non-classical algorithm whose complexity reductions, ranging to [Formula: see text], and improved accuracy make it suitable for applications beyond DNA sequencing, such as image processing, task scheduling and big data processing.
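The two basic objects named in this abstract, the spectrum of l-mers and the reverse complement, can be illustrated with a minimal Python sketch. This is not the HGGA algorithm itself: the function names `spectrum` and `reverse_complement` are hypothetical, the spectrum is assumed to be ideal (error-free, every l-mer present), and the hypergraph and genetic-algorithm steps are not shown.

```python
from typing import List

# Watson-Crick base pairing for DNA.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def spectrum(seq: str, l: int) -> List[str]:
    """Return the sorted set of all l-mers (length-l substrings) of seq.

    Assumption: an ideal spectrum with no hybridization errors.
    """
    return sorted({seq[i:i + l] for i in range(len(seq) - l + 1)})


def reverse_complement(seq: str) -> str:
    """Complement every base and reverse the order of the string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


if __name__ == "__main__":
    print(spectrum("ACGTAC", 3))
    print(reverse_complement("AACG"))
```

The reverse complement is an anti-homomorphism with respect to concatenation: for any two strands `x` and `y`, `reverse_complement(x + y)` equals `reverse_complement(y) + reverse_complement(x)`, so the order of the factors is reversed.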
Volunteered geographic information (VGI) encourages citizens to voluntarily contribute geographic data that helps enhance geospatial databases. Its significant limitations are the trustworthiness and reliability of the data, which stem from the anonymity of data contributors. We propose a data-driven model to address these issues on OpenStreetMap (OSM), a recent instance of VGI. This research examines the hypothesis that the credibility of contributed data can be assessed by evaluating the proficiency of the contributor. The proposed framework consists of two phases: an exploratory data analysis phase and a learning phase. The former explores the OSM data history to perform feature selection, resulting in “OSM Metadata” summarized using principal component analysis. The latter combines unsupervised and supervised learning, using K-means for user clustering and multi-class logistic regression for user classification. In this study, we identified five major classes representing user-proficiency levels based on contribution behavior. We tested the framework on the India OSM data history, where 17% of users are key contributors and 27% are inexperienced local users. New users are classified with 95.5% accuracy. Our conclusions recognize the potential of OSM metadata to illustrate a user’s contribution behavior without knowledge of the user’s profile information.
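The user-clustering step can be sketched as a plain K-means loop. The following is a minimal illustration only, not the authors' pipeline: the two features per user (for example, number of edits and number of active days) are hypothetical, the toy data is invented for illustration, the initialization is a deterministic farthest-first heuristic chosen for reproducibility, and the PCA and logistic-regression stages are omitted. The toy run uses k = 2, whereas the study reports five classes.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def _dist2(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Point], k: int, iters: int = 50) -> Tuple[List[int], List[List[float]]]:
    """Cluster points into k groups; return (labels, centroids).

    Assumption: farthest-first initialization (first point, then repeatedly
    the point farthest from all centroids chosen so far).
    """
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(_dist2(p, c) for c in centroids))
        centroids.append(list(nxt))

    labels = None
    for _ in range(iters):
        # Assignment step: each user goes to the nearest centroid.
        new_labels = [min(range(k), key=lambda c: _dist2(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the loop has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical (edits, active_days) features for six users.
    users = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5), (40.0, 30.0), (42.0, 31.0), (41.0, 29.0)]
    labels, centroids = kmeans(users, k=2)
    print(labels, centroids)
```

In a fuller version, the cluster labels produced here would serve as training targets for a multi-class classifier that assigns new users to a proficiency level.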