Existing methods of spatial data clustering have focused on point data, whose similarity can be easily defined. Because polygons have complex shapes and alignments, clustering them depends on how the similarity between non-overlapping polygons is defined. This study attempts to present an efficient method to discover clustering patterns of polygons by incorporating spatial cognition principles and multilevel graph partition. Based on spatial cognition of the spatial similarity of polygons, four new similarity criteria (i.e. distance, connectivity, size and shape) are developed to measure the similarity between polygons, and are used to visually distinguish polygons belonging to the same cluster from those belonging to different clusters. The multilevel graph-partition clustering method first coarsens the graph of polygons at multiple levels, using the four defined similarities to find clusters with maximum similarity among polygons in the same cluster, and then refines the obtained clusters by keeping the similarity between different clusters to a minimum. The presented method is a general algorithm for discovering clustering patterns of polygons and can satisfy various demands by changing the weights of distance, connectivity, size and shape in the spatial similarity. The method is tested by clustering residential areas and buildings, and the results demonstrate its usefulness and universality.

... but are not limited to, watershed analysis, drought analysis, crime mapping, and spatial epidemiology (Joshi 2011). Therefore, spatial clustering analysis is an important technique for spatial data analysis and related applications. The spatial similarities between spatial objects are fundamental to clustering analysis, and many similarity measurements have been presented. Chameleon is a general algorithm that defines only the framework, not the similarity between polygon data; GDBSCAN (Sander et al. 1998) uses the areas of polygons or non-spatial attributes as similarities; REDCAP (Guo 2008, Guo and Wang 2011) only considers the distance of geometric properties and non-spatial attributes; Poly-SNN (Wang and Eick 2014) first approximates overlapping polygons as points, and then defines the similarity of polygons as the number of points shared by their k-nearest neighbors. However, these works have not taken full advantage of geometric properties, and were not designed for measuring the similarity of polygons, especially non-overlapping ones. That is, when polygons are disjoint or non-overlapping, the geometric relationships and configurations between them, together with visual cognition, play a more important role in determining clusters than the existing measurements and non-spatial attributes. In this situation, the existing measurements can no longer work efficiently. Although some measures have been presented recently to group and generalize buildings (Li et al. 2004; Basaraner and Selcuk 2008; Yan et al. 2008; Zhang et al. 2013a, b), they are designed to handle buildings, which have more regular and simple shapes. These specific ...
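The weighted combination of distance, connectivity, size and shape described in the first abstract can be illustrated with a small sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the minimum vertex distance as the distance term, an area ratio for size, a ratio of isoperimetric quotients for shape, a caller-supplied connectivity value in [0, 1], and all function names and default weights.

```python
import math

def area(poly):
    """Shoelace area of a simple polygon given as [(x, y), ...]."""
    n = len(poly)
    total = 0.0
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def perimeter(poly):
    """Sum of edge lengths, closing the ring back to the first vertex."""
    n = len(poly)
    return sum(math.dist(poly[i], poly[(i + 1) % n]) for i in range(n))

def compactness(poly):
    """Isoperimetric quotient 4*pi*A / P^2 (1 for a circle, lower when elongated)."""
    p = perimeter(poly)
    return 4.0 * math.pi * area(poly) / (p * p)

def min_vertex_distance(a, b):
    """Smallest vertex-to-vertex distance; a crude stand-in for polygon distance."""
    return min(math.dist(p, q) for p in a for q in b)

def ratio(u, v):
    """Smaller value over larger value, so identical inputs give 1.0."""
    return min(u, v) / max(u, v)

def similarity(a, b, connectivity, weights=(0.25, 0.25, 0.25, 0.25), scale=1.0):
    """Weighted sum of four hypothetical similarity terms, each in [0, 1]."""
    w_dist, w_conn, w_size, w_shape = weights
    s_dist = 1.0 / (1.0 + min_vertex_distance(a, b) / scale)
    s_size = ratio(area(a), area(b))
    s_shape = ratio(compactness(a), compactness(b))
    return w_dist * s_dist + w_conn * connectivity + w_size * s_size + w_shape * s_shape

# Two unit squares one unit apart, treated as fully connected (assumed value 1.0).
square_a = [(0, 0), (1, 0), (1, 1), (0, 1)]
square_b = [(2, 0), (3, 0), (3, 1), (2, 1)]
print(similarity(square_a, square_b, connectivity=1.0))
```

Changing `weights` shifts the emphasis between the four terms, which mirrors the abstract's claim that different demands can be met by re-weighting; the coarsening and refinement steps of the multilevel graph partition are not shown here.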
Accurate screening of cancer biomarkers contributes to health assessment, drug screening, and targeted therapy for precision medicine. The rapid development of high-throughput sequencing technology has identified abundant genomic biomarkers, but most of them are limited to single-cancer analysis. This paper proposes FRL, an integrative feature selection algorithm that combines Fisher score, Recursive feature elimination, and Logistic regression, to explore potential cancer genomic biomarkers on cancer subsets. Fisher score is first used to calculate the weights of genes and rapidly reduce the dimension. Recursive feature elimination and Logistic regression are then jointly employed to extract the optimal subset. Compared with GEO2R, the current differential expression analysis tool based on the Limma algorithm, FRL has greater classification precision. Compared with five traditional feature selection algorithms, FRL exhibits excellent performance on accuracy (ACC) and F1-score and greatly improves computational efficiency. On high-noise datasets such as esophageal cancer, the ACC of FRL is 30% higher than the average ACC achieved with other traditional algorithms. Because biomarkers found in multiple studies are more reliable and reproducible, and reveal a stronger association with potential clinical value than those from a single analysis, we conduct cluster analysis on 10 diverse cancers with high mortality, through literature review and spatial analyses of gene functional enrichment and functional pathways, and form a potential biomarker module comprising 19 genes. All genes in this module can serve as potential biomarkers to provide more information on the overall oncogenesis mechanism for the detection of diverse early cancers and assist in targeted anticancer therapies for further developments in precision medicine.
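The Fisher-score filtering step can be sketched as follows. This is a minimal illustration using one common definition of the Fisher score (between-class scatter over within-class scatter, computed per feature); the abstract does not state the exact formula FRL uses, and the function names and toy data are hypothetical. The subsequent recursive feature elimination with logistic regression is not shown.

```python
from collections import defaultdict

def fisher_scores(X, y):
    """Per-feature Fisher score: sum_k n_k*(mu_k - mu)^2 / sum_k n_k*var_k."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        mu = sum(column) / len(column)
        between = 0.0  # scatter of class means around the global mean
        within = 0.0   # pooled variance inside each class
        for rows in by_class.values():
            values = [r[j] for r in rows]
            n_k = len(values)
            mu_k = sum(values) / n_k
            var_k = sum((v - mu_k) ** 2 for v in values) / n_k
            between += n_k * (mu_k - mu) ** 2
            within += n_k * var_k
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class variance: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores

def top_k(scores, k):
    """Indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

# Toy expression matrix: 4 samples x 2 genes; gene 0 separates the two classes.
X = [[1.0, 5.0], [2.0, 1.0], [9.0, 4.0], [10.0, 2.0]]
y = [0, 0, 1, 1]
scores = fisher_scores(X, y)
print(scores, top_k(scores, 1))
```

Because each feature is scored independently in a single pass, this filter is cheap enough to run on the full gene set before a more expensive wrapper method is applied to the survivors.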
To improve the informationization and intelligence of the energy Internet industry and enhance the capability of knowledge services, it is necessary to organize the energy Internet body of knowledge from the existing knowledge resources of the State Grid, which are large in scale, multi-source, and heterogeneous. At the same time, the business fields of State Grid cover a wide range: each business field has many sub-fields, and the relationships between fields are diverse and complex. The key to establishing the energy Internet body of knowledge is how to fuse the heterogeneous knowledge resources from multiple sources, extract the knowledge contents from them, and organize the different relationships. This paper considers transforming the original knowledge resources of State Grid into a unified and well-organized knowledge system described in the OWL language, to meet the requirements of heterogeneous resource integration, multi-source resource organization, and knowledge service provision. For the State Grid knowledge resources, which are mainly in XML format, this paper proposes a Knowledge Automatic Fusion and Organization idea and method based on XSD Directed Graph. In the first stage, the XSD documents corresponding to the XML resources are transformed into a directed graph, during which the graph neural network detects hidden knowledge inside the structure to add semantic information to the graph. In the second stage, for other structured knowledge resources (e.g., databases, spreadsheets), the knowledge contents and relationships are analyzed manually to establish mappings from the structured resources to graph structures; using these mappings, the original knowledge resources are transformed into graph structures and merged with the directed graph obtained in the first stage to achieve the fusion of heterogeneous knowledge resources. Expert knowledge is also introduced for heterogeneous knowledge fusion to further extend the directed graph. In the third stage, the expanded directed graph is converted to the body of knowledge in the form of OWL. Taking the knowledge resources in the human resources field of the State Grid as an example, this paper establishes the ontology of the human resources training field in a unified manner, initially demonstrating the effectiveness of the proposed method.
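The first stage, turning an XSD document into a directed graph, might look roughly like the sketch below. It assumes the simplest possible reading: each named `xs:element` becomes a node, and an edge runs from an element to every named element nested inside it. The schema, the element names (`Employee`, `Training`, `Course`) and the function name are hypothetical, and the paper's semantic enrichment, merging and OWL conversion steps are not reproduced.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"  # namespace prefix as ElementTree expands it

SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Employee">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Name" type="xs:string"/>
        <xs:element name="Training">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Course" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

def xsd_to_digraph(xsd_text):
    """Return {element name: [names of directly nested elements]}."""
    root = ET.fromstring(xsd_text)
    graph = {}

    def walk(node, parent):
        for child in node:
            name = child.get("name")
            if child.tag == XS + "element" and name:
                graph.setdefault(name, [])
                if parent is not None:
                    graph[parent].append(name)  # edge: parent -> nested element
                walk(child, name)
            else:
                # complexType / sequence wrappers: keep the same enclosing element
                walk(child, parent)

    walk(root, None)
    return graph

print(xsd_to_digraph(SAMPLE_XSD))
```

This sketch ignores `ref=` attributes, named types and imports, all of which a real schema would need; an adjacency dictionary is used only because it is easy to merge with graphs derived from other sources by taking the union of nodes and edges.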