Clustering, the discovery of latent group structure in data, is a fundamental problem in artificial intelligence and a vital procedure in data-driven scientific research across all disciplines. Yet existing methods have various limitations, especially weak cognitive interpretability and poor computational scalability, when clustering the massive datasets that are increasingly available in all domains. Here, by simulating the multi-scale cognitive observation process of humans, we design a scalable algorithm to detect clusters hierarchically hidden in massive datasets. The observation scale changes following the Weber–Fechner law to capture the gradually emerging meaningful grouping structure. We validated our approach on real datasets with up to a billion records and 2,000 dimensions, including taxi trajectories, single-cell gene expression, face images, computer logs, and audio. Our approach outperformed popular methods in usability, efficiency, effectiveness, and robustness across different domains.
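The abstract does not say how the observation scale is stepped, so the following is only a minimal sketch of one reading of the Weber–Fechner law: if a change is noticeable only when it is a constant fraction of the current magnitude, successive scales form a geometric sequence. The function name `weber_fechner_scales` and the parameters `s_min`, `s_max`, and `weber_fraction` are hypothetical and are not taken from the paper.

```python
def weber_fechner_scales(s_min, s_max, weber_fraction):
    """Return scales from s_min up to s_max, each a constant relative step larger.

    Assumption: the relative step (the "Weber fraction") is constant, so the
    scales grow geometrically. This is an illustration, not the paper's schedule.
    """
    if s_min <= 0 or s_max < s_min or weber_fraction <= 0:
        raise ValueError("need 0 < s_min <= s_max and weber_fraction > 0")
    scales = []
    s = s_min
    while s <= s_max:
        scales.append(s)
        s *= 1.0 + weber_fraction  # constant relative increment
    return scales


# Example: coarsen the observation scale by 50% at each level.
scales = weber_fechner_scales(1.0, 10.0, 0.5)
```

A hierarchical clustering pass could then be run once per scale, from fine to coarse, so that groupings emerge gradually as the scale grows.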
The emerging edge-cloud collaborative Deep Learning (DL) paradigm aims to improve the performance of practical DL implementations in terms of cloud bandwidth consumption, response latency, and data privacy preservation. Focusing on bandwidth-efficient edge-cloud collaborative training of DNN-based classifiers, we present CDC, a Classification Driven Compression framework that reduces bandwidth consumption while preserving the classification accuracy of edge-cloud collaborative DL. Specifically, to reduce bandwidth consumption for resource-limited edge servers, we develop a lightweight autoencoder (AE) with classification guidance, which compresses data while preserving classification-relevant features and allows edges to upload only the latent code of raw data for accurate global training on the cloud. Additionally, we design an adjustable quantization scheme that adaptively pursues the tradeoff between bandwidth consumption and classification accuracy under different network conditions, where only fine-tuning is required for rapid compression ratio adjustment. Results of extensive experiments demonstrate that, compared with DNN training with raw data, CDC consumes 14.9× less bandwidth with an accuracy loss no more than 1.06%, and compared with DNN training with data compressed by AE without guidance, CDC introduces at least 100% lower accuracy loss.
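The abstract does not specify the quantizer, so the sketch below swaps in plain uniform scalar quantization purely to illustrate the bandwidth and accuracy tradeoff: fewer bits per latent value means a smaller upload but a larger reconstruction error. The helper names `quantize` and `dequantize` and the sample latent vector are hypothetical.

```python
def quantize(latent, bits):
    """Uniformly quantize a list of floats to integer codes of `bits` bits each.

    Assumption: uniform scalar quantization over the observed value range;
    this is not the adjustable scheme from the paper.
    """
    lo, hi = min(latent), max(latent)
    levels = (1 << bits) - 1                      # highest integer code
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((x - lo) / step) for x in latent]
    return codes, lo, step


def dequantize(codes, lo, step):
    """Map integer codes back to approximate float values."""
    return [lo + c * step for c in codes]


def max_error(latent, bits):
    """Largest absolute reconstruction error at the given bit width."""
    codes, lo, step = quantize(latent, bits)
    restored = dequantize(codes, lo, step)
    return max(abs(a - b) for a, b in zip(latent, restored))


# Hypothetical latent code produced by an edge-side encoder.
latent = [0.0, 0.1, 0.37, 0.82, 1.0]
coarse_error = max_error(latent, 2)   # small upload, larger error
fine_error = max_error(latent, 8)     # larger upload, smaller error
```

Under poor network conditions an edge server could choose a lower bit width, and a higher one when bandwidth allows.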
The broadband impedance of converters is an essential feature for the stability analysis of new energy sources. However, obtaining the impedance of photovoltaic (PV) converters with Maximum Power Point Tracking (MPPT) control is challenging because of their non-linear control schemes and operating points that change in real time. In this case, conventional linearized impedance modelling methods are not applicable, and conventional direct measurement approaches are highly time-consuming. This paper proposes a few-shot learning approach for quick access to the impedance of PV converters. The proposed method is based on the model-agnostic meta-learning (MAML) algorithm and is suitable for MPPT-controlled converters whose impedance changes with time, temperature, and irradiation. During training, it adjusts the initial model of the machine learning algorithm under different weather conditions. After training, the initial model can adapt to a new condition with very few samples. Under this approach, with only a few data points measured at several frequency points, the broadband impedance of MPPT-controlled converters under any weather conditions can be accurately predicted, avoiding the time-consuming measurements and inaccurate predictions of existing methods. Comparative simulation results show the effectiveness and superiority of the proposed method.
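To make the train-then-adapt idea concrete, here is a toy sketch of first-order MAML (a common simplification that drops second-order terms) on a one-parameter model `y = w * x`. Each "task" is a different slope, standing in for a different weather condition. Everything here, including the function names, learning rates, and the noise-free linear tasks, is an assumption for illustration and has no connection to the paper's impedance model.

```python
import random


def grad(w, xs, ys):
    """Gradient of the mean squared error of y = w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def adapt(w, xs, ys, inner_lr=0.5, steps=1):
    """Inner loop: a few gradient steps on one task's few-shot samples."""
    for _ in range(steps):
        w -= inner_lr * grad(w, xs, ys)
    return w


def sample_task(rng, slope, n):
    """Draw n noise-free samples from the task y = slope * x."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return xs, [slope * x for x in xs]


def meta_train(rng, iterations=500, outer_lr=0.05, shots=5):
    """Outer loop: learn an initial w that adapts well across tasks."""
    w = 0.0
    for _ in range(iterations):
        slope = rng.uniform(0.5, 2.0)           # a training "condition"
        support = sample_task(rng, slope, shots)
        query = sample_task(rng, slope, shots)
        w_task = adapt(w, *support)
        # First-order approximation: use the query gradient at the adapted
        # weight as the meta-gradient for the initial weight.
        w -= outer_lr * grad(w_task, *query)
    return w


rng = random.Random(0)
meta_w = meta_train(rng)

# Adapt to an unseen condition using only five samples.
new_slope = 3.0
xs, ys = sample_task(rng, new_slope, 5)
adapted_w = adapt(meta_w, xs, ys)
```

After the single inner step, `adapted_w` lies closer to the new task's slope than the meta-learned initial value `meta_w` does, which is the few-shot adaptation behaviour the abstract describes.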