Predicting the future is a fantasy but practicality work. It is the key component to intelligent agents, such as self-driving vehicles, medical monitoring devices and robotics. In this work, we consider generating unseen future frames from previous observations, which is notoriously hard due to the uncertainty in frame dynamics. While recent works based on generative adversarial networks (GANs) made remarkable progress, there is still an obstacle for making accurate and realistic predictions. In this paper, we propose a novel GAN based on inter-frame difference to circumvent the difficulties. More specifically, our model is a multi-stage generative network, which is named the Difference Guided Generative Adversarial Network (DGGAN). The DGGAN learns to explicitly enforce future-frame predictions that is guided by synthetic interframe difference. Given a sequence of frames, DGGAN first uses dual paths to generate meta information. One path, called Coarse Frame Generator, predicts the coarse details about future frames, and the other path, called Difference Guide Generator, generates the difference image which include complementary fine details. Then our coarse details will then be refined via guidance of difference image under the support of GANs. With this model and novel architecture, we achieve state-of-theart performance for future video prediction on UCF-101, KITTI.
Generative adversarial networks (GANs) have proven successful in image generation tasks. However, GAN training is inherently unstable. Although many works try to stabilize it by manually modifying GAN architecture, it requires much expertise. Neural architecture search (NAS) has become an attractive solution to search GANs automatically. The early NAS-GANs search only generators to reduce search complexity but lead to a sub-optimal GAN. Some recent works try to search both generator (G) and discriminator (D), but they suffer from the instability of GAN training. To alleviate the instability, we propose an efficient two-stage evolutionary algorithm-based NAS framework to search GANs, namely EAGAN. We decouple the search of G and D into two stages, where stage-1 searches G with a fixed D and adopts the many-to-one training strategy, and stage-2 searches D with the optimal G found in stage-1 and adopts the one-to-one training and weightresetting strategies to enhance the stability of GAN training. Both stages use the non-dominated sorting method to produce Pareto-front architectures under multiple objectives (e.g., model size, Inception Score (IS), and Fréchet Inception Distance (FID)). EAGAN is applied to the unconditional image generation task and can efficiently finish the search on the CIFAR-10 dataset in 1.2 GPU days. Our searched GANs achieve competitive results (IS=8.81±0.10, FID=9.91) on the CIFAR-10 dataset and surpass prior NAS-GANs on the STL-10 dataset (IS=10.44±0.087, FID=22.18). Source code: https://github.com/marsggbo/EAGAN.
The COVID-19 pandemic has threatened global health. Many studies have applied deep convolutional neural networks (CNN) to recognize COVID-19 based on chest 3D computed tomography (CT). Recent works show that no model generalizes well across CT datasets from different countries, and manually designing models for specific datasets requires expertise; thus, neural architecture search (NAS) that aims to search models automatically has become an attractive solution. To reduce the search cost on large 3D CT datasets, most NAS-based works use the weight-sharing (WS) strategy to make all models share weights within a supernet; however, WS inevitably incurs search instability, leading to inaccurate model estimation. In this work, we propose an efficient Evolutionary Multi-objective ARchitecture Search (EMARS) framework. We propose a new objective, namely potential, which can help exploit promising models to indirectly reduce the number of models involved in weights training, thus alleviating search instability. We demonstrate that under objectives of accuracy and potential, EMARS can balance exploitation and exploration, i.e., reducing search time and finding better models. Our searched models are small and perform better than prior works on three public COVID-19 3D CT datasets.
COVID-19 pandemic has spread globally for months. Due to its long incubation period and high testing cost, there is no clue showing its spread speed is slowing down, and hence a faster testing method is in dire need. This paper proposes an efficient Evolutionary Multi-objective neural ARchitecture Search (EMARS) framework, which can automatically search for 3D neural architectures based on a well-designed search space for COVID-19 chest CT scan classification. Within the framework, we use weight sharing strategy to significantly improve the search efficiency and finish the search process in 8 hours. We also propose a new objective, namely potential, which is of benefit to improve the search process's robustness. With the objectives of accuracy, potential, and model size, we find a lightweight model (3.39 MB), which outperforms three baseline humandesigned models, i.e., ResNet3D101 (325.21 MB), DenseNet3D121 (43.06 MB), and MC3 18 (43.84 MB). Besides, our well-designed search space enables the class activation mapping algorithm to be easily embedded into all searched models, which can provide the interpretability for medical diagnosis by visualizing the judgment based on the models to locate the lesion areas.
Generative Adversarial Networks (GANs) have been proven hugely successful in image generation tasks, but GAN training has the problem of instability. Many works have improved the stability of GAN training by manually modifying the GAN architecture, which requires human expertise and extensive trial-and-error. Thus, neural architecture search (NAS), which aims to automate the model design, has been applied to search GANs on the task of unconditional image generation. The early NAS-GAN works only search generators for reducing difficulty. Some recent works have attempted to search both generator (G) and discriminator (D) to improve GAN performance, but they still suffer from the instability of GAN training during the search. To alleviate the instability issue, we propose an efficient two-stage evolutionary algorithm (EA) based NAS framework to discover GANs, dubbed EAGAN. Specifically, we decouple the search of G and D into two stages and propose the weight-resetting strategy to improve the stability of GAN training. Besides, we perform evolution operations to produce the Pareto-front architectures based on multiple objectives, resulting in a superior combination of G and D. By leveraging the weight-sharing strategy and low-fidelity evaluation, EAGAN can significantly shorten the search time. EAGAN achieves highly competitive results on the CIFAR-10 (IS=8.81±0.10, FID=9.91) and surpasses previous NAS-searched GANs on the STL-10 dataset (IS=10.44±0.087, FID=22.18).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.