Figure 1. Make-A-Scene: Samples of generated images from text inputs (a), and a text and scene input (b). Our method is able to both generate the scene (a, bottom left) and image, or generate the image from text and a simple sketch input (b, center).
While the availability of massive Text-Image datasets has been shown to be extremely useful in training large-scale generative models (e.g., DDPMs, Transformers), their output typically depends on the quality of both the input text and the training dataset. In this work, we show how large-scale retrieval methods, in particular efficient K-Nearest-Neighbors (KNN) search, can be used to train a model to adapt to new samples. Learning to adapt enables several new capabilities. Sifting through billions of records at inference time is extremely efficient and can alleviate the need to train or memorize an adequately large generative model. Additionally, fine-tuning trained models to new samples can be achieved by simply adding them to the table. Rare concepts, even those with no presence in the training set, can then be leveraged at test time without any modification to the generative model. Our diffusion-based model trains on images only, by leveraging a joint Text-Image multi-modal metric. Compared to baseline methods, our generations achieve state-of-the-art results both in human evaluations and in perceptual scores when tested on a public multimodal dataset of natural images, as well as on a collected dataset of 400 million Stickers.
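The retrieval idea in the abstract above can be illustrated with a minimal sketch. It assumes that images and text are already mapped into a shared embedding space and that neighbors are ranked by cosine similarity; the `RetrievalTable` class, the toy three-dimensional vectors, and the labels are hypothetical and are not taken from the paper. A real system would use a learned joint encoder and an approximate nearest-neighbor index rather than the brute-force scan shown here.

```python
import math


def normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


class RetrievalTable:
    """Hypothetical in-memory table of embeddings searched by brute-force KNN."""

    def __init__(self):
        self.embeddings = []
        self.labels = []

    def add(self, embedding, label):
        # Adding a sample only appends a row; no model weights are touched.
        self.embeddings.append(normalize(embedding))
        self.labels.append(label)

    def knn(self, query, k):
        # Score every stored embedding against the query and keep the top k.
        q = normalize(query)
        scored = [
            (sum(a * b for a, b in zip(q, emb)), label)
            for emb, label in zip(self.embeddings, self.labels)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


# Toy embeddings standing in for encoded images (assumed values).
table = RetrievalTable()
table.add([1.0, 0.0, 0.0], "img_a")
table.add([0.9, 0.1, 0.0], "img_b")
table.add([0.0, 1.0, 0.0], "img_c")

# A query embedding for a concept that the table does not yet cover well.
query = [0.0, 0.2, 1.0]
before = table.knn(query, k=1)

# Adapting to a new concept amounts to inserting its embedding into the table.
table.add([0.0, 0.1, 1.0], "img_rare")
after = table.knn(query, k=1)

print(before[0][1], "->", after[0][1])
```

In this toy setup the nearest neighbor changes once the new row is inserted, which mirrors the claim that new samples can be used at test time without modifying the generative model.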
Highlights
We propose a novel end-to-end neural network that employs resting-state and task-based functional MRI (fMRI) datasets, obtained one month after trauma exposure, to predict PTSD one, six, and 14 months after the exposure.
The method utilizes connectivity maps extracted from pairs of brain regions which are subsequently updated by applying the algorithmic technique of pairwise attention.
The proposed deep learning method predicts PTSD status and PTSD symptom clusters, and supports survival analysis, within the prospective design. We demonstrate a significant improvement in performance on all the datasets and experiments compared with other relevant analytical techniques.
Pairwise association analysis reveals several significant functional connectivity patterns, in line with previous PTSD neuroimaging literature.
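As a rough illustration of what a connectivity map over pairs of brain regions can look like, the sketch below computes a region-by-region matrix from toy time series. It assumes Pearson correlation as the connectivity measure, which is a common choice in fMRI analysis but is not stated in the highlights; the function names and the toy data are hypothetical, and the pairwise attention update is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def connectivity_map(region_series):
    """Build a symmetric matrix with one entry per pair of regions."""
    n = len(region_series)
    return [
        [pearson(region_series[i], region_series[j]) for j in range(n)]
        for i in range(n)
    ]


# Toy signals for three regions (assumed values, four time points each).
toy_series = [
    [1.0, 2.0, 3.0, 4.0],  # region 0
    [2.0, 4.0, 6.0, 8.0],  # region 1: rises together with region 0
    [4.0, 3.0, 2.0, 1.0],  # region 2: falls as region 0 rises
]

conn = connectivity_map(toy_series)
for row in conn:
    print([round(value, 3) for value in row])
```

Each off-diagonal entry summarizes one pair of regions, which is the kind of pairwise input that a later attention step could reweight.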