“…We used the following 20 classes to create the Places subset: aquarium, athletic_field/outdoor, beach, cliff, coast, forest_path, golf_course, harbor, lake/natural, mountain, ocean, pier, pond, rainforest, river, skyscraper, swamp, underwater/ocean_deep, valley, vegetable_garden. The Scenery dataset consists of 5000 training and 1040 test images, which differs from the dataset combination actually used in VLNS [2]. While most images in the Scenery dataset depict mountains, the Places subset comprises images from diverse categories, and it is therefore relatively difficult [30]-[32] to stably train a generator on the Places subset.…”
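
As an illustration of how such a subset could be assembled, the following is a minimal sketch, not the authors' actual procedure. It assumes a hypothetical layout in which each image path has the form `<category>/<file>.jpg`, so that the category is everything before the last slash; the helper names `category_of` and `select_subset` and the example paths are likewise invented for illustration. Only the 20 category names come from the text.

```python
# The 20 Places categories listed in the text.
PLACES_SUBSET_CLASSES = [
    "aquarium", "athletic_field/outdoor", "beach", "cliff", "coast",
    "forest_path", "golf_course", "harbor", "lake/natural", "mountain",
    "ocean", "pier", "pond", "rainforest", "river", "skyscraper", "swamp",
    "underwater/ocean_deep", "valley", "vegetable_garden",
]


def category_of(image_path):
    """Return the category of a path shaped like '<category>/<file>.jpg'.

    Assumption: the category is everything before the last '/', which
    handles two-level names such as 'lake/natural'.
    """
    return image_path.rsplit("/", 1)[0]


def select_subset(image_paths, classes=PLACES_SUBSET_CLASSES):
    """Keep only the paths whose category is one of the given classes."""
    wanted = set(classes)
    return [p for p in image_paths if category_of(p) in wanted]


# Hypothetical example paths; 'kitchen' is not among the 20 classes.
example_paths = [
    "beach/00000001.jpg",
    "lake/natural/00000002.jpg",
    "kitchen/00000003.jpg",
    "underwater/ocean_deep/00000004.jpg",
]
subset = select_subset(example_paths)
```

Matching on the full category string rather than the first path component keeps two-level names such as `lake/natural` and `underwater/ocean_deep` distinct from any sibling categories that share the same prefix.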