Modeling layout is an important first step in graphic design. Methods for generating graphic layouts have recently progressed, particularly with Generative Adversarial Networks (GANs). However, specifying the locations and sizes of design elements usually involves constraints on element attributes such as area, aspect ratio, and reading order, and automating attribute-conditioned graphic layout generation remains a complex and unsolved problem. In this paper, we introduce the Attribute-conditioned Layout GAN, which incorporates the attributes of design elements into graphic layout generation by conditioning both the generator and the discriminator on attribute constraints. Because graphic designs can be complex, we further propose an element dropout method that lets the discriminator look at partial lists of elements and learn their local patterns. In addition, we introduce several loss terms derived from different design principles for layout optimization. We demonstrate that the proposed method can synthesize graphic layouts conditioned on different element attributes. It can also adjust well-designed layouts to new sizes while retaining the elements' original reading order. The effectiveness of our method is validated through a user study.
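The sketch below illustrates one plausible reading of the abstract's two main ideas: a generator and discriminator that are both conditioned on per-element attributes, and element dropout in the discriminator so it scores random partial subsets of elements. This is not the authors' code; the attribute dimensionality, network sizes, and box parameterization are assumptions for illustration only.

```python
# Minimal sketch (assumed architecture, not the paper's implementation) of an
# attribute-conditioned layout GAN with element dropout in the discriminator.
import torch
import torch.nn as nn

class LayoutGenerator(nn.Module):
    def __init__(self, attr_dim=4, noise_dim=32, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(attr_dim + noise_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # one (x, y, w, h) box per element
        )

    def forward(self, attrs, noise):
        # attrs: (B, N, attr_dim) element attributes, e.g. class, area, aspect ratio
        # noise: (B, N, noise_dim) per-element latent codes
        return torch.sigmoid(self.mlp(torch.cat([attrs, noise], dim=-1)))

class LayoutDiscriminator(nn.Module):
    def __init__(self, attr_dim=4, hidden=128):
        super().__init__()
        self.embed = nn.Linear(4 + attr_dim, hidden)
        self.head = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                  nn.Linear(hidden, 1))

    def forward(self, boxes, attrs, keep_prob=0.7):
        # Element dropout: keep a random subset of elements so the
        # discriminator learns local patterns from partial lists.
        mask = (torch.rand(boxes.shape[:2], device=boxes.device)
                < keep_prob).float().unsqueeze(-1)
        feats = self.embed(torch.cat([boxes, attrs], dim=-1)) * mask
        pooled = feats.sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return self.head(pooled)

# Example: a batch of 2 layouts with 8 elements each
attrs = torch.rand(2, 8, 4)
noise = torch.randn(2, 8, 32)
boxes = LayoutGenerator()(attrs, noise)
score = LayoutDiscriminator()(boxes, attrs)
```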
Accurate and reliable perception systems are essential for autonomous driving and robotics, and multi-sensor 3D object detection is key to achieving them. Existing 3D detectors have significantly improved accuracy by adopting a two-stage paradigm that relies solely on LiDAR point clouds for 3D proposal refinement. However, the sparsity of point clouds, particularly for faraway points, makes it difficult for a LiDAR-only refinement module to recognize and locate objects accurately. To address this issue, we propose a novel multi-modality two-stage approach, FusionRCNN, which effectively and efficiently fuses point clouds and camera images within Regions of Interest (RoI). FusionRCNN adaptively integrates sparse geometric information from LiDAR and dense texture information from the camera in a unified attention mechanism. Specifically, in the RoI extraction step, FusionRCNN uses RoIPooling to obtain image features of a unified size and samples raw points within each proposal to form the point set. It then applies intra-modality self-attention to enhance the domain-specific features, followed by a well-designed cross-attention that fuses information from the two modalities. FusionRCNN is plug-and-play and supports different one-stage methods with almost no architectural changes. Extensive experiments on the KITTI and Waymo benchmarks demonstrate that our method significantly boosts the performance of popular detectors. Remarkably, FusionRCNN improves the strong SECOND baseline by 6.14% mAP on Waymo and outperforms competing two-stage approaches.
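The following sketch shows one way the described RoI-level fusion could be structured: intra-modality self-attention over the point and image tokens of each RoI, followed by cross-attention in which point tokens query image tokens. It is an illustrative assumption, not the released FusionRCNN code; feature dimensions, head counts, and token layouts are placeholders.

```python
# Hedged sketch of RoI-level point/image fusion with self- and cross-attention.
import torch
import torch.nn as nn

class RoIFusionBlock(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.pt_self = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.img_self = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, pt_feats, img_feats):
        # pt_feats:  (num_rois, num_points, dim) sampled LiDAR point features per RoI
        # img_feats: (num_rois, num_pixels, dim) RoI-pooled image features
        pt, _ = self.pt_self(pt_feats, pt_feats, pt_feats)     # intra-modality
        img, _ = self.img_self(img_feats, img_feats, img_feats)
        fused, _ = self.cross(pt, img, img)                    # points query image texture
        return self.norm(pt + fused)

# Example: 32 RoIs, 128 sampled points each, a 7x7 pooled image patch
pt = torch.randn(32, 128, 256)
img = torch.randn(32, 49, 256)
out = RoIFusionBlock()(pt, img)   # (32, 128, 256) fused RoI features
```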
Polarization multispectral imaging (PMI) has been widely applied owing to its ability to characterize the physicochemical properties of objects. However, traditional PMI relies on scanning each domain, which is time-consuming and occupies vast storage resources. It is therefore imperative to develop advanced PMI methods that enable real-time and cost-effective applications. Moreover, PMI development is inseparable from preliminary simulations based on full-Stokes polarization multispectral images (FSPMI). Because relevant databases are lacking, FSPMI must currently be measured anew for each study, which is extremely complex and severely limits PMI development. In this paper, we therefore release an abundant FSPMI database with 512 × 512 spatial pixels, measured by an established system for 67 stereoscopic objects. In the system, a quarter-wave plate and a linear polarizer are rotated to modulate polarization information, while bandpass filters are switched to modulate spectral information. The FSPMI are then calculated from 5 designed polarization modulations and 18 spectral modulations. This publicly available FSPMI database has the potential to greatly promote PMI development and application.
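To make the acquisition scheme concrete, the sketch below shows a generic least-squares full-Stokes reconstruction from a small number of polarization modulations, assuming a rotating quarter-wave plate in front of a fixed horizontal linear polarizer. The five angles, the analyzer-row formula (including the sign convention for S3), and the random placeholder data are assumptions for illustration and are not taken from the paper's system calibration.

```python
# Hedged sketch: recover the Stokes vector per pixel from K modulated intensity
# images by solving I = A S in the least-squares sense.
import numpy as np

def analyzer_row(theta):
    """Analyzer vector for a QWP at angle theta followed by a horizontal polarizer
    (one common sign convention; real systems use calibrated values)."""
    c, s = np.cos(2 * theta), np.sin(2 * theta)
    return 0.5 * np.array([1.0, c * c, s * c, -s])

def reconstruct_stokes(intensities, angles):
    # intensities: (K, H, W) images at K QWP angles; angles: (K,) in radians
    A = np.stack([analyzer_row(t) for t in angles])        # (K, 4) measurement matrix
    K, H, W = intensities.shape
    S = np.linalg.pinv(A) @ intensities.reshape(K, -1)     # (4, H*W) least-squares solution
    return S.reshape(4, H, W)                               # S0, S1, S2, S3 planes

angles = np.deg2rad([-45, -22.5, 0, 22.5, 45])              # 5 illustrative polarization modulations
meas = np.random.rand(5, 512, 512)                           # placeholder measurements
stokes = reconstruct_stokes(meas, angles)
```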