2D image-based 3D object retrieval is an important task in computer vision and big data management. Conventional image-based 3D object retrieval usually assumes that the query images come from a single domain. In real applications, however, 2D images may come from multiple domains (e.g., real images, sketches, and quick draws). This raises significant challenges for the task, since these 2D images exhibit a large domain gap with one another as well as a large modality gap with 3D objects. To address these issues, we propose an unsupervised Domain-Specific Alignment Network (DSAN) for multi-domain image-based 3D object retrieval. The proposed method reduces domain discrepancy through a domain-specific alignment network with multi-level moment matching, covering both first-order and second-order moments. Based on the observation that, for any given sample, different domain classifiers should output the same label, we also design a domain-specific classifier alignment module. To our knowledge, the proposed method is the first unsupervised work to align multi-domain 2D images with 3D objects in an end-to-end manner. We use the multi-domain dataset MDI3D to support research on this task, and extensive experimental results demonstrate the superiority of the proposed method.

CCS CONCEPTS
• Computing methodologies → Computer vision; Visual content-based indexing and retrieval; • Information systems → Multimedia and multimodal retrieval.
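
The moment matching and classifier alignment summarized in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`first_order_gap`, `second_order_gap`, `classifier_disagreement`), the squared-distance form of the moment terms, and the mean absolute difference used for classifier agreement are all assumptions chosen for illustration. The abstract only states that first- and second-order moments are matched and that different domain classifiers should agree on the same sample.

```python
import numpy as np


def first_order_gap(a, b):
    """Squared Euclidean distance between the feature means of two batches.

    a, b: arrays of shape (n_samples, n_features). Hypothetical helper.
    """
    diff = a.mean(axis=0) - b.mean(axis=0)
    return float(diff @ diff)


def second_order_gap(a, b):
    """Squared Frobenius distance between the feature covariances.

    Assumes n_features > 1 so that np.cov returns a matrix.
    """
    cov_a = np.cov(a, rowvar=False)
    cov_b = np.cov(b, rowvar=False)
    return float(np.sum((cov_a - cov_b) ** 2))


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def classifier_disagreement(logits_list):
    """Mean absolute difference between the class probabilities produced by
    every pair of domain-specific classifiers for the same samples.

    logits_list: list of arrays, each of shape (n_samples, n_classes).
    Returns 0.0 when fewer than two classifiers are given.
    """
    probs = [softmax(logits) for logits in logits_list]
    total, pairs = 0.0, 0
    for i in range(len(probs)):
        for j in range(i + 1, len(probs)):
            total += float(np.abs(probs[i] - probs[j]).mean())
            pairs += 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for 2D-image features and 3D-object (e.g., view) features.
    image_feats = rng.normal(loc=0.5, scale=1.0, size=(64, 8))
    object_feats = rng.normal(loc=0.0, scale=2.0, size=(64, 8))
    print("first-order gap:", first_order_gap(image_feats, object_feats))
    print("second-order gap:", second_order_gap(image_feats, object_feats))
```

In a training loop, terms of this kind would typically be added to the classification loss with weighting factors; how the paper weights or combines them is not stated in the abstract.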