“…Consequently, different types of approaches have been proposed to handle this problem [1], [2], [3], [4], and several works have attempted to recover 3D information from 2D images (rendered views [5], [6], [7], [8], [9], scenes [10], [11], [12], and sketches [13], [14], [15]). In addition, some cross-modal 3D retrieval methods [16], [17], [18] search for and match 3D models in databases, which reduces the difficulty of acquiring models, but these methods still fall short of human expectations in terms of accuracy and matching requirements.…”