“…Monocular visual scene understanding, which mainly focuses on high-level understanding of single image content, is a fundamental technology for many automatic applications, especially in the field of autonomous driving. Using only a single-view driving image, available vehicle parsing studies have covered popular topics starting from 2D vehicle detection [3,34,32,14,31,46], then 6D vehicle pose recovery [59,55,35,26,52,12,13,4,33], and finally vehicle shape reconstruction [10,28,50,20,2,24,15,29,36,61]. However, much less efforts are devoted to vehicle texture estimation.…”