“…However, conventional recommender systems are mainly developed using item IDs and textual information, which fail to leverage the important visual signals for recommendation. The rapid development of the computer vision field has significantly promoted various visual-based applications, such as image retrieval [25,53,26,43,52,5,50,49], visual understanding [42,28,6], and visual domain adaptation [29,27,41]. This has also largely facilitated studies in the fashion area.…”