Every day, more than 1.8 billion images are uploaded to Facebook, Instagram, Flickr, Snapchat, and WhatsApp [6]. The exponential growth of visual media has made quality assessment increasingly important for applications ranging from image acquisition, synthesis, restoration, and enhancement to image search and retrieval, storage, and recognition. There are two related but distinct classes of visual quality assessment techniques: image quality assessment (IQA) and image aesthetics assessment (IAA). As perceptual assessment tasks, subjective IQA and IAA share common underlying factors that affect user judgments, and they are similar in methodology (especially NR-IQA in the wild and IAA). Their emphases differ, however: IQA focuses on low-level defects, e.g. processing artefacts, noise, and blur, whereas IAA emphasizes abstract, higher-level concepts that capture the subjective aesthetic experience, e.g. established photographic rules encompassing lighting, composition, and color, as well as personalized factors such as personality, cultural background, age, and emotion.

IQA has been studied extensively over recent decades [3, 14, 22]. There are three main types of IQA methods: full-reference (FR), reduced-reference (RR), and no-reference (NR). Among these, NR-IQA is the most challenging, as it neither relies on reference images nor imposes strict assumptions on distortion types and levels. NR-IQA techniques can be further divided into those that predict a global image score [1, 2, 10, 17, 26] and patch-based methods [23, 25], to name a few of the more recent approaches.

Comparatively, IAA received less research attention than IQA until recently [5]. Pioneering IAA approaches trained aesthetics models on handcrafted features: earlier work used global and regional features [4, 12], while later work employed subject-focused features [16, 24] and generic descriptors
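To make the FR vs. NR distinction concrete, the sketch below computes PSNR, a classic full-reference metric (not a method proposed here): it needs the pristine reference image, which is precisely what NR-IQA methods must operate without. The toy images and noise level are illustrative assumptions.

```python
import numpy as np

def psnr(reference: np.ndarray, distorted: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio, a standard full-reference IQA metric.

    Requires access to the undistorted reference image; higher dB means
    the distorted image is closer to the reference.
    """
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy example: a uniform 8-bit image corrupted by mild Gaussian noise.
rng = np.random.default_rng(0)
ref = np.full((64, 64), 128, dtype=np.uint8)
noisy = np.clip(ref + rng.normal(0.0, 5.0, ref.shape), 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(ref, noisy):.1f} dB")
```

An NR-IQA method, by contrast, would have to predict a quality score from `noisy` alone, typically by learning statistics of natural, undistorted images.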