Automatic machine classification of concrete structural defects in images poses significant challenges because of the multitude of problems arising from the surface texture, such as the presence of stains, holes, colors, poster remains, graffiti, markings and paint, along with uncontrolled weather conditions and illumination. In this paper, we propose an interleaved deep artifacts-aware attention mechanism (iDAAM) to classify multi-target multi-class and single-class defects from structural defect images. Our novel architecture is composed of interleaved fine-grained dense modules (FGDM) and concurrent dual attention modules (CDAM) to extract local discriminative features from concrete defect images. FGDM helps to aggregate robust multi-layer information across a wide range of scales to describe visually similar overlapping defects. CDAM, in turn, selects multiple representations of highly localized overlapping defect features and encodes the crucial spatial regions from discriminative channels to address variations in texture, viewing angle, shape and size of overlapping defect classes. Within iDAAM, FGDM and CDAM are interleaved to extract salient discriminative features from multiple scales, forming an end-to-end trainable network that needs no preprocessing steps and makes the process fully automatic. Experimental results and extensive ablation studies on three large, publicly available concrete defect datasets show that our proposed approach outperforms current state-of-the-art methods.
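To make the idea of attending over channels and spatial regions at the same time more concrete, the following is a minimal sketch in plain Python. It is not the CDAM described above: the gating functions, the averaging of the two branches, and the names `channel_gate`, `spatial_gate` and `dual_attention` are all assumptions chosen for illustration, and there are no learned parameters.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_gate(fmap):
    """One weight per channel, from global average pooling (assumed design)."""
    return [
        sigmoid(sum(sum(row) for row in ch) / (len(ch) * len(ch[0])))
        for ch in fmap
    ]


def spatial_gate(fmap):
    """One weight per pixel, from the mean across channels (assumed design)."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [
        [sigmoid(sum(fmap[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]


def dual_attention(fmap):
    """Apply both gates in parallel and average the two gated branches.

    fmap is a nested list indexed as fmap[channel][row][col].
    Averaging the branches is an illustrative choice, not the paper's rule.
    """
    cg = channel_gate(fmap)
    sg = spatial_gate(fmap)
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [
        [
            [fmap[k][i][j] * (cg[k] + sg[i][j]) / 2.0 for j in range(w)]
            for i in range(h)
        ]
        for k in range(c)
    ]
```

The output has the same shape as the input, so a block like this could in principle sit between feature-extraction stages; how the actual modules are interleaved is specified only in the paper itself.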
Fashion landmark detection is a fundamental task in several fashion image analysis problems. The associated challenges, which involve non-rigid structures and variations in style and orientation, make it extremely hard to detect the landmarks accurately. In this paper, we propose the Appearance-Context network (ACNet), which captures both global and local contextual information by extending the axial attention mechanism. We design an axial-attention-augmented local appearance network and introduce a novel Global-Context aware axial attention module, which aggregates global features by attending to discriminative cues across the height, width and channel axes. The proposed ACNet architecture outperforms existing methods on two large-scale fashion landmark datasets.
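For readers unfamiliar with axial attention, the following is a minimal sketch of the generic idea in plain Python: self-attention is applied along one spatial axis at a time instead of over all pixels jointly. It is not ACNet. Identity query, key and value projections are assumed (no learned weights), only the width and height axes are covered (the channel axis mentioned above is omitted), and the names `attend` and `axial_attention` are hypothetical.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(seq):
    """Scaled dot-product self-attention over a 1-D sequence of vectors.

    Queries, keys and values are the inputs themselves (an assumption
    made to keep the sketch free of learned parameters).
    """
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        out.append([sum(wt * v[c] for wt, v in zip(weights, seq)) for c in range(d)])
    return out


def axial_attention(fmap):
    """Attend along the width axis, then along the height axis.

    fmap[i][j] is the feature vector at row i, column j.
    """
    rows = [attend(row) for row in fmap]              # width axis
    h, w = len(rows), len(rows[0])
    cols = [attend([rows[i][j] for i in range(h)])    # height axis
            for j in range(w)]
    return [[cols[j][i] for j in range(w)] for i in range(h)]
```

Because each position attends only within its own row and then its own column, the cost grows with H·W·(H + W) rather than (H·W)², which is the usual motivation for the axial factorization.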