Component-Based Attention for Large-Scale Trademark Retrieval

Tursun, Osman; Denman, Simon; Sivapalan, Sabesan; Sridharan, Sridha; Fookes, Clinton; Mau, Sandra

doi:10.1109/tifs.2019.2959921

Cited by 19 publications

(22 citation statements)

References 35 publications


“…Text removal methods are categorized into one-step [5] and two-step [8] based approaches. Here, related works on these two types of approach are discussed.…”

Section: Related Work (mentioning)

confidence: 99%

“…Early two-step approaches [4,9] are mainly based-on primitive hand-crafted text-detection and inpainting algorithms. A recent two-step approach with more accurate text localization is proposed by Tursun et al [8]. They proposed pixel level text segmentation for text localization, however, inpainting is implemented by replacing text pixels with the most-frequently occurring neighborhood background color around the text region.…”

Section: Related Work (mentioning)

confidence: 99%

“…The task has drawn the attention of the computer vision community due to its potential value for privacy protection [5,13], as text removal approaches automatically remove private information such as license plate numbers, addresses, and names from images shared via social media. It is also helpful for other applications including image editing, image restoration, [5,11] and image retrieval [8].…”

Section: Introduction (mentioning)

confidence: 99%


MTRNet: A Generic Scene Text Eraser

Tursun, Zeng, Denman et al. 2019

2019 International Conference on Document Analysis and Recognition (ICDAR)

Self Cite


Text removal algorithms have been proposed for unilingual scripts with regular shapes and layouts. However, to the best of our knowledge, a generic text removal method which is able to remove all or user-specified text regions regardless of font, script, language or shape is not available. Developing such a generic text eraser for real scenes is a challenging task, since it inherits all the challenges of multi-lingual and curved text detection and inpainting. To fill this gap, we propose a mask-based text removal network (MTRNet). MTRNet is a conditional adversarial generative network (cGAN) with an auxiliary mask. The introduced auxiliary mask not only makes the cGAN a generic text eraser, but also enables stable training and early convergence on a challenging large-scale synthetic dataset, initially proposed for text detection in real scenes. What's more, MTRNet achieves state-of-the-art results on several real-world datasets including ICDAR 2013, ICDAR 2017 MLT, and CTW1500, without being explicitly trained on this data, outperforming previous state-of-the-art methods trained directly on these datasets.



“…At present, most trademark retrieval methods are based on deep learning extract trademark features by a supervised way [7]. Perez et al proposed a retrieval of trademarks through the combined VGG network [1] with supervised training, Tursun et al [9] removed the text of trademarks and combined soft and hard attention mechanisms to direct attention to key information. Lan et al [10] proposed a method to extract uniform Local Binary Pattern (LBP) features from the feature map of each convolutional layer feature, and achieved good results in both METU and NPU trademark datasets.…”

Section: Introduction (mentioning)

confidence: 99%

Unsupervised Trademark Retrieval Method Based on Attention Mechanism

Cao, Huang, Dai et al. 2021

Sensors


Aiming at the high cost of data labeling and ignoring the internal relevance of features in existing trademark retrieval methods, this paper proposes an unsupervised trademark retrieval method based on attention mechanism. In the proposed method, the instance discrimination framework is adopted and a lightweight attention mechanism is introduced to allocate a more reasonable learning weight to key features. With an unsupervised way, this proposed method can obtain good feature representation of trademarks and improve the performance of trademark retrieval. Extensive comparative experiments on the METU trademark dataset are conducted. The experimental results show that the proposed method is significantly better than traditional trademark retrieval methods and most existing supervised learning methods. The proposed method obtained a smaller value of NAR (Normalized Average Rank) at 0.051, which verifies the effectiveness of the proposed method in trademark retrieval.


“…Later, LSTR using content-based image retrieval (CBIR) algorithms have been used thanks to it's efficiency and accuracy. Hand-crafted features based-on shape, color or texture were developed for early CBIR-LSTR systems [2,3]. With the rise of deep learning, off-the-shelf deep features have been applied for LSTR, demonstrating higher accuracy and efficiency compared to traditional hand-crafted features.…”

Section: Introduction (mentioning)

confidence: 99%

Learning Regional Attention Over Multi-Resolution Deep Convolutional Features For Trademark Retrieval

Tursun, Denman, Sridharan et al. 2021

2021 IEEE International Conference on Image Processing (ICIP)

Self Cite


Large-scale trademark retrieval is an important content-based image retrieval task. A recent study shows that off-the-shelf deep features aggregated with Regional-Maximum Activation of Convolutions (R-MAC) achieve state-of-the-art results. However, R-MAC suffers in the presence of background clutter/trivial regions and scale variance, and discards important spatial information. We introduce three simple but effective modifications to R-MAC to overcome these drawbacks. First, we propose the use of both sum and max pooling to minimise the loss of spatial information. We also employ domain-specific unsupervised soft-attention to eliminate background clutter and unimportant regions. Finally, we add multi-resolution inputs to enhance the scale-invariance of R-MAC. We evaluate these three modifications on the million-scale METU dataset. Our results show that all modifications bring non-trivial improvements, and surpass previous state-of-the-art results.

