Content-Based Image Retrieval (CBIR) is a highly active research field with numerous applications that are expanding beyond traditional CBIR methodologies. In this paper, a CBIR methodology is proposed to meet such demands. The query inputs of the proposed methodology are an image and a text: given an image, a user would like to retrieve a similar one with some modification described in text form, which we refer to as a text-modifier. The proposed methodology uses a set of neural networks that operate in feature space and perform feature composition in a single, known domain, namely the textual feature domain. ResNet is used to extract image features and an LSTM to extract text features from the query inputs. The methodology employs three single-hidden-layer non-linear feedforward networks in a cascading structure, labeled NetA, NetC, and NetB. NetA maps image features to corresponding textual features. NetC composes the textual features produced by NetA with the text-modifier features to form the textual features of the target image. NetB maps these target textual features to target image features, which are used to retrieve the target image from the image base by cosine similarity. The proposed architecture was tested with ResNet-18, ResNet-50, and ResNet-152 as image feature extractors. The results are promising and, to our knowledge, competitive with the most recent approaches, as listed in Section 5.
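The NetA → NetC → NetB cascade described above can be sketched in a few lines. The following is a minimal NumPy illustration, not the authors' trained model: the feature dimensions, the concatenation-based composition in NetC, and the random untrained weights are all assumptions standing in for real ResNet/LSTM features and learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(in_dim, hid_dim, out_dim):
    """A single-hidden-layer non-linear feedforward net (random, untrained weights)."""
    W1 = rng.standard_normal((in_dim, hid_dim)) * 0.1
    W2 = rng.standard_normal((hid_dim, out_dim)) * 0.1
    return lambda x: np.maximum(x @ W1, 0.0) @ W2  # ReLU hidden layer

IMG_DIM, TXT_DIM, HID = 512, 300, 256  # hypothetical feature sizes

net_a = mlp(IMG_DIM, HID, TXT_DIM)      # image features -> textual features
net_c = mlp(2 * TXT_DIM, HID, TXT_DIM)  # compose query text + text-modifier features
net_b = mlp(TXT_DIM, HID, IMG_DIM)      # target textual features -> image features

def retrieve(query_img_feat, modifier_feat, image_base):
    """Run the cascade, then rank the image base by cosine similarity."""
    txt = net_a(query_img_feat)
    target_txt = net_c(np.concatenate([txt, modifier_feat]))
    target_img = net_b(target_txt)
    sims = image_base @ target_img / (
        np.linalg.norm(image_base, axis=1) * np.linalg.norm(target_img) + 1e-9)
    return int(np.argmax(sims))  # index of the best-matching image

# Toy usage with random stand-ins for ResNet image features and LSTM text features:
image_base = rng.standard_normal((100, IMG_DIM))
best = retrieve(rng.standard_normal(IMG_DIM), rng.standard_normal(TXT_DIM), image_base)
```

In the actual methodology each network would be trained so that NetA's output lands in the same textual feature space as the LSTM embeddings, making the composition in NetC well-defined.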