2022 | DOI: 10.3390/s22093433

Effective Pre-Training Method and Its Compositional Intelligence for Image Captioning

Abstract: As the performance of deep learning models has grown, their parameter counts have increased exponentially. More parameters mean more computation and longer training time, i.e., higher training cost. To reduce this cost, we propose Compositional Intelligence (CI), a reuse method that combines models pre-trained on different tasks. Because CI builds on well-trained models, good performance and low training cost can be expected on the target task. We applied the…
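The abstract describes CI only at a high level, so the following is a minimal sketch of the general idea of composing pre-trained models for captioning, not the paper's actual architecture. Assumptions: torchvision's ImageNet-pre-trained ResNet-50 stands in for the reused vision model, a GRU stands in for the reused language model, and `CaptionComposer`, `bridge`, and all sizes are illustrative names and choices, not from the paper.

```python
# A minimal sketch of the "compose pre-trained models" idea from the
# abstract, NOT the paper's actual architecture. See assumptions above.
import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

class CaptionComposer(nn.Module):
    def __init__(self, vocab_size=10000, hidden=512):
        super().__init__()
        # Reused component 1: a pre-trained image encoder, frozen so that
        # only the small new parts are trained (the training-cost saving).
        backbone = resnet50(weights=ResNet50_Weights.DEFAULT)  # downloads weights on first use
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])  # drop the fc head
        for p in self.encoder.parameters():
            p.requires_grad = False
        # New trainable glue mapping image features to the decoder state.
        self.bridge = nn.Linear(2048, hidden)
        # Reused component 2 would be a pre-trained language model; a GRU
        # decoder keeps this sketch self-contained.
        self.embed = nn.Embedding(vocab_size, hidden)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, images, tokens):
        feats = self.encoder(images).flatten(1)           # (B, 2048)
        h0 = torch.tanh(self.bridge(feats)).unsqueeze(0)  # (1, B, hidden)
        out, _ = self.decoder(self.embed(tokens), h0)     # (B, T, hidden)
        return self.head(out)                             # (B, T, vocab)

model = CaptionComposer()
logits = model(torch.randn(2, 3, 224, 224), torch.randint(0, 10000, (2, 12)))
print(logits.shape)  # torch.Size([2, 12, 10000])
```

Only the bridge, embedding, decoder, and output head are updated during training; the frozen encoder is the "well-trained model" being reused.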

Cited by 2 publications (2 citation statements) | References 41 publications
“…Our best model was trained with the capsule network and Inception-V3 as a feature extractor, with caption enrichment by an external contextual description. The results are the basis for future research that will generate more conceptual and specific descriptions by considering emotions in captions and using transformers in the decoder, since this network has extraordinary performance in image captioning [54].…”
Section: Discussion (mentioning)
confidence: 99%
“…These pre-trained models have been trained on large and diverse datasets and learned highly discriminative features useful for a wide range of image classification tasks. By reusing pre-trained models, we can reduce the computational cost and training time required for a new task and achieve improved performance [18], especially in the medical domain where large, diverse datasets are scarce.…”
Section: Introduction (mentioning)
confidence: 99%
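As an illustration of the reuse pattern this second passage describes, here is a minimal fine-tuning sketch, assuming torchvision's ImageNet-pre-trained ResNet-18 and a hypothetical 5-class target task; neither the backbone choice nor the class count comes from either paper.

```python
# A minimal sketch of the reuse pattern the quoted passage describes:
# fine-tune a pre-trained classifier on a new task by replacing the final
# layer and training only that layer. The 5-class head is an arbitrary
# illustrative assumption, not a detail from either paper.
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

model = resnet18(weights=ResNet18_Weights.DEFAULT)  # features learned on ImageNet
for p in model.parameters():
    p.requires_grad = False                         # freeze the reused backbone
model.fc = nn.Linear(model.fc.in_features, 5)       # new task head, trainable by default

# Training proceeds as usual, but the optimizer only updates the small head,
# which is where the savings in computation and training time come from.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
print(model(torch.randn(4, 3, 224, 224)).shape)     # torch.Size([4, 5])
```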