Connecting Vision and Language plays an essential role in Generative Intelligence. For this reason, large research efforts have been devoted to image captioning, i.e., describing images with syntactically and semantically meaningful sentences. Since 2015, the task has generally been addressed with pipelines composed of a visual encoder and a language model for text generation. Over these years, both components have evolved considerably through the exploitation of object regions and attributes, the introduction of multi-modal connections, fully-attentive approaches, and BERT-like early-fusion strategies. However, despite the impressive results, research in image captioning has not yet reached a conclusive answer. This work aims to provide a comprehensive overview of image captioning approaches, from visual encoding and text generation to training strategies, datasets, and evaluation metrics. To this end, we quantitatively compare many relevant state-of-the-art approaches to identify the most impactful technical innovations in architectures and training strategies. Moreover, many variants of the problem and its open challenges are discussed. The final goal of this work is to serve as a tool for understanding the existing literature and highlighting future directions for a research area where Computer Vision and Natural Language Processing can find an optimal synergy.
Wind turbine upgrades have recently been spreading in the wind energy industry as a way to optimize the efficiency of wind kinetic energy conversion. These interventions have material and labor costs; therefore, it is essential to estimate the production improvement realistically. Furthermore, retrofitting wind turbines sited in complex environments might exacerbate the stress conditions to which they are subjected and consequently might affect their residual life. In this work, a two-step upgrade of a multimegawatt wind turbine from a wind farm sited in complex terrain is considered. First, vortex generators and passive flow control devices have been installed. Second, the management of the revolutions per minute has been optimized. A general method is formulated for assessing wind turbine power upgrades using operational data. The method is based on the study of the residuals between the measured power output and a judicious model of the power output itself, before and after the upgrade. Properly selecting the model is therefore fundamental. For this reason, an automatic feature selection algorithm is adopted, based on stepwise multivariate regression. This allows identifying the most meaningful input variables for a multivariate linear model whose target is the power of the upgraded wind turbine. For the test case of interest, the adopted upgrade is estimated to increase the annual energy production by 2.6 ± 0.1%. The aerodynamic and control upgrades are estimated to contribute 1.8% and 0.8%, respectively, to the production improvement.
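The residual-based assessment described above can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic data, the four candidate input columns, the greedy forward variant of stepwise selection, the 1% RMSE stopping threshold, and the normalization of the residuals by the modeled power are all assumptions made here for illustration. A linear model of the target power is fitted on pre-upgrade data only, and the upgrade effect is read from the shift in relative residuals between a pre-upgrade validation set and a post-upgrade set.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_data(n, gain=0.0):
    """Synthetic stand-in for operational data (hypothetical columns).

    Columns 0 and 1 mimic informative inputs (e.g. power of nearby
    turbines, in kW); columns 2 and 3 are irrelevant noise channels.
    """
    X = np.column_stack([
        rng.uniform(200, 2000, n),
        rng.uniform(200, 2000, n),
        rng.normal(0, 1, n),
        rng.normal(0, 1, n),
    ])
    y = (1 + gain) * (0.6 * X[:, 0] + 0.3 * X[:, 1]) + rng.normal(0, 20, n)
    return X, y


def fit_linear(X, y):
    """Ordinary least squares with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    return coef[0] + X @ coef[1:]


def forward_stepwise(X, y, min_gain=0.01):
    """Greedy forward selection: repeatedly add the column that lowers
    the RMSE most, stopping when the relative gain is below min_gain."""
    selected = []
    best_rmse = np.std(y)  # RMSE of the intercept-only model
    while len(selected) < X.shape[1]:
        scores = []
        for j in range(X.shape[1]):
            if j in selected:
                continue
            cols = selected + [j]
            coef = fit_linear(X[:, cols], y)
            rmse = np.sqrt(np.mean((y - predict(coef, X[:, cols])) ** 2))
            scores.append((rmse, j))
        rmse, j = min(scores)
        if best_rmse - rmse < min_gain * best_rmse:
            break
        selected.append(j)
        best_rmse = rmse
    return selected


def relative_residual(coef, X, y):
    """Percent difference between measured and modeled energy."""
    y_hat = predict(coef, X)
    return 100.0 * np.sum(y - y_hat) / np.sum(y_hat)


# Pre-upgrade data: one set for training, one for validation.
X_train, y_train = make_data(2000)
X_valid, y_valid = make_data(2000)
# Post-upgrade data with an assumed 2.6% production gain.
X_post, y_post = make_data(2000, gain=0.026)

selected = forward_stepwise(X_train, y_train)
coef = fit_linear(X_train[:, selected], y_train)

delta_pre = relative_residual(coef, X_valid[:, selected], y_valid)
delta_post = relative_residual(coef, X_post[:, selected], y_post)
gain_estimate = delta_post - delta_pre

print("selected columns:", selected)
print(f"estimated production gain: {gain_estimate:.2f}%")
```

Subtracting the pre-upgrade validation residual from the post-upgrade one removes any systematic bias of the model itself, so only the change attributable to the upgrade remains. With real operational data, the quality of the estimate depends on how well the selected inputs explain the target power before the upgrade.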