Bridging robot action sequences and their natural language captions is an important task for increasing the explainability of human-assisting robots, a rapidly evolving field. In this paper, we propose a system that generates natural language captions describing the behaviors of human-assisting robots. The system describes robot actions from robot observations, namely histories from actuator systems and cameras, as a step toward end-to-end bridging between robot actions and natural language captions. Two problems make it challenging to apply existing sequence-to-sequence models to this mapping: 1) it is hard to prepare a large-scale dataset for every kind of robot and environment, and 2) there is a gap between the number of samples obtained from robot action observations and the number of words in the generated captions. We introduce unsupervised segmentation based on K-means clustering to unify typical robot observation patterns into classes. This makes it possible for the network to learn the relationship from a small amount of data. Moreover, we use a chunking method based on byte-pair encoding (BPE) to close the gap between the number of robot action observation samples and the number of words in a caption. We also apply an attention mechanism to the segmentation task. Experimental results show that the proposed model based on unsupervised learning generates better descriptions than other methods. We also show that the attention mechanism did not work well in our low-resource setting.
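The two preprocessing ideas named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `kmeans` and `bpe_merge`, the two-dimensional toy frames, and the deterministic centroid initialization are all assumptions made for illustration. K-means maps each observation frame to a class ID, and BPE-style merging then fuses frequent adjacent IDs into longer chunks, which shortens the observation sequence toward the length of a caption.

```python
from collections import Counter


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm. Centroids start at the first k points
    (an illustrative, deterministic choice, not taken from the paper)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def bpe_merge(seq, num_merges):
    """Repeatedly fuse the most frequent adjacent pair of symbols into one chunk."""
    seq = [(s,) for s in seq]  # every symbol is a tuple of class IDs
    for _ in range(num_merges):
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        (left, right), count = pairs.most_common(1)[0]
        if count < 2:
            break  # nothing repeats, so merging would not compress further
        merged, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == left and seq[i + 1] == right:
                merged.append(left + right)
                i += 2
            else:
                merged.append(seq[i])
                i += 1
        seq = merged
    return seq


# Hypothetical toy "observation frames" alternating between two patterns.
frames = [(0.0, 0.1), (1.0, 0.9), (0.1, 0.0), (0.9, 1.0),
          (0.0, 0.0), (1.0, 1.0), (0.1, 0.1), (0.9, 0.9)]
labels = kmeans(frames, k=2)
chunks = bpe_merge(labels, num_merges=1)
print(labels)  # class ID per frame
print(chunks)  # shorter sequence of merged chunks
```

On this toy input, the eight frames become four chunks after a single merge. In a real setting, the number of clusters and the number of merges would be tuning parameters.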
An advertising slogan is a sentence that expresses a product or a work of art in a straightforward manner and is used for advertising and publicity. A slogan that moves consumers and attracts their interest can significantly influence sales. Although rhetorical techniques in slogans are known to improve the effectiveness of advertising, little attention has been devoted to analyzing or automatically generating sentences that use these techniques. Therefore, we constructed a large corpus of slogans and revealed its linguistic characteristics in terms of basic statistics and rhetorical devices. We also focused on antitheses, which are used relatively often and have a specific sentence structure and lexical constraints. Generating a slogan that contains an antithesis requires extracting sentence structures, known as templates, as well as knowledge of word pairs with semantic contrast. We therefore analyzed the structure of antitheses to extract these sentence structures and this lexical knowledge. Despite its simple architecture, the proposed method exceeds a comparable method in prediction accuracy and efficiency. Lexical knowledge that is not available in existing dictionaries was also extracted.