Abstractive summarization lets internet users grasp an article by reading its summary instead of the full text. However, techniques for summarizing articles that combine text and images suffer from the semantic gap between vision and language. They focus on aggregating features and neglect the heterogeneity of each modality. At the same time, failing to consider the intrinsic data properties within each modality and the semantic information carried by cross-modal correlations results in poor-quality learned representations. Therefore, we propose a novel Inter- and Intra-modal Contrastive Hybrid (ITCH) learning framework, which learns to automatically align the multimodal information and maintains the semantic consistency of input/output flows. Moreover, ITCH can be used as a component to make a model suitable for both supervised and unsupervised learning approaches. Experiments on two public datasets, MMS and MSMO, show that ITCH outperforms the current baselines.
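To make the idea of inter-modal contrastive alignment concrete, here is a minimal sketch of an InfoNCE-style loss between text and image embeddings. It is not the ITCH objective itself, which the abstract does not specify; the function names, the temperature value, and the toy embeddings are all assumptions chosen for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(text_embs, image_embs, temperature=0.1):
    """Average text-to-image InfoNCE loss.

    Assumption: text_embs[i] and image_embs[i] form the matched (positive)
    pair; every other image in the batch acts as a negative.
    """
    losses = []
    for i, t in enumerate(text_embs):
        logits = [cosine(t, v) / temperature for v in image_embs]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)

# Hypothetical 2-D embeddings, for illustration only.
text = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]    # each image matches its text
shuffled = [[0.1, 0.9], [0.9, 0.1]]   # pairs deliberately mismatched

loss_aligned = info_nce(text, aligned)
loss_shuffled = info_nce(text, shuffled)
```

In this toy setup the loss is lower when matched text and image embeddings point in similar directions, which is the behavior a contrastive alignment term is meant to reward.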
Video highlights, which are composed of interesting or meaningful shots such as funny shots, are popular with audiences. However, highlight shots are currently edited manually by video editors, which is inconvenient and extremely time-consuming. A way to help video editors locate highlights more efficiently is therefore essential. Since interesting or meaningful highlights usually evoke strong sentiments, we propose a sentiment analysis model that automatically recognizes the sentiments of video highlights from time-sync comments. Because the comments are synchronized with video playback time, the model detects sentiment information in the time series of user comments. Moreover, the model includes a sentimental intensity calculation method that quantifies the sentiment of each shot. The experiments show that our approach improves the F1 score by 12.8% and the overlapped number by 8.0% compared with the best existing method in extracting the sentiments of highlights and obtaining sentimental intensities, which helps video editors edit video highlights efficiently.
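As a rough illustration of scoring shots from time-sync comments, the sketch below bins comments into fixed time windows and sums lexicon-based sentiment magnitudes per window. The abstract does not describe the actual intensity formula, so the toy lexicon, the 10-second window, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy sentiment lexicon; a real system would use a learned model.
LEXICON = {"haha": 1.0, "lol": 0.8, "amazing": 0.9, "sad": -0.7, "boring": -0.4}

def shot_intensity(comments, window=10.0):
    """Sum absolute sentiment scores of comments falling in each time window.

    comments: list of (timestamp_seconds, text) pairs.
    Returns a dict mapping window index to accumulated intensity.
    """
    intensity = defaultdict(float)
    for ts, text in comments:
        score = sum(abs(LEXICON.get(word, 0.0)) for word in text.lower().split())
        intensity[int(ts // window)] += score
    return dict(intensity)

def top_windows(intensity, k=1):
    """Return the k window indices with the highest intensity."""
    return sorted(intensity, key=intensity.get, reverse=True)[:k]

comments = [
    (3.0, "haha lol"),
    (4.5, "haha"),
    (12.0, "boring"),
    (25.0, "amazing haha"),
]
scores = shot_intensity(comments)
candidates = top_windows(scores, k=2)
```

Using the absolute value treats strongly negative and strongly positive reactions alike as signs of a highlight; that is a design choice of this sketch, not a claim about the paper's method.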
Intelligent monitoring of tool wear status and wear prediction are important factors in the intelligent development of the modern machinery industry. Many scholars have applied deep learning methods to tool wear prediction with some success. However, because the signal data are unstable and variable, some neural network models suffer from gradient decay between layers. Moreover, most methods focus on feature selection for the input data but ignore the degree to which different features influence tool wear. To solve these problems, this paper proposes a dual-stage attention model for tool wear prediction. A CNN-BiGRU-attention network is designed, which introduces self-attention to extract deep features and emphasize the more important ones. IndyLSTM is used to construct a stable network that addresses the gradient decay problem between layers. In addition, an attention mechanism is added to the network to capture the important information in the output sequence, which can improve prediction accuracy. An experimental study of tool wear prediction in a dry milling operation demonstrates the viability of the method. Comparison and analysis using regression evaluation metrics show that the proposed method can effectively characterize the degree of tool wear, reduce prediction errors, and achieve good prediction results.
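The role of attention over an output sequence can be sketched with plain dot-product attention pooling: each time step receives a softmax weight, and the weighted sum becomes the summary vector. This is a generic illustration, not the paper's dual-stage architecture; the `query` vector stands in for parameters that a real model would learn, and the feature values are made up.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, query):
    """Weight time steps by dot-product relevance to `query`, then sum.

    features: list of per-time-step feature vectors (e.g. recurrent outputs).
    query: scoring vector; hypothetical here, learned in a real model.
    Returns (weights, pooled_vector).
    """
    scores = [sum(f * q for f, q in zip(step, query)) for step in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [
        sum(w * step[d] for w, step in zip(weights, features))
        for d in range(dim)
    ]
    return weights, pooled

# Hypothetical sequence of three time steps with two features each.
features = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1]]
query = [1.0, 1.0]
weights, pooled = attention_pool(features, query)
```

Here the middle time step scores highest against the query, so it contributes most to the pooled vector.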
GPS trajectories inevitably contain errors caused by weather or other environmental variables. Existing trajectory repairing methods employ Kalman filters or sequential data cleaning methods. The Kalman filter and its variants change all observed measurements, even though most measurements are generally accurate to begin with. Sequential data cleaning methods are mainly applied to one-dimensional data sequences, and their performance degrades on multi-dimensional trajectories because the features of such trajectories are not fully utilized. To address these issues, we propose to repair GPS trajectories with movement tendencies, speed change tendency, travel distance tendency and repair distance tendency. We formalize tendency-based trajectory repairing and propose an exact solution that finds the repair minimizing the movement tendency score. We then propose high-quality candidate selection and dynamic error range estimation to improve the efficiency and effectiveness of the exact solution. Experiments on three data sets demonstrate the superiority of our proposal.
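The contrast with Kalman filtering, which alters every measurement, can be illustrated with a much simpler heuristic that touches only suspicious points. The sketch below flags a point when the speed implied from the previous accepted point exceeds a threshold and replaces it by linear interpolation toward the next point. This is a swapped-in baseline for illustration, not the tendency-based exact solution described in the abstract; planar coordinates in meters, the `max_speed` threshold, and the function name are assumptions.

```python
import math

def repair_by_speed(points, max_speed):
    """Repair only points whose implied speed is implausible.

    points: list of (t, x, y), t in seconds, x and y in meters
            (planar coordinates assumed for simplicity).
    max_speed: hypothetical plausibility threshold in meters per second.
    A flagged final point is left unchanged because it has no successor.
    """
    repaired = [points[0]]
    for i in range(1, len(points)):
        t, x, y = points[i]
        pt, px, py = repaired[-1]
        speed = math.hypot(x - px, y - py) / (t - pt)
        if speed > max_speed and i + 1 < len(points):
            # Interpolate between the last accepted point and the next one.
            nt, nx, ny = points[i + 1]
            ratio = (t - pt) / (nt - pt)
            x, y = px + ratio * (nx - px), py + ratio * (ny - py)
        repaired.append((t, x, y))
    return repaired

# Hypothetical track with one obvious outlier at t = 2.
track = [(0, 0.0, 0.0), (1, 10.0, 0.0), (2, 500.0, 0.0), (3, 30.0, 0.0)]
fixed = repair_by_speed(track, max_speed=50.0)
```

Accurate measurements pass through unchanged, which mirrors the motivation stated above. The heuristic breaks down when consecutive points are both erroneous, since it then interpolates toward bad data.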