Rapid and precise detection and classification of vehicles are vital for intelligent transportation systems (ITSs). However, due to the small gaps between vehicles on the road and the interference features present in photos or video frames containing vehicle images, it is difficult to detect and identify vehicle types quickly and precisely. To solve this problem, a new vehicle detection and classification model, named YOLOv4_AF, is proposed in this paper, based on an optimization of the YOLOv4 model. In the proposed model, an attention mechanism is utilized to suppress the interference features of images through both the channel dimension and the spatial dimension. In addition, a modification of the Feature Pyramid Network (FPN) part of the Path Aggregation Network (PAN) utilized by YOLOv4 is applied in order to further enhance the effective features through down-sampling. This way, objects can be steadily positioned in the 3D space and the object detection and classification performance of the model can be improved. The results, obtained through experiments conducted on two public data sets, demonstrate that the proposed YOLOv4_AF model outperforms both the original YOLOv4 model and two other state-of-the-art models, Faster R-CNN and EfficientDet, in terms of mean average precision (mAP) and F1 score, achieving respective values of 83.45% and 0.816 on the BIT-Vehicle data set, and 77.08% and 0.808 on the UA-DETRAC data set.
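As an illustration of the channel-plus-spatial attention the abstract describes, the following is a minimal CBAM-style sketch in PyTorch; the layer sizes, reduction ratio, and names are assumptions for illustration, not the authors' actual YOLOv4_AF implementation.

```python
# Illustrative channel + spatial attention block of the kind the abstract
# describes (CBAM-style); sizes and names are assumptions, not YOLOv4_AF code.
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel attention: squeeze spatial dims, excite per-channel weights.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: 7x7 conv over channel-pooled maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)   # channel weighting
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))          # spatial weighting
```

Suppressing interference features in both dimensions this way re-weights the feature map before it reaches the detection head, which is the mechanism the abstract credits for the improved robustness.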
The automatic generation of a text summary is the task of producing a short summary of a relatively long text document by capturing its key information. In the past, supervised statistical machine learning was widely used for this Automatic Text Summarization (ATS) task, but because of its high dependence on the quality of text features, the generated summaries lack accuracy and coherence, and the computational power involved and the performance achieved could not easily meet current needs. This paper proposes four novel ATS models with a Sequence-to-Sequence (Seq2Seq) structure, utilizing an attention-based bidirectional Long Short-Term Memory (LSTM), with added enhancements for increasing the correlation between the generated summary and the source text, handling unregistered (out-of-vocabulary) words, suppressing repeated words, and preventing the propagation of cumulative errors in generated summaries. Experiments conducted on two public data sets confirmed that the proposed ATS models indeed achieve better performance than the baselines and some of the state-of-the-art models considered.
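To make the core architecture concrete, here is a minimal sketch of an attention-based bidirectional-LSTM Seq2Seq summarizer; dimensions, the dot-product attention, and teacher forcing are illustrative assumptions, not the paper's exact design (its enhancements for out-of-vocabulary and repeated words are omitted).

```python
# Minimal attention-based BiLSTM Seq2Seq sketch; an assumed baseline shape,
# not the paper's four enhanced ATS models.
import torch
import torch.nn as nn

class Seq2SeqAttn(nn.Module):
    def __init__(self, vocab: int, emb: int = 128, hid: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.encoder = nn.LSTM(emb, hid, bidirectional=True, batch_first=True)
        self.decoder = nn.LSTMCell(emb + 2 * hid, 2 * hid)
        self.out = nn.Linear(2 * hid, vocab)

    def forward(self, src, tgt):
        enc, _ = self.encoder(self.embed(src))              # (B, S, 2H)
        B = src.size(0)
        h = enc.new_zeros(B, enc.size(-1))
        c = torch.zeros_like(h)
        ctx = torch.zeros_like(h)
        logits = []
        for t in range(tgt.size(1)):
            e = self.embed(tgt[:, t])                       # teacher forcing
            h, c = self.decoder(torch.cat([e, ctx], dim=-1), (h, c))
            # Dot-product attention over the encoder states.
            score = torch.bmm(enc, h.unsqueeze(-1)).squeeze(-1)   # (B, S)
            ctx = torch.bmm(score.softmax(-1).unsqueeze(1), enc).squeeze(1)
            logits.append(self.out(h + ctx))
        return torch.stack(logits, dim=1)                   # (B, T, vocab)
```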
In conventional profile monitoring problems, profiles for products or process runs are assumed to have the same length. Statistical monitoring cannot be implemented until a complete profile is obtained. However, in certain cases, a single profile may require several days to generate, so it is important to monitor the profile trajectory to detect unexpected changes during the long processing cycle. Motivated by an ingot growth process in semiconductor manufacturing, we propose a method for monitoring growth profile trajectories of unequal lengths. The profiles are first aligned using the dynamic time warping algorithm and then averaged to generate a baseline. Online monitoring of trajectories is performed based on incomplete growth profiles. Both simulations and an actual application are used to demonstrate the use of the proposed method.
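The alignment step rests on dynamic time warping (DTW). As a concrete illustration, below is a textbook DTW distance computation in Python; it shows how profiles of unequal length are compared, and is not the authors' monitoring code.

```python
# Textbook dynamic time warping (DTW) sketch, the alignment primitive the
# abstract uses before averaging profiles into a baseline.
import numpy as np

def dtw(a: np.ndarray, b: np.ndarray) -> float:
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Allowed steps: match, insertion, deletion.
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    return D[n, m]  # alignment distance between the two profiles

# Example: profiles of unequal length tracing similar trajectories.
print(dtw(np.array([0.0, 1.0, 2.0, 3.0]), np.array([0.0, 2.0, 3.0])))
```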
Object detection and image recognition are among the most significant and challenging branches of computer vision. The rapid development of unmanned driving technology has made the detection and recognition of traffic signs crucial. Affected by diverse factors such as lighting, the presence of small objects, and complicated backgrounds, the results of traditional traffic sign detection technology are not satisfactory. To solve this problem, this paper proposes two novel traffic sign detection models, called YOLOv5-DH and YOLOv5-TDHSA, based on the YOLOv5s model with the following improvements (YOLOv5-DH uses only the second improvement): (1) replacing the last layer of the ‘Conv + Batch Normalization + SiLU’ (CBS) structure in the YOLOv5s backbone with a transformer self-attention module (T in the YOLOv5-TDHSA name), and adding a similar module to the last layer of its neck, so that the image information can be used more comprehensively; (2) replacing the YOLOv5s coupled head with a decoupled head (DH in both models’ names) so as to increase the detection accuracy and speed up convergence; and (3) adding a small-object detection layer (S in the YOLOv5-TDHSA name) and an adaptive anchor (A in the YOLOv5-TDHSA name) to the YOLOv5s neck to improve the detection of small objects. Experiments conducted on two public datasets demonstrate that both proposed models perform better than the original YOLOv5s model and three other state-of-the-art models (Faster R-CNN, YOLOv4-Tiny, and YOLOv5n) in terms of mean average precision (mAP) and F1 score, achieving mAP values of 77.9% and 83.4% and F1 scores of 0.767 and 0.811 on the TT100K dataset, and mAP values of 68.1% and 69.8% and F1 scores of 0.71 and 0.72 on the CCTSDB2021 dataset, respectively, for YOLOv5-DH and YOLOv5-TDHSA. This was achieved, however, at the expense of both proposed models having a larger size, more parameters, and slower processing speed than YOLOv5s, YOLOv4-Tiny, and YOLOv5n, surpassing only Faster R-CNN in this regard. The results also confirm that incorporating both the T and SA improvements into YOLOv5s leads to further enhancement, represented by the YOLOv5-TDHSA model, which is superior to the other proposed model, YOLOv5-DH, which incorporates only one of the improvements (i.e., DH).
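The decoupled head (the DH improvement shared by both models) can be sketched as follows: classification and box regression run through separate branches rather than one coupled convolution. Channel counts, anchor count, and names below are assumptions for illustration, not the YOLOv5-DH implementation.

```python
# Illustrative decoupled detection head: separate classification and
# regression branches after a shared 1x1 stem. Assumed sizes, not YOLOv5-DH's.
import torch
import torch.nn as nn

class DecoupledHead(nn.Module):
    def __init__(self, in_ch: int, num_classes: int, num_anchors: int = 3):
        super().__init__()
        self.stem = nn.Conv2d(in_ch, in_ch, 1)
        self.cls_branch = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(in_ch, num_anchors * num_classes, 1),
        )
        self.reg_branch = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(in_ch, num_anchors * 5, 1),  # 4 box coords + objectness
        )

    def forward(self, x):
        x = self.stem(x)
        return self.cls_branch(x), self.reg_branch(x)
```

Separating the branches removes the conflict between the classification and localization objectives, which is the usual rationale for the accuracy and convergence gains the abstract reports.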
Remote sensing image target object detection and recognition are widely used in both military and civil fields. Many models have been proposed for this purpose, but their effectiveness on target object detection in remote sensing images is not ideal, due to the influence of climate conditions, obstacles and confusing objects present in images, and image clarity, along with the associated problems of small-target and multi-target detection and recognition. Therefore, accurately detecting target objects in such images is an urgent problem to be solved. To this end, a novel model, called YOLOv4_CE, is proposed in this paper, based on the classical YOLOv4 model, with improvements resulting from replacing the backbone feature-extraction network with a ConvNeXt network, replacing the Complete Intersection over Union (CIoU) loss with the Efficient Intersection over Union (EIoU) loss, and adding a coordinate attention mechanism to YOLOv4, so as to improve the remote sensing image detection capabilities. The results, obtained through experiments conducted on two open data sets, demonstrate that the proposed YOLOv4_CE model outperforms both the original YOLOv4 model and four other state-of-the-art models, namely Faster R-CNN, Gliding Vertex, Oriented R-CNN, and EfficientDet, in terms of mean average precision (mAP) and F1 score, achieving respective values of 95.03% and 0.933 on the NWPU VHR-10 data set, and 95.89% and 0.937 on the RSOD data set.
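The EIoU loss the abstract swaps in for CIoU combines an IoU term with a normalized center-distance penalty and separate width and height penalties. Below is a minimal sketch; the (x1, y1, x2, y2) box format and function name are assumptions for illustration, not the authors' code.

```python
# Minimal Efficient IoU (EIoU) loss sketch: 1 - IoU, plus center-distance,
# width, and height penalties, each normalized by the enclosing box.
import torch

def eiou_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7):
    # Intersection and union of (x1, y1, x2, y2) boxes.
    lt = torch.max(pred[..., :2], target[..., :2])
    rb = torch.min(pred[..., 2:], target[..., 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Smallest enclosing box, used to normalize the penalties.
    enc = torch.max(pred[..., 2:], target[..., 2:]) - torch.min(
        pred[..., :2], target[..., :2])
    enc_w, enc_h = enc[..., 0], enc[..., 1]

    # Center-distance, width, and height penalties.
    c_pred = (pred[..., :2] + pred[..., 2:]) / 2
    c_tgt = (target[..., :2] + target[..., 2:]) / 2
    rho2 = ((c_pred - c_tgt) ** 2).sum(-1)
    w_p, h_p = pred[..., 2] - pred[..., 0], pred[..., 3] - pred[..., 1]
    w_t, h_t = target[..., 2] - target[..., 0], target[..., 3] - target[..., 1]
    loss = (1 - iou
            + rho2 / (enc_w ** 2 + enc_h ** 2 + eps)
            + (w_p - w_t) ** 2 / (enc_w ** 2 + eps)
            + (h_p - h_t) ** 2 / (enc_h ** 2 + eps))
    return loss.mean()
```

Penalizing width and height errors directly, rather than through CIoU's aspect-ratio term, gives faster and more accurate box regression, which is the usual motivation for this substitution.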