To navigate safely and comfortably in complex urban traffic, an autonomous vehicle must make multi-modal predictions of the future trajectories of the various traffic participants around it, taking the continuous movement trends and inertia of the surrounding traffic agents into account. At present, most trajectory prediction methods focus on predicting the future behavior of traffic agents but give limited consideration to how those agents respond to the future behavior of the ego-agent. Moreover, such methods typically predict the trajectories of only a single type of agent, which makes it impossible to learn the interactions between traffic agents in a complex environment. In this paper, we propose LSTGHP, a graph-based trajectory prediction model for heterogeneous traffic agents that consists of three parts: (1) a layered spatio-temporal graph module; (2) an ego-agent motion module; and (3) a trajectory prediction module. Together, these modules realize multi-modal prediction of the future trajectories of traffic agents with different semantic categories in the scene. To evaluate its performance, we collected trajectory datasets of heterogeneous traffic agents in a time-varying, highly dynamic urban intersection environment in which vehicles, bicycles, and pedestrians interact with one another. Experimental results show that our model improves prediction accuracy when agents interact at close range and achieves lower prediction error than previous methods on the trajectory prediction of heterogeneous traffic agents.
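As a rough illustration of the kind of per-category encoding and agent interaction described above, the following sketch builds one temporal encoder per semantic class and lets the agent embeddings attend to each other before a multi-modal output head. All layer choices (LSTM encoders, multi-head attention, 6 modes over a 12-step horizon) are assumptions made for illustration, not the paper's actual LSTGHP architecture.

```python
import torch
import torch.nn as nn

class HeterogeneousTrajectoryEncoder(nn.Module):
    """Per-category temporal encoders plus a simple interaction step.

    Module names, sizes, and the attention-based interaction are illustrative
    assumptions; the paper's layered spatio-temporal graph may differ.
    """
    def __init__(self, categories=("vehicle", "bicycle", "pedestrian"),
                 in_dim=2, hidden=64):
        super().__init__()
        # One temporal encoder per semantic category of agent.
        self.encoders = nn.ModuleDict(
            {c: nn.LSTM(in_dim, hidden, batch_first=True) for c in categories})
        # Agents attend to each other (stand-in for the graph interaction layer).
        self.interact = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        # Multi-modal head: K candidate future trajectories per agent.
        self.K, self.horizon = 6, 12
        self.head = nn.Linear(hidden, self.K * self.horizon * 2)

    def forward(self, histories, categories):
        # histories: (num_agents, obs_len, 2); categories: one label per agent
        embeds = []
        for traj, cat in zip(histories, categories):
            _, (h, _) = self.encoders[cat](traj.unsqueeze(0))
            embeds.append(h[-1])                      # (1, hidden)
        x = torch.cat(embeds, dim=0).unsqueeze(0)     # (1, num_agents, hidden)
        x, _ = self.interact(x, x, x)                 # pairwise interaction
        out = self.head(x).view(-1, self.K, self.horizon, 2)
        return out                                    # (num_agents, K, horizon, 2)

# Toy usage: 3 agents of different categories observed for 8 steps.
model = HeterogeneousTrajectoryEncoder()
hist = torch.randn(3, 8, 2)
preds = model(hist, ["vehicle", "bicycle", "pedestrian"])
print(preds.shape)  # torch.Size([3, 6, 12, 2])
```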
Predicting the future trajectories of multiple pedestrians in a scene is critical for autonomous moving platforms (e.g., self-driving cars and social robots). In this paper, we propose a novel Generative Adversarial Network model with Transformers, which models the pedestrian distribution to capture the uncertainty of the predicted paths and generate more plausible future trajectories. Our method consists of a generator and a discriminator. The generator contains an encoder, a decoder, and a prediction module. Specifically, the encoder and the decoder use multi-head convolutional self-attention to learn the sequence of historical movement, and the prediction module incorporates a Mish feed-forward network to yield the predicted target. The discriminator takes both the predicted paths and the ground truth as input and classifies them as socially acceptable or not. Experimental results show that the proposed method consistently improves trajectory forecasting, and our framework surpasses several existing baselines on multiple datasets. Code is available at https://github.com/lzz970818/Trajectory-Prediction.
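To make the prediction module's Mish feed-forward network concrete, here is a minimal position-wise feed-forward block using PyTorch's built-in Mish activation; the hidden width and dropout rate are illustrative assumptions, since the abstract does not specify them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MishFeedForward(nn.Module):
    """Position-wise feed-forward block with the Mish activation.

    A minimal sketch: d_model, d_hidden, and dropout are assumed values,
    not taken from the paper.
    """
    def __init__(self, d_model=64, d_hidden=256, dropout=0.1):
        super().__init__()
        self.fc1 = nn.Linear(d_model, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        # Mish(x) = x * tanh(softplus(x)); smooth and non-monotone near zero.
        h = F.mish(self.fc1(x))
        return self.fc2(self.dropout(h))

# Toy usage: a batch of 2 sequences of 8 encoded trajectory tokens of width 64.
ffn = MishFeedForward()
tokens = torch.randn(2, 8, 64)
print(ffn(tokens).shape)  # torch.Size([2, 8, 64])
```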
As an important part of autonomous driving, lane detection algorithms require extremely low computational cost. Because of their heavy backbone networks, algorithms based on pixel-wise segmentation struggle with runtime consumption when recognizing lanes. In this paper, a novel and practical method based on a lightweight segmentation network is proposed to achieve accurate and efficient lane detection. Unlike traditional convolutional layers, the proposed Shadow module reduces the computational cost of the backbone network by applying linear transformations to intrinsic feature maps; on this basis, a lightweight backbone network, Shadow-VGG-16, is built. A tailored pyramid parsing module is then introduced to collect features from different sub-domains; it is composed of a strip pooling module based on the Pyramid Scene Parsing Network (PSPNet) and a convolutional attention module. Finally, a lane structural loss is proposed to explicitly model the lane structure and reduce the influence of noise, so that the predicted pixels fit the lanes better. Extensive experiments demonstrate that our method performs significantly better than state-of-the-art (SOTA) algorithms such as PointLaneNet and Line-CNN, achieving 95.28% and 90.06% accuracy on the CULane and TuSimple test datasets, respectively, with an inference speed of 62.5 frames per second (fps). Compared with the recent ERFNet, Line-CNN, and SAD, the F1 score increases by 3.51%, 2.84%, and 3.82%, respectively. Meanwhile, on our own dataset the method exceeds the best of the other methods by 8.6%, with an F1 score of 87.09, which demonstrates the superiority of our method.
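The Shadow module is described as replacing part of a convolution with cheap linear transformations of intrinsic feature maps. A minimal sketch of that idea, modeled on GhostNet-style channel expansion (the 2x expansion ratio and kernel sizes are assumptions, not values from the paper), could look as follows.

```python
import torch
import torch.nn as nn

class ShadowModule(nn.Module):
    """Cheap feature-map expansion in the spirit of the paper's Shadow module.

    Assumption: half of the output channels come from a standard convolution
    ("intrinsic" maps) and the other half from an inexpensive depthwise
    (per-channel linear) transformation of those maps.
    """
    def __init__(self, in_ch, out_ch, kernel=3, cheap_kernel=3):
        super().__init__()
        intrinsic = out_ch // 2
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, intrinsic, kernel, padding=kernel // 2, bias=False),
            nn.BatchNorm2d(intrinsic), nn.ReLU(inplace=True))
        # Depthwise conv = per-channel linear transform with far fewer FLOPs.
        self.cheap = nn.Sequential(
            nn.Conv2d(intrinsic, out_ch - intrinsic, cheap_kernel,
                      padding=cheap_kernel // 2, groups=intrinsic, bias=False),
            nn.BatchNorm2d(out_ch - intrinsic), nn.ReLU(inplace=True))

    def forward(self, x):
        intrinsic = self.primary(x)
        shadow = self.cheap(intrinsic)
        return torch.cat([intrinsic, shadow], dim=1)

# Toy usage: a drop-in replacement for a 64 -> 128 conv block in a VGG-style backbone.
block = ShadowModule(64, 128)
print(block(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 128, 56, 56])
```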
Because of changes in weather, lighting, season, and viewing angle, the visual appearance of objects varies, which makes it difficult for unmanned vehicles that rely on visual positioning to localize themselves. This paper proposes a coordinated positioning method that combines semantic information with a geometric relationships distribution (GRD), which improves the robustness of unmanned vehicle localization under the above conditions. First, we improve the FAST-SCNN semantic segmentation network and replace its fully connected layer with the conv4-3 module to prevent the spatial information of the image from being lost in the fully connected layer. Because the conv4-3 layer contains the richest semantic information, we use the image's semantic content to create a dense and salient scene description. These salient descriptions are learned from a large dataset of perceptual changes, and the method can accurately segment geometrically stable image regions. We combine the features of these highlighted regions with the existing holistic representation to produce a more robust scene descriptor. Second, we design a method that integrates semantic label matching with geometric distribution relations, yielding a new label-and-landmark map for loop-closure place recognition. The pairwise geometric relationship between ground landmarks is encoded as a continuous probability density function, the GRD function, expressed through a Laguerre-polynomial and Fourier-series basis expansion. This orthogonal basis representation allows efficient computation of rotation and translation invariants, which are used to compare signatures and search for potential loop-closure candidates. Finally, we evaluate our method against several state-of-the-art algorithms, such as OpenSeqSLAM, AlexNet, and VSO, to demonstrate its advantages. Experimental results on representative datasets show that the FAST-SCNN-based method is superior to the other methods.
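To illustrate how a pairwise geometric density expanded in Laguerre and Fourier bases can yield rotation- and translation-invariant signatures, here is a small NumPy/SciPy sketch. The normalization, radial scale, and truncation orders are assumptions, and the paper's exact GRD formulation may differ; the invariance argument (a global rotation only rotates the phase of each angular Fourier coefficient, while pairwise differences are already translation invariant) carries over.

```python
import numpy as np
from scipy.special import eval_laguerre

def grd_coefficients(landmarks, n_radial=6, n_angular=8, scale=10.0):
    """Encode pairwise landmark geometry as Laguerre-Fourier coefficients.

    A minimal sketch, assuming the density over pairwise (distance, angle) is
    expanded as c[n, k] = sum over ordered pairs of
    L_n(r/scale) * exp(-r/(2*scale)) * exp(-i*k*theta).
    """
    pts = np.asarray(landmarks, dtype=float)
    coeffs = np.zeros((n_radial, 2 * n_angular + 1), dtype=complex)
    for i in range(len(pts)):
        for j in range(len(pts)):
            if i == j:
                continue
            d = pts[j] - pts[i]                    # translation-invariant offset
            r, theta = np.hypot(*d), np.arctan2(d[1], d[0])
            radial = eval_laguerre(np.arange(n_radial), r / scale) * np.exp(-r / (2 * scale))
            angular = np.exp(-1j * np.arange(-n_angular, n_angular + 1) * theta)
            coeffs += np.outer(radial, angular)
    return coeffs

def rotation_invariant_signature(coeffs):
    # A global rotation by phi multiplies the k-th angular coefficient by
    # exp(-i*k*phi), so the magnitudes |c[n, k]| are rotation invariant.
    return np.abs(coeffs)

# Toy usage: the signature is unchanged when the landmark set is rotated.
rng = np.random.default_rng(0)
pts = rng.uniform(-20, 20, size=(12, 2))
phi = 0.7
R = np.array([[np.cos(phi), -np.sin(phi)], [np.sin(phi), np.cos(phi)]])
s1 = rotation_invariant_signature(grd_coefficients(pts))
s2 = rotation_invariant_signature(grd_coefficients(pts @ R.T))
print(np.allclose(s1, s2))  # True (up to numerical precision)
```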
In vehicle detection and tracking with roadside cameras, factors such as motion blur, vehicle occlusion, and target scale changes in the video can cause missed detections, false detections, and low positioning accuracy. This paper uses an improved YOLOv5s model as the detector and combines it with the classical Deep SORT tracking method to achieve end-to-end vehicle detection and tracking. By integrating an attention mechanism into the detection network and modifying the loss function, the model's feature extraction ability is strengthened, and the final vehicle detection accuracy reaches 96.7%. During tracking, the number of vehicle ID switches is reduced to 28 and the method runs at 32 Hz.
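A minimal detection-plus-tracking loop in the spirit of this pipeline is sketched below. It uses the stock yolov5s model from torch.hub and the third-party deep-sort-realtime package rather than the paper's attention-augmented detector and modified loss; the video path and the COCO class filter are hypothetical choices for illustration.

```python
import cv2
import torch
from deep_sort_realtime.deepsort_tracker import DeepSort  # pip install deep-sort-realtime

# Baseline detector and tracker; the paper's improved YOLOv5s is not reproduced here.
detector = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
tracker = DeepSort(max_age=30)

cap = cv2.VideoCapture("roadside_camera.mp4")  # hypothetical input video
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = detector(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    detections = []
    for x1, y1, x2, y2, conf, cls in results.xyxy[0].tolist():
        if int(cls) in (2, 5, 7):                 # COCO classes: car, bus, truck
            detections.append(([x1, y1, x2 - x1, y2 - y1], conf, int(cls)))
    # Associate detections across frames with appearance + motion cues.
    tracks = tracker.update_tracks(detections, frame=frame)
    for t in tracks:
        if t.is_confirmed():
            print(f"track_id={t.track_id} box={t.to_ltrb()}")
cap.release()
```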