2021
DOI: 10.48550/arxiv.2104.05485
Preprint

Predicting Pedestrian Crossing Intention with Feature Fusion and Spatio-Temporal Attention

Abstract: Predicting vulnerable road user behavior is an essential prerequisite for deploying Automated Driving Systems (ADS) in the real world. Pedestrian crossing intention should be recognized in real time, especially for urban driving. Recent works have shown the potential of using vision-based deep neural network models for this task. However, these models are not robust and certain issues still need to be resolved. First, the global spatio-temporal context that accounts for the interaction between the target pedes…

Citations: cited by 4 publications (6 citation statements)
References: 29 publications
“…For $j = 0$ (the bottom level of the stack), $x^{t}_{0} = c^{t}_{p}$ and for $j > 0$, $x^{t}_{j} = h^{t-1}_{j} + c^{t}_{p}$. Meanwhile, inspired by [3,13], we introduced the attention mechanism [18] into GRU to form At-GRU (attention-GRU). The attention module can selectively focus on some features, so as to better deal with key objects.…”
Section: Model Construction (mentioning)
confidence: 99%
“…Pedestrian crossing behavior is affected by multiple factors, including road vehicles, surrounding pedestrians, crossing intentions, and current movement speed. With the development of computer vision, image-based pedestrian behavior prediction has been widely studied [3]. Early studies mostly used single-frame picture as input into convolutional neural network (CNN) for prediction [4].…”
Source: Sensors 2022, 22, 1467
Section: Introduction (mentioning)
confidence: 99%
“…scenes, trajectories, poses and ego-vehicle speed, and the learning architecture used to infer a crossing prediction, e.g. RNN-based models [23], [24], [25], [1], [26] or Transformer-based models [27], [28].…”
Section: A. Pedestrian Crossing Prediction (mentioning)
confidence: 99%
“…In its first year of existence, proposed approaches evaluated on the benchmarks [1] constantly report higher classification scores [19], [16], [26], [29], [30], [31], giving the impression of clear improvements in pedestrian intention prediction. Usually, a new algorithm is proposed and the implicit hypothesis towards the proposed contribution is made such that it yields an improved performance over the existing state-of-the-art.…”
Section: B. Cross-dataset Evaluation (mentioning)
confidence: 99%
“…More recent works extend this anticipation time to different values with different observation lengths, in addition to developing more advanced models with multiple input features [2,3,4]. In [15], an evaluation benchmark is proposed to tackle this problem, also adopted in [16,17,18,19]. This benchmark focuses on predicting future intentions between 1.0 to 2.0 s earlier the event and uses overlapping windows of 0.5 s as motion history.…”
Section: Introduction (mentioning)
confidence: 99%
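
The evaluation protocol described in the last statement (overlapping 0.5 s observation windows that end 1.0 to 2.0 s before the crossing event) can be illustrated with a minimal sketch. The frame rate of 30 fps, the 5-frame stride, and the function name `observation_windows` are assumptions made for illustration only; the citing paper does not state them, and this is not the benchmark's actual code.

```python
def observation_windows(event_frame, fps=30, obs_seconds=0.5,
                        min_tte=1.0, max_tte=2.0, stride_frames=5):
    """Enumerate observation windows that precede a crossing event.

    Each window is a (start, end) frame range, end exclusive, whose
    last observed moment lies between max_tte and min_tte seconds
    before event_frame. All default values are illustrative assumptions.
    """
    obs_len = round(obs_seconds * fps)             # frames of motion history
    if not 0 < stride_frames < obs_len:
        raise ValueError("stride must be positive and smaller than the "
                         "window length for windows to overlap")
    earliest_end = event_frame - round(max_tte * fps)
    latest_end = event_frame - round(min_tte * fps)

    windows = []
    end = earliest_end
    while end <= latest_end:
        start = end - obs_len
        if start >= 0:                             # skip windows before the clip starts
            windows.append((start, end))
        end += stride_frames
    return windows


if __name__ == "__main__":
    # Hypothetical clip in which the pedestrian starts crossing at frame 90.
    for start, end in observation_windows(event_frame=90):
        print(f"observe frames [{start}, {end}), "
              f"time to event: {(90 - end) / 30:.2f} s")
```

Each returned window would be paired with the crossing or not-crossing label of the event it precedes; a smaller stride yields more, and more heavily overlapping, samples per event.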