2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
DOI: 10.1109/cvprw53098.2021.00477
Keyword-based Vehicle Retrieval

Cited by 11 publications (5 citation statements)
References 19 publications
“…These approaches are well‐suited for single animated objects. Time can be fed as a latent representation to the neural radiance field, e.g., by simple concatenation [LSZ*22], as 4D spatiotemporal positional encoding [PSJ*23] or by lifting time to higher dimensions using additional networks [YJM*23,FYW*22,PSJ*23]. Additionally, one can further reduce the dimensionality of the latent space using tensor decomposition, notably 2D‐2D [SZT*23] and 3D‐1D [IRG*23].…”
Section: Related Work
confidence: 99%
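The excerpt above mentions feeding time to a neural radiance field as an extra latent input. As a rough illustration only (none of this is from the cited papers; layer sizes, encoding frequencies, and names are assumptions), a NeRF-style MLP can be conditioned on time by concatenating a positionally encoded time value to the encoded sample position:

```python
# Illustrative sketch of time-conditioning a NeRF-style field by concatenation.
# All dimensions and frequencies below are assumptions for illustration.
import torch
import torch.nn as nn

def positional_encoding(x, num_freqs):
    # Standard sin/cos encoding applied independently to each input dimension.
    freqs = 2.0 ** torch.arange(num_freqs, dtype=x.dtype, device=x.device)
    angles = x[..., None] * freqs                       # (..., dims, num_freqs)
    enc = torch.cat([angles.sin(), angles.cos()], dim=-1)
    return enc.flatten(start_dim=-2)                    # (..., dims * 2 * num_freqs)

class TimeConditionedField(nn.Module):
    def __init__(self, pos_freqs=10, time_freqs=4, hidden=256):
        super().__init__()
        self.pos_freqs, self.time_freqs = pos_freqs, time_freqs
        in_dim = 3 * 2 * pos_freqs + 1 * 2 * time_freqs
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),                       # RGB + density
        )

    def forward(self, xyz, t):
        # xyz: (N, 3) sample positions; t: (N, 1) normalized time in [0, 1].
        feat = torch.cat([positional_encoding(xyz, self.pos_freqs),
                          positional_encoding(t, self.time_freqs)], dim=-1)
        return self.mlp(feat)
```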
“…We compare our OMG with previous state-of-the-art methods in Table 3. It is shown that our OMG achieves the highest MRR:

Team                           MRR
OMG (ours)                     0.3012
Alibaba-UTS-ZJU [1]            0.1869
SDU-XidianU-SDJZU [38]         0.1613
SUNYKorea [33]                 0.1594
Sun Asterisk [30]              0.1571
HCMUS [31]                     0.1560
TUE [37]                       0.1548
JHU-UMD [14]                   0.1364
Modulabs-Naver-KookminU [15]   0.1195
Unimore [36]                   0.1078…”
Section: Evaluation Results
confidence: 99%
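The leaderboard quoted above is reported in MRR (Mean Reciprocal Rank). A minimal sketch of how that metric is computed, with illustrative function and variable names rather than any team's released code:

```python
# Minimal sketch of Mean Reciprocal Rank (MRR): for each query, take the
# reciprocal of the 1-based rank at which the true track appears, then average.

def mean_reciprocal_rank(ranked_ids, ground_truth_ids):
    """ranked_ids: one ranked list of candidate track ids per query.
    ground_truth_ids: the correct track id for each query."""
    reciprocal_ranks = []
    for ranking, gt in zip(ranked_ids, ground_truth_ids):
        try:
            rank = ranking.index(gt) + 1           # 1-based rank of the true track
            reciprocal_ranks.append(1.0 / rank)
        except ValueError:
            reciprocal_ranks.append(0.0)           # true track not retrieved at all
    return sum(reciprocal_ranks) / len(reciprocal_ranks)

# Toy usage: true tracks ranked 1st and 4th -> MRR = (1 + 0.25) / 2 = 0.625
print(mean_reciprocal_rank([["a", "b"], ["c", "d", "e", "f"]], ["a", "f"]))
```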
“…Tien-Phat et al. [31] adapt COOT [8] to model the cross-modal relationships with both appearance and motion attributes. Eun-Ju et al. [33] propose to perform color and type classification for both target and front-rear vehicles, and conduct movement analysis based on the Kalman filter algorithm [13]. DUN [38] uses a pretrained CNN and GloVe [34] to extract modal-specific features and GRUs [3] to exploit temporal information.…”
Section: Text-based Vehicle Retrieval
confidence: 99%
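The excerpt above mentions movement analysis based on a Kalman filter. A minimal sketch of one such analysis, assuming a constant-velocity model over bounding-box centres; the matrices and noise values are illustrative, not parameters from the cited work:

```python
# Illustrative constant-velocity Kalman filter over bounding-box centres.
# State is [x, y, vx, vy]; only the (x, y) centre is observed each frame.
import numpy as np

def smooth_track(centers, dt=1.0, q=1e-2, r=1.0):
    """centers: (T, 2) array of (x, y) box centres; returns (T, 4) filtered states."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)      # state transition
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)      # observation: position only
    Q = q * np.eye(4)                               # process noise (assumed)
    R = r * np.eye(2)                               # measurement noise (assumed)

    x = np.array([centers[0, 0], centers[0, 1], 0.0, 0.0])
    P = np.eye(4)
    states = []
    for z in centers:
        # Predict.
        x = F @ x
        P = F @ P @ F.T + Q
        # Update with the observed centre.
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (z - H @ x)
        P = (np.eye(4) - K @ H) @ P
        states.append(x.copy())
    return np.stack(states)   # the velocity components give a smoothed motion estimate
```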
“…In the 5th NVIDIA AI City Challenge, the majority of teams [2], [16], [17], [18], [19], [20] chose to extract sentence embeddings of the queries, whereas two teams [21], [22] processed the NL queries using conventional NLP techniques. For cross-modality learning, certain teams [20], [2] used ReID models with the adoption of vision models pre-trained on visual ReID data and language models pre-trained on the given queries from the dataset.…”
Section: Related Work, A. Natural Language-based Vehicle-based Video Re...
confidence: 99%
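For the sentence-embedding approach described above, retrieval typically reduces to ranking candidate vehicle tracks by similarity to the encoded query. A minimal sketch under that assumption, with the text and track encoders abstracted away (the random vectors below merely stand in for encoder outputs):

```python
# Minimal sketch of cross-modal retrieval by cosine similarity in a shared
# embedding space. How the embeddings are produced (sentence encoder for the
# query, ReID-style backbone for vehicle crops) is out of scope here.
import numpy as np

def rank_tracks(query_embedding, track_embeddings):
    """query_embedding: (D,); track_embeddings: (N, D).
    Returns track indices sorted by descending cosine similarity."""
    q = query_embedding / np.linalg.norm(query_embedding)
    t = track_embeddings / np.linalg.norm(track_embeddings, axis=1, keepdims=True)
    similarities = t @ q
    return np.argsort(-similarities)

# Toy usage with random vectors standing in for encoder outputs.
rng = np.random.default_rng(0)
order = rank_tracks(rng.normal(size=256), rng.normal(size=(10, 256)))
print(order[:5])  # the 5 most similar candidate tracks
```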
“…The motion of vehicles is an integral component of the NL descriptions. Consequently, a number of teams [2], [18], [22] have developed specific methods for measuring and representing vehicle motion patterns.…”
Section: Related Work, A. Natural Language-based Vehicle-based Video Re...
confidence: 99%
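A simple, purely illustrative way to measure a motion pattern of the kind discussed above is to classify the signed heading change along a track; the threshold, labels, and sign convention below are assumptions, not any team's actual representation:

```python
# Illustrative sketch: reduce a track to a coarse motion keyword from the
# signed heading change between the start and end of the trajectory.
# With image coordinates (y pointing down) "left"/"right" may be flipped.
import numpy as np

def heading_deg(p_from, p_to):
    dx, dy = p_to[0] - p_from[0], p_to[1] - p_from[1]
    return np.degrees(np.arctan2(dy, dx))

def motion_keyword(centers, turn_threshold_deg=30.0):
    """centers: (T, 2) array of (x, y) box centres along the track."""
    change = heading_deg(centers[-2], centers[-1]) - heading_deg(centers[0], centers[1])
    change = (change + 180.0) % 360.0 - 180.0   # wrap to [-180, 180)
    if change > turn_threshold_deg:
        return "turn left"
    if change < -turn_threshold_deg:
        return "turn right"
    return "go straight"

# Toy usage: a track moving right, then curving upward in (x, y) coordinates.
track = np.array([[0, 0], [1, 0], [2, 0], [3, 1], [3, 2]], dtype=float)
print(motion_keyword(track))  # "turn left" under the stated sign convention
```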