2020
DOI: 10.1007/978-3-030-57675-2_37

cuDTW++: Ultra-Fast Dynamic Time Warping on CUDA-Enabled GPUs

Cited by 9 publications (7 citation statements)
References 22 publications
“…sDTW, on the other hand, is a data-reusing version of the approach, and our work exploits the fine-grain parallelism that computes the whole O(M) dimension in parallel, leaving O(M + N) computational time and O(M) space. Furthermore, there is prior work that accelerates DTW using nonvolatile memories [51] and using GPU acceleration [52, 53].…”
Section: Discussion
Confidence: 99%
“…For intra-sub-matrix communication, we exploit warp shuffles for efficient register-to-register transfers within the same warp. This is an idea demonstrated by Schmidt et al. 27 but not completely explored. Threads in a warp use warp shuffles to transfer the query sample, the minimum score of the segment, and the score of the last cell in the segment to the thread on its right.…”
Section: DTWax: Architecture (Figure 3: Efficient Intra- and Inter-Matrix ...)
Confidence: 93%
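The fragment below illustrates the shuffle pattern the quoted text describes: moving three per-segment registers one lane to the right without touching shared or global memory. It is not the DTWax implementation; the function name pass_segment_right and its parameter names are assumptions chosen to mirror the three values named in the quote.

```cuda
// Illustrative sketch (assumed names): each lane hands its query sample, the
// running minimum of its segment, and the score of its segment's last cell to
// the lane on its right, using register-to-register warp shuffles.
__device__ void pass_segment_right(float& query_sample,
                                   float& segment_min,
                                   float& last_cell_score)
{
    const unsigned FULL = 0xffffffffu;
    // __shfl_up_sync(mask, v, 1): lane i receives lane i-1's v, so every value
    // moves one lane toward higher lane IDs ("to the right"); lane 0 keeps its
    // own value and is typically overwritten by the caller's boundary handling.
    query_sample    = __shfl_up_sync(FULL, query_sample,    1);
    segment_min     = __shfl_up_sync(FULL, segment_min,     1);
    last_cell_score = __shfl_up_sync(FULL, last_cell_score, 1);
}
```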
“…DTWax can be reprogrammed to test for any target reference of interest. Unlike some of the prior works 4, 27, DTWax can be reprogrammed to test for longer target references. Further, one may easily try and scale DTWax across multiple GPUs for higher throughput on longer or multiple target references.…”
Section: Methods
Confidence: 99%