Deep Hashing for Motion Capture Data Retrieval

Lv, Na; Wang, Ying; Feng, Zhiquan; Peng, Jingliang

doi:10.1109/icassp39728.2021.9413505

Cited by 10 publications

(4 citation statements)

References 19 publications


“…They also discuss a few future research directions for the management of large and diverse motion capture skeleton data. Lv et al. [28] propose a hash-based convolution neural network where they extract deep features using the VGG16 network. They introduce the hash layer to create the hash code and, as a result, CNN is fine-tuned.…”

Section: Related Work (mentioning)

confidence: 99%

An Effective and Efficient Approach for 3D Recovery of Human Motion Capture Data

Yasin, Ghani, Krüger. 2023. Sensors.


In this work, we propose a novel data-driven approach to recover missing or corrupted motion capture data, either in the form of 3D skeleton joints or 3D marker trajectories. We construct a knowledge base of prior motion data, which makes it possible to infer the missing or corrupted information. We then build a kd-tree in parallel on the GPU for fast nearest-neighbor search and retrieval from the knowledge base. We use histograms to organize the data and an off-the-shelf radix sort algorithm to sort the keys within a single GPU processor. We query with the motion that has missing joints or markers and fetch a fixed number of nearest neighbors for the given input query motion. We employ an objective function with multiple error terms that recovers the 3D joints or marker trajectories in parallel on the GPU. We perform comprehensive experiments to evaluate our approach quantitatively and qualitatively on publicly available motion capture datasets, namely CMU and HDM05. The results show that the recovery of boxing, jumptwist, run, martial arts, salsa, and acrobatic motion sequences works best, while the recovery of kicking and jumping motion sequences results in slightly larger errors. Overall, our approach outperforms the competing state-of-the-art methods in most test cases across different action sequences and produces reliable results with minimal errors and without any user interaction.


“…It is common for deep hashing to be applied in data retrieval for its advantages of a solid learning ability and good portability [3]. Meanwhile, deep learning to hash methods [4][5][6][7][8][9][10][11] try to convert high-dimensional media data into compact binary code via a hash function, and the data structure information is stored in the Hamming space. Therefore, deep hashing methods garner attention in image retrieval.…”

Section: Introduction (mentioning)

confidence: 99%

Deep Hash with Improved Dual Attention for Image Retrieval

et al. 2021


Recently, deep learning to hash has been applied extensively to image retrieval because of its low storage cost and fast query speed. However, when existing hashing methods use a convolutional neural network (CNN) to extract image semantic features, the extracted features are insufficient and imbalanced: they do not include contextual information and lack relevance among features. Furthermore, relaxing the hash code leads to an inevitable quantization error. To solve these problems, this paper proposes deep hash with improved dual attention for image retrieval (DHIDA), which has three main components: (1) an improved dual attention mechanism (IDA), based on the pre-trained ResNet18 module and consisting of a position attention module and a channel attention module, extracts the feature information of the image; (2) when the spatial attention matrix and the channel attention matrix are calculated, the average and maximum values of the columns of the feature map matrix are integrated to strengthen the feature representation and fully leverage the features of each position; and (3) to reduce quantization error, a new piecewise function is designed to directly guide the discrete binary code. Experiments on CIFAR-10, NUS-WIDE and ImageNet-100 show that the DHIDA algorithm achieves better performance.
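The retrieval idea behind these hashing methods, mapping real-valued features to compact binary codes and comparing them in Hamming space, can be sketched as follows. This is a minimal illustration under stated assumptions: thresholding the activations at zero is a common simplification and not necessarily the binarization used by any of the cited papers, and the function names, activation values, and motion labels are hypothetical.

```python
def binarize(features):
    """Threshold real-valued hash-layer activations at zero to get a binary code."""
    return tuple(1 if x > 0 else 0 for x in features)

def hamming_distance(code_a, code_b):
    """Count the bit positions where two equal-length codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))

def rank_by_hamming(query_code, database):
    """Sort database keys by Hamming distance to the query (ties broken by key)."""
    return sorted(
        database,
        key=lambda name: (hamming_distance(query_code, database[name]), name),
    )

# Hypothetical activations standing in for the output of a learned hash layer.
database = {
    "walk": binarize([0.7, -0.2, 0.1, -0.9]),
    "run": binarize([0.4, -0.5, -0.3, -0.1]),
    "jump": binarize([-0.6, 0.8, -0.2, 0.5]),
}
query = binarize([0.9, -0.1, 0.3, -0.4])
print(rank_by_hamming(query, database))
```

Because the codes are short bit strings, the distance computation reduces to counting differing bits, which is what gives hashing-based retrieval its low storage cost and fast query speed.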


“…Current research mainly focuses on recognizing classes of presegmented actions [5,10,12], detecting actions in a stream [15,23], or searching for query-relevant subsequences within a long motion [2,22]. These tasks often employ query-by-example retrieval as the underlying operation; e.g., in the subsequence search task, a long motion is usually partitioned into a large number of short motion segments that need to be effectively and efficiently matched against a user query.…”

Section: Introduction (mentioning)

confidence: 99%

“…Many existing retrieval techniques [2,18,19,24] focus solely on search quality and do not discuss the efficiency at all, which leads to expensive sequential scan over the whole dataset. The efficiency-oriented works either propose very compact features that allow fast sequential scanning [12,13], or utilize various indexing schemes to organize the motion data (e.g., the binary tree [25], kd tree [9], R* tree [4], inverted file index [14], or tries [8]). To optimize the efficiency-effectiveness trade-off, a two-phase retrieval model is often used, where the candidate objects identified within an efficient search phase are submitted to a re-ranking phase that refines the result using more expensive techniques (e.g., traversal of a graph structure [9] or ranking by the Dynamic Time Warping [14,20]).…”

Section: Introduction (mentioning)

confidence: 99%

Efficient Indexing of 3D Human Motions

Budikova, Sedmidubský, Zezula. 2021. Proceedings of the 2021 International Conference on Multimedia Retrieval.


Figure 1: Efficient skeleton-data retrieval. In a pre-processing phase, skeleton sequences are transformed into motion documents, i.e., compact text-like representations composed of structured motion words (MWs). The motion documents are organized using a new indexing scheme that extends the traditional inverted files. During query processing, candidate documents are efficiently retrieved by a proposed approximate search algorithm and finally re-ranked using the DTW alignment.
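The two-phase retrieval model described in the citation statements and in this caption (cheap candidate selection followed by DTW re-ranking) can be sketched as below. This is a simplified illustration, not the indexing scheme or approximate search algorithm of the cited paper: motion words are assumed to be plain integer IDs, the index is a basic inverted file, and the DTW local cost is assumed to be 0 for equal words and 1 otherwise. All function names and data are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(documents):
    """Map each motion word to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, words in documents.items():
        for word in words:
            index[word].add(doc_id)
    return index

def candidates(index, query_words, min_shared=1):
    """Phase 1: keep documents sharing at least min_shared distinct words with the query."""
    counts = defaultdict(int)
    for word in set(query_words):
        for doc_id in index.get(word, ()):
            counts[doc_id] += 1
    return [doc_id for doc_id, count in counts.items() if count >= min_shared]

def dtw(a, b):
    """Dynamic Time Warping cost between two word sequences (0/1 local cost)."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = 0.0 if a[i - 1] == b[j - 1] else 1.0
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]

def two_phase_search(index, documents, query_words, k):
    """Phase 2: re-rank the phase-1 candidates by DTW cost and return the top k."""
    found = candidates(index, query_words)
    return sorted(found, key=lambda d: (dtw(query_words, documents[d]), d))[:k]

# Hypothetical motion documents: sequences of integer motion-word IDs.
documents = {"a": [1, 2, 3, 4], "b": [1, 2, 2, 3, 4], "c": [7, 8, 9], "d": [4, 3, 2, 1]}
index = build_inverted_index(documents)
print(two_phase_search(index, documents, [1, 2, 3, 4], k=2))
```

The design point is that the expensive DTW alignment runs only on the documents that survive the inverted-file lookup, so a document such as "c", which shares no motion word with the query, is never aligned at all.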
