Hand Tracking in 3D Space using MediaPipe and PnP Method for Intuitive Control of Virtual Globe

Chunduru, Vaishnav; Roy, Mrinalkanti; S, Dasari Romit N.; Chittawadigi, Rajeevlochana G.

doi:10.1109/r10-htc53172.2021.9641587

Cited by 25 publications

(9 citation statements)

References 11 publications

“…This framework excels at representation learning and applying it to object recognition and tracking applications. We employ MediaPipe [30] hand tracking to obtain estimated hand posture data. A construction designed specifically for complex perceptual channels that use rapid real-time inference.…”

Section: Microscopic Level: Feature Extraction Ability (mentioning)

confidence: 99%

Hand–object interaction recognition based on visual attention using multiscopic cyber-physical-social system

Besari, Saputra, Chin, et al. 2023

Int. J. Adv. Intell. Informatics

Computer vision-based cyber-physical-social systems (CPSS) are predicted to be the future of independent hand rehabilitation. However, there is a link between hand function and cognition in the elderly that this technology has not adequately supported. To investigate this issue, this paper proposes a multiscopic CPSS framework by developing hand–object interaction (HOI) based on visual attention. First, we use egocentric vision to extract features from hand posture at the microscopic level. With 94.87% testing accuracy, we use three layers of graph neural network (GNN) based on hand skeletal features to categorize 16 grasp postures. Second, we use a mesoscopic active perception ability to validate the HOI with eye tracking in the task-specific reach-to-grasp cycle. With 90.75% testing accuracy, the distance between the fingertips and the center of an object is used as input to a multi-layer gated recurrent unit based on recurrent neural network architecture. Third, we incorporate visual attention into the cognitive ability for classifying multiple objects at the macroscopic level. In two scenarios with four activities, we use GNN with three convolutional layers to categorize some objects. The outcome demonstrates that the system can successfully separate objects based on related activities. Further research and development are expected to support the CPSS application in independent rehabilitation.

“…Which utilized to identify and track the positions of 33 skeleton points of body joint landmarks from RGB inputs under the real-time proceeding speed. Recently, many researchers utilized this tool for their active research [8][9][10]. The pipeline initially locates the region-of-interest (ROI) inside of the frame using a detector.…”

Section: Introduction (mentioning)

confidence: 99%

Improvement of Human Pose Estimation and Processing With the Intensive Feature Consistency Network

et al. 2023

The modeling of human body key-points is the most significant aspect of pose estimation appropriately. Computer vision algorithm identifies human pose, body-movement, and action in many ways. Most of the previous works taken advantage for finding accuracy or efficiency in terms of speed. However, many techniques suffer for intensive computational demands with low-latency or higher proceeding speed. We have designed a unique approach for single-person pose estimation and action recognition which is well suited for fitness application and mobility activities. The proposed framework has been developed with a base network that provides an initial pose to further refinement through Intensive Feature Consistency (IFC) network. The IFC network enforces high-level constraints on the global body intensity correction and local body part adjustments. The proposed module reduces the impact of body joint movement diversity by interpreting long-term consistent view. We have illustrated the effectiveness of proposed framework through pose estimation accuracy improvement with two benchmark datasets. Which is specified state-of-the-art performance of IFC network under the required real-time processing speed on the CPU platform. The IFC network has improved 99.1% of PCK body and 94.7% of PCK torso accuracy under 31 FPS, which is comparatively higher than the existing work.

Index terms: Single person pose estimation, intensive feature consistency, global body intensity, local part adjustments, skeleton joint key-points.

“…In addition, gross motor tracking accuracy using RGB-depth cameras and Mediapipe has been validated with low errors for lower limb movements in running [28] and stationary cycling [29], as well as in hip, knee, shoulder and elbow joint movements [30], but poor correlation with ground truth data is reported for ankle joint movements [29]. Hand skill motor assessment using similar depth sensing setups and MediaPipe yielded optimal results with errors lower than 1 cm [20], but disturbances in hand trajectories were reported in Chunduru et al [31]. Fine upper-limb movements differ from these applications due to a large diversity in parameters relating to dexterity, speed, occlusions and overlaps that occur during movement and lower contrast patterns between individual features as compared to gross upper limb movements.…”

Section: Introduction (mentioning)

confidence: 99%

Quantifying similarities between MediaPipe and a known standard for tracking 2D hand trajectories

Wagh, Scott, Kraeutner 2023

Preprint

Marker-less motion tracking methods have promise for use in a range of domains, including clinical settings where traditional marker-based systems for human pose estimation are not feasible. MediaPipe is an artificial intelligence-based system that offers a markerless, lightweight approach to motion capture, and encompasses MediaPipe Hands, for recognition of hand landmarks. However, the accuracy of MediaPipe for tracking fine upper limb movements involving the hand has not been explored. Here we aimed to evaluate 2-dimensional accuracy of MediaPipe against a known standard. Participants (N = 10) performed trials in blocks of a touchscreen-based shape-tracing task. Each trial was simultaneously captured by a video camera. Trajectories for each trial were extracted from the touchscreen and compared to those predicted by MediaPipe. Specifically, following re-sampling, normalization, and Procrustes transformations, root mean squared error (RMSE; primary outcome measure) was calculated for coordinates generated by MediaPipe vs. the touchscreen computer. Resultant mean RMSE was 0.28 ± 0.064 normalized px. Equivalence testing revealed that accuracy differed between MediaPipe and the touchscreen, but that the true difference was between 0-0.30 normalized px (t(114) = -3.02, p = 0.002). Overall, we quantify similarities between MediaPipe and a known standard for tracking fine upper limb movements, informing applications of MediaPipe in domains such as clinical and research settings. Future work should address accuracy in 3-dimensions to further validate the use of MediaPipe in such domains.
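The comparison step in this abstract (normalization, Procrustes alignment, then RMSE) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `center_and_normalize` and `procrustes_rmse` are hypothetical, the two trajectories are assumed to be already re-sampled to the same number of points, reflections are not considered, and RMSE is taken here as the root mean squared point-to-point distance.

```python
import math


def center_and_normalize(points):
    """Shift 2D points to zero mean and scale them to unit overall norm.

    Each (x, y) pair is stored as a complex number x + iy so that a
    rotation plus uniform scaling becomes one complex multiplication.
    """
    zs = [complex(x, y) for x, y in points]
    mean = sum(zs) / len(zs)
    zs = [z - mean for z in zs]
    norm = math.sqrt(sum(abs(z) ** 2 for z in zs))
    if norm == 0:
        raise ValueError("trajectory is a single repeated point")
    return [z / norm for z in zs]


def procrustes_rmse(predicted, reference):
    """RMSE between two 2D trajectories after a Procrustes-style alignment.

    Assumes both trajectories have the same number of points, in
    corresponding order (i.e. re-sampling has already been done).
    """
    if len(predicted) != len(reference):
        raise ValueError("trajectories must have the same length")
    p = center_and_normalize(predicted)
    r = center_and_normalize(reference)
    # Least-squares optimal rotation and scale as one complex factor c,
    # minimizing sum |r_k - c * p_k|^2.
    c = sum(z.conjugate() * w for z, w in zip(p, r)) / sum(abs(z) ** 2 for z in p)
    squared = [abs(w - c * z) ** 2 for z, w in zip(p, r)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical data: a unit square and a copy that has been rotated by
# 90 degrees, scaled by 2, and shifted. The alignment removes all three.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
moved = [(5, -3), (5, -1), (3, -1), (3, -3)]
aligned_error = procrustes_rmse(moved, square)

# A genuinely different shape leaves a non-zero residual.
distorted = [(0, 0), (2, 0), (1, 1), (0, 1)]
shape_error = procrustes_rmse(distorted, square)
```

Because both trajectories are scaled to unit norm before alignment, the resulting error is expressed in normalized units rather than raw pixels, which is one plausible reading of the "normalized px" unit reported above.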
