“…Hand pose estimation is a long-standing problem, and several learning-based approaches have been introduced. These approaches generally involve predicting 3D keypoint locations [8,13,16,18,19,28,40,41,43,54,57,58,60,64,66,71], regressing MANO [51] parameters [1,2,4,25,26,68], or directly predicting the full dense surface of the hand [20,30,39,61]. Methods that directly predict 3D keypoints usually achieve better performance; however, they do not yield a dense surface, which is crucial for hand interaction.…”