Eye trackers are currently used to sense the positions of the pupil centers and the point-of-gaze (POG) on a screen, in keeping with the objective for which they were originally designed; however, it remains difficult to measure the position of a three-dimensional (3D) POG. This paper proposes a method for 3D gaze estimation that uses head movement, pupil position data, and POGs on a screen. The method assumes that a person, usually unintentionally, moves his or her head a short distance, so that multiple straight lines can be drawn from the center point between the two pupils to the on-screen POG. When the person keeps focusing on a given 3D POG while moving, these lines represent lines of sight that intersect at that 3D POG. The 3D POG can therefore be found from the intersection of several lines of sight formed by head movements. To evaluate the performance of the proposed method, experimental equipment was constructed, and experiments were performed with five male and five female participants, who looked at nine test points in a 3D space for approximately 20 s each. The experimental results reveal that the proposed method can measure 3D POGs with average distance errors of 13.36 cm, 7.58 cm, 5.72 cm, 3.97 cm, and 3.52 cm for head movement distances of 1 cm, 2 cm, 3 cm, 4 cm, and 5 cm, respectively.
INDEX TERMS 3D gaze estimation, gaze tracking, eye tracker.
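The intersection step can be illustrated with a minimal sketch. Measured lines of sight rarely meet exactly, so this example assumes a least-squares formulation: it returns the point that minimizes the summed squared distance to all lines. That formulation, the function names `estimate_3d_pog` and `solve3`, and the assumption that the pupil midpoints and on-screen POGs are already expressed in one common 3D coordinate frame (in cm) are illustrative choices, not details taken from the paper.

```python
from typing import List, Sequence


def solve3(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("lines are (nearly) parallel; no unique closest point")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):  # back substitution
        s = M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / M[r][r]
    return x


def estimate_3d_pog(pupil_midpoints: Sequence[Sequence[float]],
                    screen_pogs: Sequence[Sequence[float]]) -> List[float]:
    """Least-squares point closest to every line of sight.

    Line i passes through pupil_midpoints[i] and screen_pogs[i]
    (assumed: both given in the same 3D frame). With unit direction d_i
    and origin o_i, the minimizer p satisfies
        sum_i (I - d_i d_i^T) p = sum_i (I - d_i d_i^T) o_i.
    """
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for o, t in zip(pupil_midpoints, screen_pogs):
        d = [t[k] - o[k] for k in range(3)]
        norm = sum(c * c for c in d) ** 0.5
        if norm == 0.0:
            raise ValueError("pupil midpoint and screen POG coincide")
        d = [c / norm for c in d]
        for r in range(3):
            for c in range(3):
                m = (1.0 if r == c else 0.0) - d[r] * d[c]  # projector entry
                A[r][c] += m
                b[r] += m * o[c]
    return solve3(A, b)
```

At least two non-parallel lines are needed; a single line, or several parallel ones, raises `ValueError`. When the head moves only a short distance the lines are nearly parallel and the system becomes poorly conditioned, which may be one reason the reported error grows as the head movement distance shrinks.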
Because the Thai final consonant is distinctive compared with other languages and plays a key role in recognizing Thai syllables, the final consonant phoneme needs to be segmented from the vowel; doing so can reduce the number of recognition patterns and improve recognition accuracy. This paper presents a technique for separating the final consonant phoneme from a Thai syllable by exploiting vowel characteristics and the wavelet packet transform. In this method, the end of the vowel phoneme (the start of the final consonant) is located using a characteristic of the vowel: it has the highest energy in the syllable. The frequency range with this property is taken as the vowel and is then used to determine the filter for the vowel signal. The wavelet packet transform, which is suited to discriminating the vowel (high frequency and long period) from the final consonant phoneme (low frequency and short period), is used as the filter. The end of the vowel's frequency component is taken as the segmentation point of the final consonant. Experiments were performed on 4,350 syllable samples recorded from 15 male and 15 female speakers, and the method achieved an accuracy of 92.89%.
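The segmentation idea can be sketched roughly as follows. The example decomposes a syllable with a wavelet packet transform, treats the highest-energy band as the vowel band, and reports the sample where that band's energy collapses. The Haar filter, the three decomposition levels, the frame length, the 10% threshold, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
import math
from typing import List, Sequence, Tuple


def haar_step(x: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One Haar analysis step: (approximation, detail), each half the length.

    A trailing odd sample, if any, is dropped.
    """
    s = math.sqrt(2.0)
    n = len(x) // 2
    approx = [(x[2 * i] + x[2 * i + 1]) / s for i in range(n)]
    detail = [(x[2 * i] - x[2 * i + 1]) / s for i in range(n)]
    return approx, detail


def wavelet_packet(x: Sequence[float], levels: int) -> List[List[float]]:
    """Full Haar wavelet packet tree; returns the 2**levels leaf bands."""
    bands = [list(x)]
    for _ in range(levels):
        nxt: List[List[float]] = []
        for band in bands:
            approx, detail = haar_step(band)
            nxt.extend([approx, detail])  # split every band, not only the approximation
        bands = nxt
    return bands


def vowel_end_sample(x: Sequence[float], levels: int = 3,
                     frame: int = 8, ratio: float = 0.1) -> int:
    """Estimate the sample index where the vowel ends.

    Assumed rule: the vowel band is the leaf band with the highest total
    energy; the vowel ends at the first frame after the energy peak whose
    energy falls below ratio * peak.
    """
    bands = wavelet_packet(x, levels)
    energies = [sum(c * c for c in band) for band in bands]
    vowel_band = bands[energies.index(max(energies))]

    # Energy of the vowel band in non-overlapping frames of coefficients.
    frames = [sum(c * c for c in vowel_band[i:i + frame])
              for i in range(0, len(vowel_band) - frame + 1, frame)]
    if not frames:
        raise ValueError("signal too short for the chosen levels and frame")
    peak = max(frames)
    peak_idx = frames.index(peak)

    for j in range(peak_idx, len(frames)):
        if frames[j] < ratio * peak:
            # Each coefficient at depth `levels` spans 2**levels input samples.
            return j * frame * (2 ** levels)
    return len(x)  # no drop found: treat the whole syllable as vowel
```

The returned index has a resolution of `frame * 2**levels` samples, so smaller frames or fewer levels give a finer boundary at the cost of noisier energy estimates.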