Waiting is an inevitable part of man-machine voice interaction, and the feedback mechanism of the voice user interface (VUI) is a key factor in the waiting experience. In most available voice interfaces, the feedback time is either fixed or determined by hardware and software processing time; it is not deliberately designed and therefore does not offer users a good interaction experience. In this paper, the speech rate in user-machine voice interaction is collected through prototype experiments. Users' time perception under different VUI feedback time settings is then studied on the basis of time psychology theories, and users' emotional changes after a given feedback time are described by their distribution in the two-dimensional arousal-valence emotion space. Different VUI feedback times influence users' time perception and subjective emotions differently. The experimental results show that 750 ms is the optimal VUI feedback time, at which users' subjective feelings and psychological experience are best, and that 1,850 ms is the threshold waiting time for VUI feedback; beyond it, users' emotions fall to low levels of arousal and valence. Based on these results, a linear regression model is proposed to define the optimal VUI feedback time. The results of the VUI user experience study show that the calculated feedback time parameters give users a time perception in line with their expectations when interacting with voice interfaces.

INDEX TERMS Voice user interface, feedback time, time perception, speech rate.