Spontaneous speech is a promising channel for more efficient communication between humans and robots. One of the main obstacles is the accuracy of the automatic speech recognition (ASR) component of human-robot interactive systems. The standard ASR approach is based on statistical methods applied to the phoneme domain. However, some problems cannot be solved with the rule-based approaches used so far, so alternative strategies may be needed. This paper investigates how a robot's perceptive abilities can be used to increase the robustness of the ASR component. The robot's evaluative abilities are used to incrementally build knowledge that is then exploited during the recognition phase. The paper also examines the use of time-warping algorithms to improve speech recognition performance, and in particular discusses the accuracy and efficiency of this approach when it is applied to whole-sentence speech signals.
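To make the time-warping idea concrete, the following is a minimal sketch of classic dynamic time warping (DTW) used for whole-sequence template matching. It is an illustration under stated assumptions, not the method of this paper: the abstract does not say which time-warping variant is used, the names `dtw_distance` and `nearest_template` are hypothetical, and each frame is reduced to a single number, whereas a real system would compare per-frame feature vectors.

```python
from math import inf


def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Cumulative cost of the best monotonic alignment of sequences a and b.

    Assumption: frames are scalars compared by absolute difference;
    pass another `dist` to compare feature vectors instead.
    """
    n, m = len(a), len(b)
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats its frame
                cost[i][j - 1],      # b advances, a repeats its frame
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


def nearest_template(query, templates):
    """Return the label of the stored template closest to query under DTW.

    `templates` is a hypothetical dict mapping a label to a reference sequence.
    """
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))
```

The full cost table takes time and memory proportional to the product of the two sequence lengths. This is why efficiency becomes a concern for whole-sentence signals, which are much longer than isolated words.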