Segmentation of audio data such as human speech (splitting each word into a separate .WAV audio file) has been a major concern when working with multimedia such as radio or TV recordings. Work on locating the boundaries of spoken words has focused mainly on energy and zero-crossing thresholds for endpoint detection. Errors in endpoint detection remain a main cause of low accuracy in segmentation systems. The goal of this research is to develop an efficient algorithm that segments human speech in both English and Arabic, at different speaking speeds, with high accuracy. Simulation results show that the developed algorithm achieved high accuracy when segmenting human speech: up to 91.6% on average for English and 89.0% for Arabic.
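
For orientation, the conventional energy and zero-crossing approach to endpoint detection mentioned above can be sketched as follows. This is a minimal illustration of that general baseline, not the algorithm developed in this research. The function names (`frame_signal`, `detect_endpoints`), the frame sizes, and all threshold values are assumptions chosen for the example.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (trailing samples are dropped)."""
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def detect_endpoints(x, sr, frame_ms=25, hop_ms=10,
                     energy_ratio=0.1, zcr_thresh=0.25, floor_ratio=0.02):
    """Return (start_s, end_s) pairs for regions judged to contain speech.

    All thresholds are illustrative assumptions, expressed relative to the
    loudest frame; real recordings usually need tuning or a noise estimate.
    """
    x = np.asarray(x, dtype=float)
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    frames = frame_signal(x, frame_len, hop)

    # Short-time energy: mean squared amplitude of each frame.
    energy = np.mean(frames ** 2, axis=1)

    # Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
    signs = np.sign(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

    # A frame is active if it is loud, or if it is noisy-looking (high ZCR,
    # as in unvoiced consonants) and still above a small energy floor.
    peak = energy.max()
    active = (energy > energy_ratio * peak) | (
        (zcr > zcr_thresh) & (energy > floor_ratio * peak)
    )

    # Merge runs of consecutive active frames into time segments.
    segments, start = [], None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i
        elif not is_active and start is not None:
            segments.append((start * hop / sr, ((i - 1) * hop + frame_len) / sr))
            start = None
    if start is not None:
        segments.append((start * hop / sr, ((len(active) - 1) * hop + frame_len) / sr))
    return segments
```

Each returned pair could then be used to slice the signal and write one file per segment. Because the thresholds are fixed fractions of the peak frame energy, a sketch like this is sensitive to background noise and speaking speed, which is the kind of endpoint error the paragraph above describes.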