Packet communication systems cannot, in general, guarantee accurate and prompt delivery of every packet. The effect of network congestion and transmission impairments on data packets is extended delay; in voice communications these problems lead to lost packets. This paper describes techniques for replacing missing speech with waveform segments from correctly received packets in order to increase the maximum tolerable missing packet rate. After presenting a simple formula for predicting the probability of waveform substitution failure as a function of packet duration and packet loss rate, we introduce two techniques for selecting substitution waveforms. One method is based on pattern matching and the other technique explicitly estimates voicing and pitch. Both approaches achieve substantial improvements in speech quality relative to silence substitution.
INTRODUCTION
Packet speech communication is a new technique that may play an important role in the evolution of combined voice and data services. Although the advantages of communicating computer data in packets are well documented [1], the editors of a recent collection of papers report that "the jury is still out" on the merits of packet speech [2]. In contrast to packet data transmission, where delays build up as traffic increases, speech communication requires prompt packet delivery. Beyond some time limit, delayed speech packets are useless at the receiving terminal and are discarded by the system. Packet loss, therefore, has a major effect on speech quality, and the consequent constraints on packet dropping rates strongly affect system costs. In formal listening tests conducted to assess the effects of missing packets on speech quality [3,4], silent gaps replaced the missing speech packets, and it was determined that packet loss rates up to about one percent were tolerable. There are reports of other techniques for dealing with lost packets, such as repeating previous packets or, if the speech has been processed by a vocoder, synthesizing new speech from previously received analysis data [5]. Another approach is to construct speech packets at the transmitter in a manner that facilitates the recovery of lost packets [6].