Summary

For several years, alternative speech communication techniques have been examined which are based solely on the signals of the articulatory muscles instead of the acoustic speech signal. Since these approaches also work with completely silently articulated speech, several advantages arise: the signal is not corrupted by background noise, bystanders are not disturbed, and people who have lost their voice, e.g. due to an accident or a disease of the larynx, can be assisted.

The general objective of this work is the design, implementation, improvement, and evaluation of a system that uses surface electromyographic (EMG) signals and directly synthesizes an audible speech output: EMG-to-speech. The electrical potentials of the articulatory muscles are recorded by small electrodes on the surface of the face and neck. An analysis of these signals allows inferences about the movements of the articulatory apparatus and, in turn, about the spoken speech itself.

One approach for creating an acoustic signal from the EMG signal is to use techniques from automatic speech recognition: a textual output is produced, which is then further processed by a text-to-speech synthesis component. However, this approach suffers from the challenges of the speech recognition step, such as the restriction to a given vocabulary and recognition errors of the system. This thesis investigates the possibility of converting the recorded EMG signal directly into a speech signal, without being bound to a limited vocabulary or other limitations of a speech recognition component. Different approaches for the conversion are pursued; real-time-capable systems are implemented, evaluated, and compared.

For training a statistical transformation model, the EMG signals and the acoustic speech are captured simultaneously, and relevant characteristics are extracted from both as features.
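The feature extraction step can be sketched as a windowing operation over the recorded EMG signal, with simple time-domain features computed per frame. This is only a minimal illustration: the sampling rate, window and shift sizes, and the feature set (frame mean and frame power) are assumptions for the sketch, not the exact configuration used in this work.

```python
import numpy as np

def frame_features(signal, fs=600, win_ms=27, shift_ms=10):
    """Slice a single-channel signal into overlapping frames and compute
    simple time-domain features (frame mean, frame power) per frame.
    All numeric settings here are illustrative assumptions."""
    win = int(fs * win_ms / 1000)      # samples per analysis window
    shift = int(fs * shift_ms / 1000)  # samples between frame starts
    feats = []
    for start in range(0, len(signal) - win + 1, shift):
        frame = signal[start:start + win]
        feats.append([frame.mean(), np.mean(frame ** 2)])  # mean, power
    return np.array(feats)             # shape: (num_frames, num_features)

emg = np.random.randn(600)             # one second of toy "EMG" at 600 Hz
features = frame_features(emg)
```

In an actual EMG-to-speech training setup, such frames would be computed on multiple electrode channels and time-aligned with acoustic features extracted from the simultaneously recorded speech.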
The acoustic speech data is only required as a reference for the training, so the actual application of the transformation can take place using solely the EMG data. The feature mapping is accomplished by a model that estimates the relationship between muscle activity patterns and speech sound components. From these speech components, the final audible voice signal is synthesized. This approach is based on a source-filter model of speech: the fundamental frequency is combined with the spectral information (Mel cepstrum), which reflects the vocal tract, to generate the final speech signal.

To ensure a natural voice output, the use of the fundamental frequency for prosody generation is of great importance. To bridge the gap between normal speech (with a fundamental frequency) and silent speech (no speech signal at all), whispered speech recordings are investigated as an intermediate step. In whispered speech, no fundamental frequency exists; accordingly, the generation of prosody is possible, but difficult.

This thesis examines and evaluates the following three mapping methods for feature conversion:

1. Gaussian Mapping: A statistical method that trai...