This paper proposes a new method for lip animation of a personalized facial model driven by auditory speech. The method is based on Bayesian estimation and person-specific facial appearance models (PSFAM). Initially, a video of a speaking person is recorded, from which the visual and acoustic features of the speaker and their relationship are learnt. First, the visual information of the speaker is stored in a color PSFAM by means of a registration algorithm. Second, the auditory features are extracted from the audio track of the recorded video sequence. Third, the relationship between the learnt PSFAM and the auditory features of the speaker is captured by Bayesian estimators. Finally, subjective perceptual tests are reported to measure the intelligibility of preliminary results obtained by synthesizing isolated words.
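To make the audio-to-visual mapping concrete, the sketch below shows one common form of Bayesian estimator for this kind of task: a joint-Gaussian model whose posterior mean maps an acoustic feature vector to PSFAM appearance parameters. The feature dimensions, variable names, and synthetic training data are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch (assumptions only): a joint-Gaussian Bayesian estimator that
# maps acoustic feature vectors to PSFAM appearance parameters per frame.
import numpy as np

rng = np.random.default_rng(0)

# --- Stand-ins for features learnt from the recorded video sequence ---
# audio: e.g. cepstral coefficients per frame; visual: PSFAM parameters per frame.
n_frames, n_audio, n_visual = 500, 12, 8
A = rng.normal(size=(n_frames, n_audio))                   # acoustic features
W = rng.normal(size=(n_audio, n_visual))                   # hidden audio->visual coupling
V = A @ W + 0.1 * rng.normal(size=(n_frames, n_visual))    # visual (PSFAM) parameters

# --- Fit the joint Gaussian model over (audio, visual) features ---
mu_a, mu_v = A.mean(axis=0), V.mean(axis=0)
Ac, Vc = A - mu_a, V - mu_v
C_aa = (Ac.T @ Ac) / n_frames + 1e-6 * np.eye(n_audio)     # regularised audio covariance
C_va = (Vc.T @ Ac) / n_frames                              # visual-audio cross-covariance

def estimate_visual(audio_frame: np.ndarray) -> np.ndarray:
    """Posterior-mean (MMSE) estimate of PSFAM parameters given one audio frame."""
    return mu_v + C_va @ np.linalg.solve(C_aa, audio_frame - mu_a)

# --- Usage: drive the appearance model with a new acoustic frame ---
new_audio = rng.normal(size=n_audio)
print(estimate_visual(new_audio))
```

The posterior mean here is E[v | a] = mu_v + C_va C_aa^{-1} (a - mu_a); each estimated parameter vector would then be rendered through the PSFAM to produce an animated mouth frame.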