Effective spoken communication with a mobile robot is a difficult problem even when the auditory scene can be controlled. Robot ego-noise, echoes, and human interference are all common sources of decreased intelligibility. In real-world environments, these problems are compounded by many different types of background noise. For instance, military scenarios might be punctuated by high-decibel plane noise and bursts from weaponry that mask parts of the robot's speech output. Even in non-military settings, fans, computers, alarms, and transportation noise can cause enough interference to render a traditional speech interface unintelligible. In this work, we seek to overcome these problems by applying the robotic advantages of sensing and mobility to a text-to-speech interface. Using perspective-taking skills to predict how the human user is affected by new sound sources, a robot can adjust its speaking patterns and/or reposition itself within the environment to limit the negative impact on intelligibility, making the speech interface easier to use.