Mobile phones are equipped with an increasingly large number of precise and sophisticated sensors, which raises the risk of direct and indirect privacy breaches. In this paper, we investigate the feasibility of keystroke inference when user taps on a soft keyboard are captured by the stereo microphones of an Android smartphone. We developed algorithms for sensor-signal processing and domain-specific machine learning to infer key taps using a combination of stereo microphones and gyroscopes. We implemented our system and evaluated its performance on two popular mobile phones (Samsung S2 and HTC One) and a tablet (Samsung Tab 8). Based on our experiments, and to the best of our knowledge, our system (1) is the first to exceed 90% accuracy in a single attempt, (2) operates on the standard Android QWERTY and number keyboards, and (3) is language agnostic. We show that stereo microphones are a much more effective side channel than the gyroscope; however, data from the two can be combined to boost prediction accuracy. While previous studies focused on larger key sizes and repeated attempts, we show that by focusing on the specifics of the keyboard, building machine learning models and algorithms based on keyboard areas, and applying adequate filtering, we can achieve an accuracy of 90% to 94% for much smaller key sizes in a single attempt. We also demonstrate how a malicious application can exploit such attacks to log keystrokes entered in other, sensitive applications. Finally, we describe some techniques to mitigate these attacks.
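To give some intuition for why two microphones can carry information about where a tap occurred, the following is a minimal sketch, not the system described in this paper. It assumes, purely for illustration, that one useful cue is the difference in arrival time of the tap sound at the two channels, estimated here by brute-force cross-correlation. The function names `estimate_lag` and `synthetic_tap`, the decaying-sinusoid tap model, and all parameter values are hypothetical.

```python
import math


def estimate_lag(left, right, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive result means the sound reaches `right` later than `left`.
    Illustrative assumption: this is not the feature set used in the paper.
    """
    best_lag, best_score = 0, -math.inf
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def synthetic_tap(n, onset, freq=0.2, decay=0.05):
    """Hypothetical tap model: a decaying sinusoid starting at sample `onset`."""
    samples = []
    for i in range(n):
        if i < onset:
            samples.append(0.0)
        else:
            t = i - onset
            samples.append(math.exp(-decay * t) * math.sin(2 * math.pi * freq * t))
    return samples


# Simulate a tap that reaches the right channel 3 samples after the left one.
left = synthetic_tap(256, onset=40)
right = synthetic_tap(256, onset=43)
lag = estimate_lag(left, right, max_lag=10)
print("estimated inter-channel lag (samples):", lag)
```

In a real attack, a delay estimate like this would be only one of several possible inputs to a classifier over keyboard areas. On an actual device, the microphone spacing and sampling rate bound the range of lags that can be observed.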