We propose TapSnoop, a novel tapstroke inference attack that precisely recovers what a user types on touchscreen devices. Inferring tapstrokes is challenging owing to 1) low tapstroke intensity and 2) dynamically changing noise. We address these challenges by identifying the unique characteristics of tapstrokes in audio recordings, which TapSnoop exploits as a side channel. In particular, we develop tapstroke detection and localization algorithms that collectively leverage audio features obtained from multiple microphones; these features are designed to reflect the core properties of tapstrokes. Furthermore, we improve TapSnoop's robustness against environmental changes by developing environment-adaptive classification and noise subtraction algorithms. Extensive experiments with ten real-world users show that, in stable environments, TapSnoop achieves inference accuracies of 85.4% on a number keyboard and 75.6% on a QWERTY keyboard (96.2% and 90.8%, respectively, in best-case scenarios). TapSnoop also achieves reasonable accuracy under varying noise. For example, on a numeric keyboard it achieves inference accuracies of 84.8% and 72.7% when the noise level varies from 37.9 to 51.2 dBA and from 46.7 to 60.0 dBA, respectively.

INDEX TERMS Acoustic signal processing, acoustic sensors, mobile computing, privacy, side-channel attack, tapstroke inference.