High-quality, speaker-location-aware audio capture has traditionally been realized with dedicated microphone arrays, but their high cost and lack of portability prevent such systems from being widely adopted. Today's smartphones are more convenient for audio recording, but their audio quality is much lower in noisy environments and the speaker's location cannot be readily obtained. In this paper, we design and implement Dia, which leverages smartphone cooperation to overcome these limitations. Dia supports spontaneous setup by allowing a group of users to rapidly assemble an array of smartphones that emulates a dedicated microphone array. It employs a novel framework to accurately synchronize the audio I/O clocks of the smartphones. The synchronized smartphone array further enables autodirective audio capture, i.e., tracking the speaker's location and beamforming toward the speaker to improve audio quality. We implement Dia on a testbed consisting of 8 Android phones. Our experiments demonstrate that Dia can synchronize the microphones of different smartphones with sample-level accuracy. It achieves high localization accuracy and beamforming performance similar to that of a microphone array with perfect synchronization.
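
To illustrate why sample-level synchronization matters for beamforming toward a speaker, the following is a minimal Python sketch of textbook delay-and-sum beamforming. It is not Dia's implementation. The function names `estimate_lag` and `delay_and_sum`, the brute-force cross-correlation, and the integer-sample lag model are all assumptions made for illustration, and the sketch presumes the channels already share a common sample clock.

```python
from typing import List, Sequence


def estimate_lag(ref: Sequence[float], sig: Sequence[float], max_lag: int) -> int:
    """Return the lag (in samples) by which `sig` trails `ref`.

    Hypothetical helper: brute-force cross-correlation over the
    integer lags in [-max_lag, max_lag].
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, r in enumerate(ref):
            m = n + lag
            if 0 <= m < len(sig):
                score += r * sig[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_and_sum(channels: List[Sequence[float]], max_lag: int) -> List[float]:
    """Align every channel to channel 0 and average the aligned samples.

    Assumes all channels are already sample-synchronized, so the
    remaining lag between channels reflects only acoustic propagation.
    """
    ref = channels[0]
    # Channel 0 is the reference, so its lag is zero by definition.
    lags = [0] + [estimate_lag(ref, ch, max_lag) for ch in channels[1:]]
    out = []
    for n in range(len(ref)):
        acc, count = 0.0, 0
        for ch, lag in zip(channels, lags):
            m = n + lag
            if 0 <= m < len(ch):
                acc += ch[m]
                count += 1
        out.append(acc / count)
    return out
```

In this sketch, the speaker's signal adds coherently across the aligned channels while uncorrelated noise averages down. Any residual clock offset between phones would appear as an extra, unknown lag on top of the acoustic delay, which is why the channels need to be synchronized before the lags can be interpreted as speaker direction.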