Detecting targets with drone-mounted imaging equipment is a useful technique that is applied in many areas. In contrast to image-based detection, this study focuses on acoustic signal detection, in which a drone locates targets at the positions where sounds occur. We implement a system in which a drone detects acoustic sources above the ground by applying a phase-difference microphone array technique, with source localization based on beamforming. The background noise and self-induced noise generated during flight reduce the signal-to-noise ratio of the acoustic signals of interest, making their characteristics difficult to analyze. Furthermore, the strongly correlated noise generated by the rotating propellers degrades the direction-of-arrival estimation performance of the beamforming method. Spectral reduction methods tuned to specific frequencies have been effective in reducing noise in this acoustically very harsh setting, in which the drone is always exposed to its own noise. Because the beamforming method estimates the direction of arrival in the drone’s body-frame coordinate system, we fuse this estimate with the flight information output by the drone’s flight navigation system to estimate the locations of acoustic sources above the ground. The proposed method is experimentally validated using a drone equipped with a 32-channel time-synchronized MEMS microphone array. Verification of the sound source location detection method was limited to the explosion sounds generated by fireworks. We confirm that the acoustic source location can be detected with an error of approximately 10 degrees in azimuth and elevation at a ground distance of about 150 m between the drone and the explosion location.
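The conversion of a body-frame direction of arrival into a ground-referenced direction can be illustrated with a minimal sketch. It is not the implementation used in this study. It assumes a body frame with x forward, y right, and z down, a north-east-down earth frame, and roll, pitch, and yaw angles applied in the aerospace Z-Y-X order. The function names and the sign convention for elevation (positive above the horizontal plane) are hypothetical choices made for illustration only.

```python
import math


def doa_to_unit_vector(azimuth_deg, elevation_deg):
    """Unit vector for a DOA in an (x forward, y right, z down) frame.

    Assumed convention: azimuth is measured from x toward y, and
    elevation is positive above the horizontal plane (toward -z).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (
        math.cos(el) * math.cos(az),
        math.cos(el) * math.sin(az),
        -math.sin(el),
    )


def body_to_earth_matrix(roll_deg, pitch_deg, yaw_deg):
    """Rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll), body to earth."""
    roll = math.radians(roll_deg)
    pitch = math.radians(pitch_deg)
    yaw = math.radians(yaw_deg)
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def body_doa_to_earth(azimuth_deg, elevation_deg, roll_deg, pitch_deg, yaw_deg):
    """Rotate a body-frame DOA into the earth frame.

    Returns (azimuth_deg, elevation_deg) under the same angle conventions.
    """
    u = doa_to_unit_vector(azimuth_deg, elevation_deg)
    rot = body_to_earth_matrix(roll_deg, pitch_deg, yaw_deg)
    v = [sum(rot[i][j] * u[j] for j in range(3)) for i in range(3)]
    earth_az = math.degrees(math.atan2(v[1], v[0]))
    # Clamp before asin to guard against rounding slightly outside [-1, 1].
    earth_el = math.degrees(math.asin(max(-1.0, min(1.0, -v[2]))))
    return earth_az, earth_el
```

Under these assumptions, a source seen straight ahead of the drone (azimuth 0, elevation 0 in the body frame) while the drone is yawed 90 degrees maps to an earth-frame azimuth of 90 degrees. Turning the resulting direction into a ground position would additionally require the drone's position and altitude, which this sketch does not cover.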