We explore direct interaction between humans and multi-rotor robots and its applications in entertainment. In this paper, we present a system that realises direct, multimodal interaction using onboard cameras and a microphone. The onboard sensors detect human actions, and the robots' reactions chain together and expand one after another. Because all processing is executed on the onboard computer, no external devices are needed. We describe the interaction scenario from take-off to landing and present a pilot evaluation of our system.