Two novel gesture-based Human-UAV Interaction (HUI) systems are proposed to launch and control a UAV in real time using a monocular camera and a ground computer. The first proposal is an end-to-end static Gesture-Based Interaction (GBI) system that classifies the interacting user's poses directly, discarding the gesture-interpretation component to boost the system's performance up to 99% at a speed of 28 fps. The second proposal is a dynamic GBI system that adopts a simple model to detect three parts of the interacting person (the face and both hands) and tracks them over a number of frames until a specific dynamic gesture is recognized. The proposed dynamic method is efficient, reduces complexity, and speeds up the interaction to 27 fps compared with recent multi-model approaches. Its backbone is a simplified Tiny-You Only Look Once (YOLO) network that saves resources and speeds up the detection process to 120 fps. Moreover, a comprehensive new gesture dataset was established to facilitate the learning process and support further research. A comparative study is carried out to show the performance and efficiency of the proposed dynamic HUI system, in terms of detection accuracy and speed, against the baseline detector on a public human-gesture dataset. Finally, a non-expert volunteer evaluates the proposed HUIs by launching and driving a Bebop 2 micro UAV through a set of real flights.

INDEX TERMS Gesture dataset, gesture recognition, human-machine interaction, object tracking.
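The dynamic pipeline summarized above (detect the face and both hands in each frame, accumulate the detections over a short window, then recognize the dynamic gesture and issue the corresponding UAV command) can be sketched as follows. This is a minimal illustrative sketch only: the names detect_parts and classify_trajectory, the window length, and the command labels are assumptions standing in for the paper's simplified Tiny-YOLO detector and gesture recognizer, not the authors' implementation.

from collections import deque
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]    # (x, y, w, h) of one detected part

WINDOW = 30                                # assumed length of the tracked frame window


def detect_parts(frame) -> List[Box]:
    # Placeholder for the simplified Tiny-YOLO detector; the real system returns
    # bounding boxes for the face and the two hands of the interacting person.
    return []


def classify_trajectory(track: List[List[Box]]) -> Optional[str]:
    # Placeholder for the dynamic-gesture recognizer; the real system maps the
    # tracked part positions to a UAV command (e.g., take off or land).
    return None


def gesture_loop(frames) -> None:
    # Detect the three parts in every frame, keep the last WINDOW detections,
    # and emit a command once a dynamic gesture is recognized.
    history: deque = deque(maxlen=WINDOW)
    for frame in frames:
        history.append(detect_parts(frame))
        if len(history) == WINDOW:
            command = classify_trajectory(list(history))
            if command is not None:
                print(f"send UAV command: {command}")
                history.clear()            # restart tracking after a gesture

In this sketch the deque acts as the sliding window of tracked detections, so recognition cost stays constant per frame regardless of how long the interaction runs.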