Mobile or handheld augmented reality enriches a smartphone's live video stream with superimposed graphics. In such scenarios, tracking the user's fingers in front of the camera and interpreting their traces as gestures offers interesting perspectives for interaction. Yet the lack of haptic feedback poses challenges that must be overcome. We present a pilot study in which three types of feedback (audio, visual, haptic) and combinations thereof are used to support basic finger-based gestures (grab, release). A comparative study with 26 subjects shows an advantage of providing combined, multimodal feedback. In addition, it suggests high potential for haptic feedback via phone vibration, which is surprising given that the phone is held in the other, non-interacting hand.