Gesture input using the acceleration sensor of a smartphone is a promising new input method. The target gestures in this paper are movements of the user's hand while holding a smartphone. Such input faces a trade-off, however: tuning parameters to improve the recognition accuracy of input gestures while the user is stationary increases false detection at the start of walking, whereas tuning them to reduce false detection at the start of walking lowers the recognition accuracy while stationary. In this paper, we propose a gesture recognition method that reduces erroneous recognition by combining two methods: a gesture detection method that uses similarity based on dynamic time warping (DTW), denoted TD, and a gesture classification method that also includes walking data as a candidate, denoted CD. We conducted evaluation experiments with nine subjects and confirmed that the proposed method eliminates false detection at the start of walking. A t-test further confirmed that the F1-score of the proposed method was significantly higher than that of CD.
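To make the combination of a DTW-based detection step and a classification step with walking as a candidate more concrete, the following is a minimal sketch of one plausible reading of the idea. It is not the implementation evaluated in the paper. The function names `dtw_distance` and `recognize`, the `"walking"` label, the one-dimensional signals, the template values, and the threshold are all hypothetical choices made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two 1-D sequences.

    Uses absolute difference as the local cost and no warping-window
    constraint (both are assumptions of this sketch).
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def nearest(window, sequences):
    """Smallest DTW distance from the window to any sequence in the list."""
    return min(dtw_distance(window, s) for s in sequences)


def recognize(window, templates, detect_threshold):
    """Two-stage recognition sketch.

    templates: dict mapping a label to a list of template sequences; it is
    assumed to contain a hypothetical "walking" label next to the gestures.
    Returns a gesture label, or None when no gesture is recognized.
    """
    # Stage 1 (detection, TD-like): the window must be similar enough
    # to at least one gesture template.
    gesture_labels = [label for label in templates if label != "walking"]
    best_gesture = min(nearest(window, templates[label]) for label in gesture_labels)
    if best_gesture > detect_threshold:
        return None

    # Stage 2 (classification, CD-like): walking competes with the gestures,
    # so a window that looks most like walking is rejected.
    best_label = min(templates, key=lambda label: nearest(window, templates[label]))
    return None if best_label == "walking" else best_label
```

In this sketch, a window is reported as a gesture only if it passes the similarity threshold and is also closer to a gesture template than to any walking template, so the threshold can be set leniently for stationary use while the walking candidate absorbs step-like motion.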