Gesture formation, a pre-processing step, becomes important when gestures vary in pattern, scale, and speed. Self co-articulations are intentional movements that an individual performs to complete a gesture, and their presence in the trajectory alters the gesture's original meaning. For recognition, most researchers have used the trajectory directly, including these self co-articulated strokes; a few have removed them using visible traits such as velocity. Relying on velocity has shortcomings because gesturing in air differs from gesturing over a solid surface; hence, we propose a gesture formation model that incorporates global and local measures to remove these self co-articulations. The global measure uses Euclidean distance, instantaneous velocity, and polarity calculated from the complete gesture, while the local measure divides the gesture into stroke-level segments using the minimum-maximum-polarity algorithm and applies the selective bypass rules. When evaluated on gesture patterns with premeditated speed variation, the proposed model has a mean error rate of 0.0069 and 7.40% self co-articulations; on individuals' natural gesticulation, it has a mean error rate of 0.0371 and 12.07% self co-articulations. Experimentation on each gesture of the NITS hand gesture databases showed a relative improvement of 40% (accuracy 97%) over the existing baseline models.
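
As a rough illustration of the quantities named in the global measure, the following minimal sketch computes the Euclidean distance and instantaneous velocity between consecutive trajectory samples. The function name `global_measures`, the 2-D point format, and the timestamps are assumptions made for illustration. The abstract does not define polarity, so the sketch takes it to be the sign of the change in velocity between consecutive steps; this is a hypothetical stand-in, not the model's actual definition.

```python
import math


def global_measures(points, timestamps):
    """Return per-step distances, velocities, and an assumed polarity.

    points     -- list of (x, y) trajectory samples (assumed 2-D)
    timestamps -- list of sample times in seconds, strictly increasing
    """
    if len(points) != len(timestamps):
        raise ValueError("points and timestamps must have the same length")

    distances = []
    velocities = []
    steps = zip(points, points[1:], timestamps, timestamps[1:])
    for (x0, y0), (x1, y1), t0, t1 in steps:
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Euclidean distance between consecutive samples.
        d = math.hypot(x1 - x0, y1 - y0)
        distances.append(d)
        # Instantaneous velocity approximated as distance over elapsed time.
        velocities.append(d / dt)

    # ASSUMPTION: polarity is the sign of the velocity change between steps
    # (+1 speeding up, -1 slowing down, 0 unchanged). The source does not
    # state its definition.
    polarity = [(v1 > v0) - (v1 < v0)
                for v0, v1 in zip(velocities, velocities[1:])]
    return distances, velocities, polarity


if __name__ == "__main__":
    pts = [(0, 0), (3, 4), (3, 4), (6, 8)]
    ts = [0.0, 0.5, 1.0, 1.5]
    print(global_measures(pts, ts))
```

Under these assumptions, a pause in the trajectory shows up as a zero-velocity step flanked by a negative and then a positive polarity value. In air gesturing, though, such a dip alone may not reliably mark a self co-articulated stroke, which is the shortcoming of velocity-only removal noted above.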