Curvature-inducing peripheral proteins have been observed to spontaneously remodel bilayer membranes, resulting in membrane invaginations and formation of membrane tubules. In case of proteins such as cholera and Shiga toxins that bend the membrane with locally isotropic curvatures, the resulting membrane-mediated interactions are rather small. Thus, the process in which these proteins form dense clusters on the membrane and collectively induce invaginations is extremely slow, progressing over several minutes. This makes it virtually impossible to directly simulate the pathway leading to membrane tubulation even with highly coarse-grained models. Here, we present a steered molecular dynamics protocol through which the peripheral proteins are forced to gather on a membrane patch and form a tubular invagination. Using thermodynamic integration, we obtain the free energy profile of this process and discuss its different stages. We show how protein stiffness, which also determines local membrane curvatures, affects the free energy landscape and the organization of proteins in the invaginated region. Furthermore, we estimate the kinetics of the described pathway modeled as a Markovian stochastic process, and compare the implied timescales with their experimental counterparts.