This paper considers the Vehicle Routing Problem with Soft Time Windows, a challenging routing problem in which customers' time windows may be violated at a certain cost. The problem has a lexicographic objective function that minimizes first the number of routes, then the number of violated time windows, and finally the total routing distance. We present a multi-stage Very Large-Scale Neighborhood search for this problem. Each stage corresponds to a Variable Neighborhood Descent over a parametrizable Very Large-Scale Neighborhood. Unlike classical local search neighborhoods, these neighborhoods contain an exponential number of neighbors. Searching Very Large-Scale Neighborhoods can often produce local optima of higher quality than searching polynomial-sized neighborhoods. Furthermore, we use a sophisticated heuristic to determine service start times, which allows us to minimize the number of violated time windows. We test our approach on a number of different problem types and compare the results to the relevant state of the art. The experimental results show that our algorithm improves the best-known solutions on 53% of the most studied instances. Many of these improvements stem from a reduction in the number of vehicles, a critical objective in Vehicle Routing Problems.
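
The lexicographic objective described above can be illustrated with a minimal sketch. It assumes that a solution is summarized by three values compared in priority order (number of routes, number of violated time windows, total distance). The names `Objective` and `is_better` and the sample values are hypothetical and chosen for illustration only; they are not taken from the paper's implementation or results.

```python
from typing import NamedTuple


class Objective(NamedTuple):
    """Hypothetical summary of a solution, with fields in priority order."""
    num_routes: int            # first criterion: fewer routes is better
    num_violated_windows: int  # second criterion: fewer violations is better
    total_distance: float      # third criterion: shorter distance is better


def is_better(candidate: Objective, incumbent: Objective) -> bool:
    # Tuples compare element by element from left to right, which is
    # exactly a lexicographic comparison over the three criteria.
    return candidate < incumbent


# Made-up values for illustration only.
a = Objective(num_routes=10, num_violated_windows=3, total_distance=1500.0)
b = Objective(num_routes=10, num_violated_windows=2, total_distance=1720.5)
c = Objective(num_routes=9, num_violated_windows=7, total_distance=2100.0)

# c uses the fewest routes, so it is preferred despite having more
# violated time windows and a longer distance than a and b.
best = min([a, b, c])
```

Under this ordering, a solution with one route fewer is always preferred, regardless of how many time windows it violates or how long it is, which is why a reduction in the number of vehicles dominates the other two criteria.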