Clean energy sources such as wind have received much attention, and great emphasis has been placed on the design and optimization of horizontal axis wind turbines. Just as important, however, are vertical axis wind turbines, which can generate energy for small businesses, houses, and buildings. This article studies the optimal geometric parameters of an H-Darrieus vertical axis wind turbine using surrogate-based optimization and compares three types of surrogate model. Airfoil chord and thickness were chosen as the design variables, with ranges of 0.32-0.6 m and 0.04-0.16 m, respectively. All evaluations were carried out at a tip-speed ratio of 1.5. The three surrogates were a quadratic polynomial response surface, an Extreme Learning Machine (an artificial neural network based on radial basis functions), and a Kriging interpolator. The surrogates were constructed from an initial sample distributed according to a full factorial design, and a separate test set was designed to evaluate their accuracy. Both the training and the test data sets were generated with 2D CFD modeling to reduce computational cost. On the test set, the Extreme Learning Machine surrogate showed the smallest RMSE, 11.24%, followed by Kriging at 17.64% and the response surface at 22.17%. The same pattern held for the optimal designs: the optimal power coefficient was overestimated by 8.7% by the response surface surrogate, by 3.12% by the Kriging interpolator, and by 2.17% by the Extreme Learning Machine. Power coefficient curves for the optimal geometry from each surrogate were calculated and plotted. The optimal turbine obtained from the Kriging surrogate optimization resulted in a 7.92% increase in the power coefficient (Cp), whilst the Extreme Learning Machine and the response surface resulted in increases of 7.86% and 4.29%, respectively, all relative to the baseline CFD model.
The results suggest that a quadratic polynomial response surface may not be the best alternative for the complex non-linear relationships typically present in vertical axis wind turbine (VAWT) simulations. More flexible techniques such as the Extreme Learning Machine and Kriging could be more suitable for this application.
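To make the workflow above concrete, the following is a minimal sketch of one of the three surrogates, the quadratic polynomial response surface, fitted on a full factorial sample of chord and thickness and scored by RMSE on a separate test set. Only the two variable ranges come from the text. The function `toy_cp` is a hypothetical analytic stand-in for a 2D CFD evaluation, and the number of factorial levels, the test set size, and the grid search used to locate the surrogate optimum are assumptions made purely for illustration. The RMSE here is absolute, whereas the abstract reports it as a percentage.

```python
import itertools
import numpy as np

# Design-variable ranges taken from the text (metres).
CHORD_RANGE = (0.32, 0.60)
THICKNESS_RANGE = (0.04, 0.16)


def full_factorial(levels):
    """Every combination of `levels` evenly spaced values per design variable."""
    chords = np.linspace(*CHORD_RANGE, levels)
    thicknesses = np.linspace(*THICKNESS_RANGE, levels)
    return np.array(list(itertools.product(chords, thicknesses)))


def toy_cp(x):
    """HYPOTHETICAL stand-in for a CFD run; not data or a model from the paper."""
    c, t = x[:, 0], x[:, 1]
    return (0.30 - 2.0 * (c - 0.45) ** 2 - 8.0 * (t - 0.10) ** 2
            + 0.02 * np.sin(40.0 * c * t))


def quadratic_features(x):
    """Full second-order basis in two variables: 1, c, t, c*t, c^2, t^2."""
    c, t = x[:, 0], x[:, 1]
    return np.column_stack([np.ones(len(x)), c, t, c * t, c ** 2, t ** 2])


def fit_response_surface(x, y):
    """Least-squares fit of the six quadratic coefficients."""
    coeffs, *_ = np.linalg.lstsq(quadratic_features(x), y, rcond=None)
    return coeffs


def predict(coeffs, x):
    return quadratic_features(x) @ coeffs


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Training sample: 4 levels per variable (assumed), giving 16 points.
x_train = full_factorial(levels=4)
y_train = toy_cp(x_train)
coeffs = fit_response_surface(x_train, y_train)

# Independent test set: 20 uniformly random points (assumed size and layout).
rng = np.random.default_rng(0)
x_test = np.column_stack([rng.uniform(*CHORD_RANGE, 20),
                          rng.uniform(*THICKNESS_RANGE, 20)])
y_test = toy_cp(x_test)
test_rmse = rmse(y_test, predict(coeffs, x_test))

# Locate the surrogate optimum with a dense grid search (assumed optimizer).
x_grid = full_factorial(levels=101)
x_opt = x_grid[np.argmax(predict(coeffs, x_grid))]

print(f"test RMSE: {test_rmse:.4f}")
print(f"surrogate optimum: chord={x_opt[0]:.3f} m, thickness={x_opt[1]:.3f} m")
```

Comparing the surrogate's predicted optimum against a fresh evaluation of the underlying model at `x_opt` mirrors the overestimation check described above. A Kriging or Extreme Learning Machine surrogate could be swapped in behind the same `fit` and `predict` interface.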