The concentration of volatile fatty acids (VFAs) is one of the most important measurements for evaluating the performance of anaerobic digestion (AD) processes. In real-time applications, VFAs can be measured by dedicated sensors, which are still currently expensive and very sensitive to harsh environmental conditions. Moreover, sensors usually have a delay that is undesirable for real-time monitoring. Due to these problems, data-driven soft sensors are very attractive alternatives. This study proposes different data-driven methods for estimating reliable VFA values. We evaluated random forest (RF), artificial neural network (ANN), extreme learning machine (ELM), support vector machine (SVM) and genetic programming (GP) based on synthetic data obtained from the international water association (IWA) Benchmark Simulation Model No. 2 (BSM2). The organic load to the AD in BSM2 was modified to simulate the behavior of an anaerobic co-digestion process. The prediction and generalization performances of the different models were also compared. This comparison showed that the GP soft sensor is more precise than the other soft sensors. In addition, the model robustness was assessed to determine the performance of each model under different process states. It is also shown that, in addition to their robustness, GP soft sensors are easy to implement and provide useful insights into the process by providing explicit equations.