The authors demonstrate the feasibility of quantifying cell-level performance heterogeneity from module-level I–V curves by determining the conditions under which bypass diodes turn on. Analysis of these curves falls outside typical diode-based models of photovoltaic (PV) performance. The authors show that this approach can leverage statistical and machine learning techniques for broad application to massive datasets, and they combine those insights with simulations and laboratory-based experiments to provide insight into the metastability of the interfaces of a PV cell. The authors find good agreement between the experimentally determined and simulated curves, which guide variable selection in the massive dataset collected from sites in Cleveland, OH, USA; the Negev Desert, Israel; Isla Gran Canaria, Spain; and Mount Zugspitze, Germany.