The integration of machine learning with high-throughput measurements is essential to drive the next generation of the design-build-test-learn (DBTL) cycle in synthetic biology. Here, we report the use of active learning in combination with metabolomics to optimise production of surfactin, a complex lipopeptide produced by a non-ribosomal assembly pathway. We designed a media optimisation algorithm that iteratively learns the yield landscape and steers the media composition toward maximal production. The algorithm led to a 160% yield increase after three DBTL runs compared to an M9 baseline. Metabolomics data helped to elucidate the biochemistry underpinning the yield improvement and revealed Pareto-like trade-offs in the production of other lipopeptides from related pathways. We found positive associations between organic acids and surfactin, suggesting a key role for central carbon metabolism, as well as system-wide anisotropies in how metabolism responds to shifts in carbon and nitrogen levels. Our framework offers a novel data-driven approach to improve the yield of biological products with complex synthesis pathways that are not amenable to traditional yield optimisation strategies.
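
To make the idea of iteratively learning a yield landscape concrete, the following is a minimal sketch of a batch active-learning loop. It is an illustration under stated assumptions and not the algorithm used in this work: the yield function `toy_yield` is a synthetic stand-in for wet-lab measurements, the surrogate is a simple kernel-weighted average rather than any particular machine learning model, and all names (`toy_yield`, `predict`, `novelty`, `run_dbtl`) and parameter values are hypothetical.

```python
import math
import random


def toy_yield(carbon, nitrogen):
    """Synthetic yield landscape with one smooth peak (assumption, NOT measured data)."""
    return math.exp(-((carbon - 0.7) ** 2 + (nitrogen - 0.3) ** 2) / 0.05)


def predict(x, observed, bandwidth=0.15):
    """Surrogate model: kernel-weighted average of the yields observed so far."""
    weights = [
        math.exp(-math.dist(x, xi) ** 2 / (2 * bandwidth ** 2))
        for xi, _ in observed
    ]
    total = sum(weights)
    if total == 0.0:  # guard against numerical underflow far from all data
        return 0.0
    return sum(w * y for w, (_, y) in zip(weights, observed)) / total


def novelty(x, observed):
    """Exploration bonus: distance to the nearest composition already tested."""
    return min(math.dist(x, xi) for xi, _ in observed)


def run_dbtl(n_rounds=3, batch_size=8, explore=0.5, seed=0):
    """Run a toy design-build-test-learn loop over a 2D media grid."""
    rng = random.Random(seed)
    # Candidate media: normalised carbon and nitrogen levels on an 11 x 11 grid.
    grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]

    # Initial design: a random batch, "tested" with the synthetic yield function.
    observed = [(x, toy_yield(*x)) for x in rng.sample(grid, batch_size)]
    best_per_round = [max(y for _, y in observed)]

    for _ in range(n_rounds):
        tested = {x for x, _ in observed}
        candidates = [x for x in grid if x not in tested]
        # Learn + design: rank untested media by predicted yield plus novelty.
        ranked = sorted(
            candidates,
            key=lambda x: predict(x, observed) + explore * novelty(x, observed),
            reverse=True,
        )
        batch = ranked[:batch_size]
        # Build + test: here replaced by evaluating the synthetic landscape.
        observed += [(x, toy_yield(*x)) for x in batch]
        best_per_round.append(max(y for _, y in observed))

    return observed, best_per_round


if __name__ == "__main__":
    _, best = run_dbtl()
    print("best yield after each round:", [round(b, 3) for b in best])
```

The `explore` weight balances exploiting regions predicted to be high-yielding against sampling compositions far from those already tested. In this sketch the batch is chosen greedily from a single ranking, so the selected media may cluster; a batch-aware acquisition rule would be one possible refinement.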