Naturally occurring
and engineered flavin-binding, blue-light-sensing,
light, oxygen, voltage (LOV) photoreceptor domains have been used
widely to design fluorescent reporters, optogenetic tools, and photosensitizers
for the visualization and control of biological processes. In addition,
natural LOV photoreceptors with engineered properties were recently
employed for optimizing plant biomass production in the framework
of a plant-based bioeconomy. For such applications, understanding and
fine-tuning the (kinetic) properties of LOV photoreceptors are instrumental.
In response to blue-light illumination, LOV domains undergo a cascade
of photophysical and photochemical events that yield a transient covalent
FMN-cysteine adduct, allowing for signaling. The rate-limiting step
of the LOV photocycle is the dark-recovery process, which involves
adduct scission and can take between seconds and days. Rational engineering
of LOV domains with fine-tuned dark recovery has been challenging
due to the lack of a mechanistic model, the long time scale of the
process, which hampers atomistic simulations, and the vast protein
sequence space spanned by combinations of known mutations (combinatorial challenge).
To address these issues, we used machine learning (ML) trained on
scarce literature data and on experimental data that were generated and
incorporated iteratively to design LOV variants with faster and slower dark recovery.
Over three prediction–validation cycles, we successfully predicted
LOV domain variants whose adduct-state lifetimes spanned
7 orders of magnitude, yielding optimized tools for synthetic (opto)biology.
In summary, our results demonstrate that ML is a viable method to guide
the design of proteins even with limited experimental data and when
no mechanistic model of the underlying physical principles is available.