Constructing efficient cellular factories often requires integration
of heterologous pathways for synthesis of novel compounds and improved
cellular productivity. Few genomic sites are routinely used, however,
for efficient integration and expression of heterologous genes, especially
in nonmodel hosts. Here, a data-guided framework for nominating suitable
integration sites for heterologous genes based on ATAC-seq was developed
in the nonmodel yeast Komagataella phaffii. Single-copy
GFP constructs were integrated using CRISPR/Cas9 into 38 intergenic
regions (IGRs) to evaluate the effects of IGR size, intensity of ATAC-seq
peaks, and orientation and expression of adjacent genes. Only the
intensity of accessibility peaks was observed to have a significant
effect, with higher expression observed from IGRs with low- to moderate-intensity
peaks than from high-intensity peaks. This effect diminished for tandem,
multicopy integrations, suggesting that the additional copies of exogenous
sequence buffered the transcriptional unit of the transgene against
effects from endogenous sequence context. The approach developed from
these results should provide a basis for nominating suitable IGRs
in other eukaryotic hosts from an annotated genome and ATAC-seq data.
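
As a rough illustration of how such a nomination step might look, the following minimal sketch derives IGRs from gene coordinates and keeps those whose strongest overlapping ATAC-seq peak falls below a cutoff. The names Gene, Peak, intergenic_regions, and nominate_igrs, the high_cutoff parameter, and the choice to skip IGRs without any peak are all assumptions made for this example; they are not the authors' implementation, and no specific intensity threshold is implied by the results above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


@dataclass(frozen=True)
class Peak:
    chrom: str
    start: int
    end: int
    intensity: float  # e.g. a peak-caller signal value (assumed field)


def intergenic_regions(genes):
    """Return (chrom, start, end) gaps between annotated genes."""
    by_chrom = {}
    for g in genes:
        by_chrom.setdefault(g.chrom, []).append(g)
    igrs = []
    for chrom in sorted(by_chrom):
        ordered = sorted(by_chrom[chrom], key=lambda g: g.start)
        covered_end = ordered[0].end
        for g in ordered[1:]:
            # A gap exists only if the next gene starts past everything seen so far.
            if g.start > covered_end:
                igrs.append((chrom, covered_end, g.start))
            covered_end = max(covered_end, g.end)
    return igrs


def max_peak_intensity(igr, peaks):
    """Strongest peak overlapping the IGR, or 0.0 if none overlaps."""
    chrom, start, end = igr
    hits = [p.intensity for p in peaks
            if p.chrom == chrom and p.start < end and p.end > start]
    return max(hits, default=0.0)


def nominate_igrs(genes, peaks, high_cutoff):
    """Keep IGRs with a peak whose intensity is below high_cutoff.

    high_cutoff is a hypothetical, user-chosen threshold separating
    low/moderate from high intensity. IGRs with no peak are skipped
    here, which is an assumption of this sketch.
    """
    nominated = []
    for igr in intergenic_regions(genes):
        score = max_peak_intensity(igr, peaks)
        if 0.0 < score < high_cutoff:
            nominated.append((igr, score))
    return nominated


# Toy data, invented purely for illustration.
genes = [Gene("chr1", 0, 100), Gene("chr1", 300, 400), Gene("chr1", 600, 700)]
peaks = [Peak("chr1", 150, 200, 5.0), Peak("chr1", 450, 500, 50.0)]
candidates = nominate_igrs(genes, peaks, high_cutoff=20.0)
```

With the toy data, only the IGR between the first two genes is retained, because the peak in the second IGR exceeds the cutoff. In practice the coordinates would come from a genome annotation and the peaks from a peak caller, and the cutoff would need to be calibrated against expression data for the host in question.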