High-throughput methodologies have enabled routine generation of RNA target sets and sequence motifs for RNA-binding proteins (RBPs). Nevertheless, quantitative approaches are needed to capture the landscape of RNA/RBP interactions responsible for cellular regulation. We have used the RNA-MaP platform to directly measure equilibrium binding for thousands of designed RNAs and to construct a predictive model for RNA recognition by the human Pumilio proteins PUM1 and PUM2. Despite prior findings of linear sequence motifs, our measurements revealed widespread residue flipping and instances of positional coupling. Application of our thermodynamic model to published in vivo crosslinking data reveals quantitative agreement between predicted affinities and in vivo occupancies. Our analyses suggest a thermodynamically driven, continuous Pumilio binding landscape that is negligibly affected by RNA structure or kinetic factors, such as displacement by ribosomes. This work provides a quantitative foundation for dissecting the cellular behavior of RBPs and cellular features that impact their occupancies.
Results
Library design
Starting with the PUM2 consensus motif, which has been determined by pull-down, crosslinking and in vitro selection experiments (Figures 1A, S1A), we designed an oligonucleotide library to systematically address the factors that determine binding specificity (Figure S1B). To control for structural and context effects, each sequence variant was embedded in two to four scaffolds (Figure 1B). We systematically varied the sequence of the PUM2 binding site and the flanking sequence (Figure S1B). We also included insertions to test the potential for noncontiguous binding sites, and variants of sequence motifs of related PUF proteins to provide additional sequence variation for testing PUM2 binding models (Figure S1B).
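The design logic described above (vary the binding site, then place each variant in several scaffolds) can be illustrated with a minimal Python sketch. It is not the library-generation code used in the study. The site sequence `UGUAUAUA` is assumed here as one instance of the UGUANAUA Pumilio consensus, and the flanking sequences and the function names `single_mutants` and `embed` are hypothetical placeholders.

```python
BASES = "ACGU"

def single_mutants(motif):
    """Return every sequence that differs from motif at exactly one position."""
    variants = []
    for i, ref in enumerate(motif):
        for base in BASES:
            if base != ref:
                variants.append(motif[:i] + base + motif[i + 1:])
    return variants

def embed(site, scaffolds):
    """Place a binding-site variant between each (5' flank, 3' flank) pair."""
    return [five + site + three for five, three in scaffolds]

# Assumption: one instance of the UGUANAUA consensus, used only for illustration.
consensus = "UGUAUAUA"
# Hypothetical flanking contexts; the actual scaffolds are shown in Figure 1B.
scaffolds = [("AAUUCA", "CAUUAA"), ("ACAUCA", "CUAUCA")]

# Map each site variant (consensus plus all single mutants) to its scaffolded RNAs.
library = {}
for site in [consensus] + single_mutants(consensus):
    library[site] = embed(site, scaffolds)
```

Measuring each variant in more than one context makes it possible to separate effects of the binding-site sequence from effects of the surrounding sequence or structure.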
Massively parallel measurements of PUM2 binding affinities
Using RNA-MaP, we determined PUM2 protein binding affinities for >20,000 distinct RNAs, and we report on >5,000 herein; sequences designed to address distinct questions will be reported separately. The DNA library was sequenced on an Illumina MiSeq flow cell, followed by in situ transcription in a custom-built imaging and fluidics setup (Figure 1C). The RNA transcripts were immobilized by stalling the RNA polymerase at the end of the DNA template, and RNA-protein association was measured by equilibrating the RNA with increasing concentrations of fluorescently labeled protein and imaging binding to each cluster (comprising ~1000 copies of an RNA variant) (Figure 1C; Buenrostro et al., 2014). The resulting binding curves were used to obtain the dissociation constant (KD) and the corresponding ∆G value (∆G = RT ln KD) of the protein for each RNA variant.
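The step from a binding curve to KD and ∆G can be sketched as follows. This is a minimal illustration, not the fitting procedure used in the study, which this section does not describe. It assumes a single-site binding isotherm with no background term, a simple grid search over KD, a temperature of 298.15 K, and KD expressed relative to a 1 M standard state. The function names and the synthetic concentrations are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def fraction_bound(conc, kd):
    """Single-site isotherm: fraction of RNA bound at protein concentration conc."""
    return conc / (conc + kd)

def fit_kd(concs, signals, kd_grid):
    """Grid-search KD; for each candidate, solve the amplitude fmax by least squares."""
    best = None
    for kd in kd_grid:
        theta = [fraction_bound(c, kd) for c in concs]
        fmax = sum(t * s for t, s in zip(theta, signals)) / sum(t * t for t in theta)
        sse = sum((s - fmax * t) ** 2 for t, s in zip(theta, signals))
        if best is None or sse < best[0]:
            best = (sse, kd, fmax)
    return best[1], best[2]

def delta_g(kd_molar, temp_kelvin=298.15):
    """Delta G = RT ln KD in kcal/mol, with KD in molar units."""
    return R_KCAL * temp_kelvin * math.log(kd_molar)

# Synthetic, noise-free fluorescence values for one cluster (illustrative only).
concs_nM = [0.1 * 3 ** i for i in range(8)]
true_kd_nM = 5.0
signals = [1000.0 * fraction_bound(c, true_kd_nM) for c in concs_nM]

# Candidate KD values from 0.01 nM to 1000 nM, log-spaced.
kd_grid = [10 ** (e / 100) for e in range(-200, 301)]
kd_fit_nM, fmax_fit = fit_kd(concs_nM, signals, kd_grid)
dg_kcal = delta_g(kd_fit_nM * 1e-9)
```

Because ∆G depends on the logarithm of KD, a tenfold change in affinity corresponds to a shift of about 1.4 kcal/mol at this temperature, which is why affinities across a library are conveniently compared on the ∆G scale.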