Unraveling the sequence determinants that drive protein-RNA interactions is crucial for studying binding mechanisms and the impact of genomic variants. While CLIP-seq allows for transcriptome-wide profiling of in vivo protein-RNA interactions, it is limited to expressed transcripts, so missing binding information must be imputed computationally. Existing classification-based methods predict binding at low resolution and depend on prior labeling of transcriptome regions to obtain high-quality training sets. We present RBPNet, a novel deep learning method that predicts the distribution of CLIP crosslink counts from RNA sequence at single-nucleotide resolution. RBPNet performs bias correction by modeling the raw CLIP-seq signal as a mixture of the (unobserved) protein-specific signal and the background signal obtained from control experiments. By training on up to a million regions with elevated signal, RBPNet generalizes better than state-of-the-art classifiers on a variety of assays, including eCLIP, iCLIP and miCLIP. Through model interrogation via Integrated Gradients, RBPNet identifies highly predictive sub-sequences corresponding to known binding motifs and enables variant-impact scoring via in silico mutagenesis. Taken together, RBPNet improves both the inference of protein-RNA interactions and the mechanistic interpretation of predictions by modeling the raw CLIP-seq data at high resolution.
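
The mixture formulation used for bias correction can be illustrated with a minimal sketch. The abstract does not specify the parametrization, so everything below is an assumption made for illustration: the softmax over per-position logits, the scalar `mix_weight`, the multinomial negative log-likelihood, and all function names are hypothetical and are not taken from the RBPNet implementation.

```python
import numpy as np

def softmax(logits):
    """Turn per-position logits into a probability profile over positions."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def mixture_profile(target_logits, control_logits, mix_weight):
    """Assumed form: total profile = w * target + (1 - w) * control."""
    p_target = softmax(target_logits)    # unobserved protein-specific signal
    p_control = softmax(control_logits)  # background signal (control experiment)
    return mix_weight * p_target + (1.0 - mix_weight) * p_control

def multinomial_nll(counts, probs, eps=1e-12):
    """Negative log-likelihood of counts under probs.

    The multinomial coefficient is omitted because it does not depend
    on the predicted probabilities.
    """
    return -np.sum(counts * np.log(probs + eps))

# Toy example on a 10-nucleotide window with random logits.
rng = np.random.default_rng(0)
window = 10
target_logits = rng.normal(size=window)
control_logits = rng.normal(size=window)

total = mixture_profile(target_logits, control_logits, mix_weight=0.7)
counts = rng.multinomial(50, total)  # simulated crosslink counts per position
loss = multinomial_nll(counts, total)
```

Under these assumptions, fitting the mixture to the observed counts while fitting the control component to the control counts would leave `p_target` as a background-corrected estimate of the protein-specific signal.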