Infection by SARS-CoV-2 involves the attachment of the receptor binding domain (RBD) of its spike proteins to the ACE2 receptors on the peripheral membrane of host cells. Binding is initiated by a down-to-up conformational change in the spike protein, an opening that presents the RBD to the receptor. To date, computational and experimental studies for therapeutics have concentrated, for good reason, on the RBD. However, the RBD region is highly prone to mutations, and drug resistance may therefore arise. In contrast, we focus here on the correlations between the RBD and residues distant from it in the spike protein. We thereby provide a deeper understanding of the role of distant residues in the molecular mechanism of infection. Predictions of key mutations in distant allosteric binding sites are provided, with implications for therapeutics. Identifying these emerging mutants can also go a long way towards pre-designing vaccines for future outbreaks. The model we use, based on time-lagged independent component analysis (tICA) and a protein graph connectivity network, is able to identify multiple residues that exhibit long-distance coupling with the RBD opening. Mutations at these residues can lead to new strains of coronavirus with different degrees of transmissibility and virulence. The most ubiquitous D614G mutation and the A570D mutation of the highly contagious UK SARS-CoV-2 variant are predicted ab initio from our model. Conversely, broad-spectrum therapeutics such as drugs and monoclonal antibodies can be generated targeting these key distant but conserved regions of the spike protein.

Significance statement

The novel coronavirus (SARS-CoV-2) pandemic resulted in the largest economic and public health crises in recent times. Significant drug design effort against SARS-CoV-2 is focused on the receptor binding domain (RBD) of the spike protein, although this region is prone to mutations causing therapeutic resistance.
We applied deep data analysis methods to all-atom molecular dynamics simulations to identify key non-RBD residues that play a crucial role in spike-receptor binding and infection. Because the non-RBD residues are typically conserved across multiple coronaviruses, they can not only be targeted by broad-spectrum antibodies and drugs against existing strains, but can also offer predictive insights into how to design an adaptable antiviral therapeutic framework against the spike of strains that might appear during future epidemics.
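
As a rough illustration of the tICA step mentioned above, the sketch below shows one common formulation: slow collective coordinates are obtained from a generalized eigenproblem between the time-lagged and instantaneous covariance matrices of a set of features. It is a minimal sketch under stated assumptions, not the pipeline used in this work. The toy data, the lag value, and the names `toy_trajectory` and `tica` are hypothetical; in a real application the feature columns would be quantities extracted from the simulation frames (for example backbone dihedral angles, which is an assumption here).

```python
import numpy as np


def toy_trajectory(n_frames=20000, seed=0):
    """Hypothetical stand-in for MD features: one slow mode mixed with fast noise."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n_frames)
    slow = np.zeros(n_frames)
    for t in range(1, n_frames):
        # AR(1) process with a long correlation time (the "slow" motion)
        slow[t] = 0.99 * slow[t - 1] + noise[t]
    fast = rng.standard_normal((n_frames, 3))   # uncorrelated fast fluctuations
    sources = np.column_stack([slow, fast])
    mixing = rng.standard_normal((4, 4))        # features are linear mixtures
    return sources @ mixing, slow


def tica(X, lag, n_components=2):
    """Return (eigenvalues, eigenvectors) of the symmetrized tICA problem.

    X has shape (n_frames, n_features); lag is given in frames.
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)
    A, B = X[:-lag], X[lag:]
    n = A.shape[0]
    # Symmetrized instantaneous and time-lagged covariance estimates
    C0 = (A.T @ A + B.T @ B) / (2 * n)
    Ct = (A.T @ B + B.T @ A) / (2 * n)
    # Whiten with C0, then solve an ordinary symmetric eigenproblem
    s, U = np.linalg.eigh(C0)
    keep = s > 1e-10 * s.max()
    W = U[:, keep] / np.sqrt(s[keep])
    lam, V = np.linalg.eigh(W.T @ Ct @ W)
    order = np.argsort(lam)[::-1][:n_components]
    return lam[order], W @ V[:, order]


X, slow = toy_trajectory()
eigenvalues, components = tica(X, lag=10, n_components=2)
tic1 = (X - X.mean(axis=0)) @ components[:, 0]   # projection on the slowest mode
# Correlation of each input feature with the slowest component
feature_corr = [np.corrcoef(X[:, j], tic1)[0, 1] for j in range(X.shape[1])]
```

In this toy setting the leading component recovers the hidden slow mode, and `feature_corr` ranks the input features by how strongly they track it. Ranking features by their correlation with a slow coordinate is one plausible way to flag candidates for long-distance coupling; the lag time and feature choice would need to be validated for any real system.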