We present an efficient training approach to text retrieval with dense representations that applies knowledge distillation using the ColBERT late-interaction ranking model. Specifically, we propose to transfer knowledge from a bi-encoder teacher to a student by distilling ColBERT's expressive MaxSim operator into a simple dot product. The advantage of the bi-encoder teacher-student setup is that we can efficiently add in-batch negatives during knowledge distillation, enabling richer interactions between the teacher and student models. In addition, using ColBERT as the teacher reduces training cost compared to a full cross-encoder. Experiments on the MS MARCO passage and document ranking tasks and on data from the TREC 2019 Deep Learning Track demonstrate that our approach helps models learn robust representations for dense retrieval effectively and efficiently.

The standard reranker architecture, while effective, exhibits high query latency, on the order of seconds per query (Hofstätter and Hanbury, 2019; Khattab and Zaharia, 2020), because expensive neural inference must be applied at query time to query-passage pairs. This design is known as a cross-encoder (Humeau et al., 2020), which exploits query-passage attention interactions across all transformer layers. As an alternative, a bi-encoder design provides an approach to ranking with dense representations that is far more efficient than cross-encoders (