Crowdsourcing provides a practical way to obtain annotated data for data-hungry deep models. Owing to its simplicity and practicality, jointly learning the annotation-correction mechanism and the target classifier has been widely studied and applied. Existing work improves performance by modeling the annotators and the annotation process; however, the instance representation, which most directly affects model training, has been neglected. In this work, we investigate contrastive representation learning to improve learning from crowds. Specifically, we first sample confident instances and positive pairs using a pre-trained representation together with the human annotations. We then extend the supervised contrastive loss to a noise-tolerant version that supports continuous consistency between labels. Finally, we leverage the learned representations to train the classifier and annotator parameters. The whole procedure is designed as an end-to-end framework, CrowdCons, compatible with existing crowdsourcing models. We evaluate our approach on three real-world crowdsourcing datasets: LabelMe, CIFAR10-H, and Music. The experimental results show that it significantly improves prediction accuracy, and a case study demonstrates the model's robustness to noisy annotations.
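The noise-tolerant supervised contrastive loss mentioned above could be sketched as follows. This is a minimal illustrative implementation, not the paper's exact formulation: the function name `soft_supcon_loss` and the choice of dot-product agreement between soft label vectors as the "continuous consistency" pair weight are assumptions made for the sketch.

```python
import numpy as np

def soft_supcon_loss(features, soft_labels, temperature=0.1):
    """Sketch of a noise-tolerant supervised contrastive loss.

    Instead of a binary same-class mask, each pair (i, j) is weighted by a
    continuous consistency score between the instances' soft label
    distributions (here: their dot product -- an assumed choice).
    """
    # L2-normalize embeddings and compute temperature-scaled similarities
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(z)
    mask = ~np.eye(n, dtype=bool)            # exclude self-pairs

    # Row-wise log-softmax over all non-diagonal pairs
    logits = np.where(mask, sim, -np.inf)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Continuous label consistency in [0, 1] as pair weights
    w = np.where(mask, soft_labels @ soft_labels.T, 0.0)

    # Weighted average of -log_prob per anchor (guard against zero weight)
    per_anchor = -(w * np.where(mask, log_prob, 0.0)).sum(axis=1)
    per_anchor /= np.maximum(w.sum(axis=1), 1e-12)
    return per_anchor.mean()
```

With one-hot `soft_labels` this reduces to the standard supervised contrastive loss; soft, annotation-derived label vectors smoothly down-weight pairs whose annotations disagree.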