Connected and Autonomous Vehicle (CAV) initiatives have been among the fastest growing in recent years and have started to affect people's daily lives. A growing number of companies and research organizations have announced CAV initiatives, and some have begun road trials. Governments around the world have also introduced policies to support and accelerate CAV deployment. Alongside these developments, issues such as CAV cyber security have become prominent and now form an essential part of the challenges of CAV deployment. There is, however, no universally agreed or recognized framework for CAV cyber security. In this paper, following the UK CAV cyber security principles, we propose a UML (Unified Modeling Language)-based CAV cyber security framework, on the basis of which we classify the potential vulnerabilities of CAV systems. Using this framework, we generate a new CAV communication cyber-attack data set, named CAV-KDD, derived from the widely tested benchmark data set KDD99. The data set focuses on communication-based CAV cyber-attacks. Two classification models are developed on the CAV-KDD training data set using two machine learning algorithms, Decision Tree and Naive Bayes. The accuracy, precision and runtime of the two models in identifying each type of communication-based attack are compared and analysed. The Decision Tree model is found to require a shorter runtime and to be more appropriate for CAV communication attack detection.
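As a rough illustration of the comparison described above, the following minimal sketch shows one way accuracy, per-class precision and prediction runtime could be measured for two classifiers on the same labelled samples. It is not the evaluation code used in this work: the function `evaluate`, the two rule-based stand-in models, the feature names (`count`, `ports`) and the class labels are all hypothetical placeholders, and in practice the stand-ins would be replaced by trained Decision Tree and Naive Bayes models applied to CAV-KDD records.

```python
import time
from collections import Counter


def evaluate(predict, samples, labels):
    """Return (accuracy, per-class precision, runtime in seconds) for one model."""
    start = time.perf_counter()
    predicted = [predict(sample) for sample in samples]
    runtime = time.perf_counter() - start

    # Accuracy: fraction of samples whose predicted label matches the true label.
    correct = sum(p == t for p, t in zip(predicted, labels))
    accuracy = correct / len(labels)

    # Precision per class: true positives / all samples predicted as that class.
    # Classes that a model never predicts are omitted from the result.
    predicted_counts = Counter(predicted)
    true_positives = Counter(p for p, t in zip(predicted, labels) if p == t)
    precision = {c: true_positives[c] / n for c, n in predicted_counts.items()}
    return accuracy, precision, runtime


# Hypothetical stand-in models (simple rules, NOT trained classifiers).
def model_a(sample):
    if sample["count"] > 100:
        return "dos"
    if sample["ports"] > 10:
        return "probe"
    return "normal"


def model_b(sample):
    return "dos" if sample["count"] > 50 else "normal"


# Toy records with made-up features; not drawn from CAV-KDD or KDD99.
samples = [
    {"count": 200, "ports": 1},
    {"count": 150, "ports": 2},
    {"count": 5, "ports": 30},
    {"count": 60, "ports": 1},
    {"count": 3, "ports": 1},
    {"count": 2, "ports": 20},
]
labels = ["dos", "dos", "probe", "normal", "normal", "probe"]

results = {
    name: evaluate(model, samples, labels)
    for name, model in (("model_a", model_a), ("model_b", model_b))
}

for name, (accuracy, precision, runtime) in results.items():
    print(name, f"accuracy={accuracy:.2f}", precision, f"runtime={runtime:.6f}s")
```

Reporting precision per class rather than as a single average keeps the comparison aligned with the per-attack-type analysis described above, and timing only the prediction step separates detection latency from training cost.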