The controller area network (CAN) protocol, used in many modern vehicles for real-time inter-device communications, is known to have cybersecurity vulnerabilities, putting passengers at risk for data exfiltration and control system sabotage. To address this issue, researchers have proposed to utilize security measures based on cryptography and message authentication; unfortunately, such approaches are often too computationally expensive to be deployed in real-time on CAN devices. Additionally, they have developed machine learning (ML) techniques to detect anomalies in CAN traffic, and thereby prevent attacks. The main disadvantage of existing ML-based techniques is that they either depend on additional computational hardware, or they heuristically assume that all communication anomalies are malicious.
In this paper, we show that tree-based learning ensembles outperform anomaly-based techniques like ARIMA and Z-Score when used to detect attacks that result in increased bus utilization. We evaluated the detection capacity of three tree-based ensembles, Adaboost, gradient boosting, and random forests, and collectively refer to these as DT-DS. We conclude that the decision tree ensemble with Adaboost performs best with an area under curve (AUC) score of 0.999, closely followed by gradient boosting and random forests with 0.997 and 0.991 AUC scores, respectively, when trained using message profiles. We observe that with an increase in the observation window, the DT-DS models present an average AUC score of 0.999, and offer a nearly perfect detection of attacks, at the cost of increased latency in detection of attacked messages. We evaluate the performance of the IDS for ARINC-encoded CAN communication traffic in avionic systems, generated using an aerospace testbench, ARINC-825TBv2. The IDS has been evaluated against the active attacks of a state-of-the-art predictive attacker model. Additionally, we observed that the performance of IDS approaches such as ARIMA and Z-Score degrade considerably with a decrease in the size of the observation time window. In contrast, the performance of DT-DS models is consistent, with only an average drop of 0.005 in the AUC score.