“…The most commonly used non-invasive attentional cues in currently available vision-based systems are eyelid movements (e.g., eye blink frequency, closure duration) [6]–[11], eye gaze [6]–[8], [12], head movement [7]–[9], [13], [14], facial expressions (e.g., yawning, lip movements) [7], [11], [13], [15], and body movements (such as hand movement) [15]–[17]. However, these existing vision-based systems have several limitations: (i) the capture sensors they use are either expensive cameras [7]–[10], [14], [15], [17], sometimes with additional sensors or hardware [11], [12], [16], [17], or specialized imaging sensors (e.g., an eye tracker [6] or Kinect [13]); (ii) some systems rely on a single parameter, such as the pupil [12], PERCLOS [10], or head pose [14], to estimate the driver's attentional state, leaving them unable to adapt to situations common in real driving scenarios (for example, a turned head or sunglasses can occlude the eyes) and leading to incorrect attentional-state detection; (iii) some existing systems detect only a single inattentional state [8]–[14], [17] or are limited to estimating levels of that one state [6], whereas others focus on detecting the driver's activity [15], [16]; (iv) some systems provide no alert mechanism to warn the driver when an inattentional state is detected during driving [6]–[8], …”