Existing methods for scoring student presentations predominantly rely on computer-based implementations and do not incorporate a robotic multi-classification model. Because these approaches depend on fixed camera positions, they lack active feature-learning capabilities, which can lead to misclassification. Moreover, these scoring methods often focus solely on facial expressions and neglect other crucial factors, such as eye contact, hand gestures and body movements, which can bias the resulting scores or reduce their accuracy. To address these limitations, this study introduces Robotics-based Presentation Skill Scoring (RPSS), which employs a multi-model analysis. RPSS captures and analyses four key presentation parameters in real time, namely facial expressions, eye contact, hand gestures and body movements. It applies the fuzzy Delphi method for criteria selection and the analytic hierarchy process (AHP) for weighting, enabling decision makers or managers to assign each criterion a weight that reflects its relative importance. RPSS identifies five academic facial expressions and evaluates eye contact to achieve a comprehensive assessment and enhance scoring accuracy. Specific sub-models are employed for each presentation parameter: EfficientNet for facial emotions, DeepEC for eye contact and an integrated Kalman and heuristic approach for hand and body movements. The final scores are determined according to predefined rules. RPSS is implemented on a robot, and the results highlight its practical applicability. Each sub-model is rigorously evaluated offline and compared against benchmarks before selection. Real-world evaluations are also conducted, incorporating a novel active learning approach that leverages the robot's mobility to improve performance. In a comparative evaluation with human tutors, RPSS achieves an average agreement of 99%, demonstrating its effectiveness in assessing students' presentation skills.
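To illustrate the weighting and aggregation step described above, the following minimal Python sketch derives AHP criterion weights from a pairwise comparison matrix (via the standard geometric-mean approximation) and combines the four sub-model scores into an overall presentation score. The comparison values, score values, and function names are illustrative assumptions, not the authors' implementation or rule set.

```python
# Hypothetical sketch (not the RPSS source code): AHP weighting of the four
# presentation criteria and weighted aggregation of sub-model scores.
import numpy as np

def ahp_weights(pairwise: np.ndarray) -> np.ndarray:
    """Approximate the AHP priority vector using the geometric mean of each row."""
    geo_means = np.prod(pairwise, axis=1) ** (1.0 / pairwise.shape[0])
    return geo_means / geo_means.sum()

# Assumed pairwise judgments over the criteria
# (facial expression, eye contact, hand gesture, body movement).
P = np.array([
    [1.0, 2.0, 3.0, 3.0],
    [1/2, 1.0, 2.0, 2.0],
    [1/3, 1/2, 1.0, 1.0],
    [1/3, 1/2, 1.0, 1.0],
])
weights = ahp_weights(P)

# Placeholder sub-model scores in [0, 1] (e.g., from EfficientNet, DeepEC,
# and the Kalman/heuristic tracker); real values come from the live pipeline.
scores = np.array([0.85, 0.90, 0.70, 0.75])
overall = float(weights @ scores)
print(f"weights={weights.round(3)}, overall score={overall:.3f}")
```

In practice, the weights would be set by decision makers or managers through their own pairwise judgments (with a consistency check), and the per-criterion scores would be produced by the respective sub-models before aggregation.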