Individual features of learner data often serve as essential indicators of learning and intervention needs. This work uses personas from the design thinking process as the theoretical basis for analyzing learners' behavior patterns and clustering them into groups. To adapt to learning practice, we develop data-driven personas by clustering learners' features with unsupervised learning according to factual learning outcomes (i.e., knowledge gain, perceived learning experience, perceived social presence), which offers a more accessible and objective intervention design strategy for e-reading practices. Using the Chi-square test, we quantitatively evaluate the clusters produced by various unsupervised learning methods on the multimodal SKEP dataset. Furthermore, for more practical real-life application, we achieve automatic persona prediction based on learners' attention regulation behaviors. The subject-independent evaluation results indicate a best classification accuracy of 70% on the four-class classification task, which distinguishes three personas of learners with feedback needs from one persona without feedback needs. The results also show that time-based sampling of both independent and cumulative learner behaviors yields robust predictors of learner personas, with the SVM classifier achieving a stable accuracy of 65%-70% throughout e-reading. Our work informs the design of a real-time feedback loop for e-learning based on conversational agents.

INDEX TERMS Data-driven persona development, human-robot interaction, instructional design, learning analytics, unsupervised learning.