This study focuses on second language (L2) learners' self-confidence in speaking. L2 self-confidence is important because it affects both L2 competence and L2 willingness to communicate. Confidence has typically been measured with questionnaires or interviews, but these instruments are not suited to measuring dynamically changing confidence at frequent intervals. Our approach is to develop a learning system that incorporates a machine learning model to predict learners' confidence. To our knowledge, no dataset pairing L2 utterances with self-confidence labels exists for this purpose. We therefore conducted an experiment to collect such data from 14 international students in an online Japanese course at Kyushu University. This paper reports findings from a preliminary analysis of the collected data. The prototype system we developed for L2 speaking collected approximately 4,500 samples, each consisting of utterance audio and a 4-point confidence label. The variety of labeling patterns observed suggests that a prediction model should be flexible enough to accommodate each learner's confidence. These results give us a direction for building the model, such as investigating variables that could affect the confidence labels.