(ABSTRACT) With the rise of big data, it is becoming increasingly important to educate students about data analytics. In particular, students without a strong mathematical background are often unenthusiastic about high-dimensional data and find it challenging to understand the relevant complex analytical methods, such as dimension reduction. In this thesis, we present an embodied approach to visual analytics designed to teach students to explore alternative 2D projections of high-dimensional data points using weighted multidimensional scaling. We propose a novel concept, Be the Data, and its application, to explore the possibilities of using humans' embodied resources to learn from high-dimensional data. In our system, each student embodies a data point, and the positions of the students in a physical space represent a 2D projection of the high-dimensional data. Students physically move around a room with respect to one another to interact with alternative projections and receive visual feedback. We conducted educational workshops with students inexperienced in the relevant data analytical methods. Our findings indicate that the students were able to learn about high-dimensional data and the data analysis process despite their low level of knowledge about the complex analytical methods. We also applied Be the Data to social meetings, using the same techniques to analyze and display social-cluster information in order to facilitate social interactions in real time.
Participants information
3.2 A quantitative summary of students' understanding of the key concepts (i.e., variable, relative distance, dimension reduction, data exploration), and of their interest and confidence in learning about high-dimensional data, before and after the workshop. DR stands for dimension reduction. Columns 3 and 4 are the observed proportions of correct answers for the pre and post surveys. Columns 5 and 6 are the expected difference and the credible interval for the difference in proportions. Column 7 gives the p-values from a two-tailed two-sample t-test. The * in columns 6 and 7 flags questions for which there is an important difference between the pre and post surveys.
3.3 A summary of students' embodied interaction during the activity
4.1 A portion of the high-dimensional dataset that describes participants in one social meeting.