Usage of the Java Language by Novices over Time: Implications for Tool and Language Design

Objectives: Java is a popular programming language for use in computing education, but it is difficult to get a wide picture of the issues that it presents for novices, and most studies look only at the types or frequency of errors. In this observational study we aim to learn how novices use different features of the Java language.

Participants: Users of the BlueJ development environment have been invited to opt in to anonymously record their activity data for the past eight years. The resulting dataset, Blackbox, was the basis for this study. BlueJ users are mostly novice programmers, predominantly male, with a median age of 16. Our data subset featured approximately 225,000 participants from around the world.

Study Methods: We performed a secondary data analysis of the Blackbox dataset. We examined over 320,000 Java projects collected over the course of eight years, and used source code analysis to investigate the prevalence of specifically selected Java usage patterns. As this was an observational study without specific hypotheses, we did not use significance tests; instead we present the results themselves with commentary, having applied seasonal trend decomposition to the data.

Findings: We found many long-term trends in the data over the course of the eight years, most of which were monotonic. There was a notable reduction in the use of the main method (common in Java but unnecessary in BlueJ), and a general reduction in the complexity of the projects. We find that only a small number of types are used frequently (int, String, double and boolean), alongside a wide range of infrequently used types.

Conclusions: We find that programming usage patterns gradually change over a long period of time (a period where the Java language was not seeing major changes), once seasonal patterns are accounted for. Any changes are likely driven by instructors and the changing demographics of programming novices. The novices use a relatively restricted subset of Java, which implies that designers of languages specifically targeted at novices can satisfy their needs with a smaller set of language constructs and features. We provide detailed recommendations for the designers of educational programming languages and supporting development tools.
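The abstract does not say which seasonal trend decomposition procedure was applied. As a rough illustration of the idea only, the sketch below uses a classical additive moving-average decomposition, which is a simpler stand-in and not necessarily the authors' method. The function name `decompose_additive` and the synthetic series are assumptions made for this example; none of the numbers come from Blackbox.

```python
from statistics import mean


def decompose_additive(series, period):
    """Split a series into trend + seasonal + residual (classical additive form).

    Illustrative stand-in only: the study's actual procedure is not specified.
    Needs at least two full periods of data so every cycle position is observed.
    """
    n = len(series)
    half = period // 2
    trend = [None] * n  # undefined near both ends, where the window does not fit
    for i in range(half, n - half):
        window = series[i - half : i + half + 1]
        if period % 2 == 0:
            # Centred 2 x period moving average: the two end points get half weight.
            total = sum(window) - 0.5 * (window[0] + window[-1])
        else:
            total = sum(window)
        trend[i] = total / period

    # Average the detrended values at each position within the cycle.
    buckets = [[] for _ in range(period)]
    for i, t in enumerate(trend):
        if t is not None:
            buckets[i % period].append(series[i] - t)
    raw = [mean(b) for b in buckets]
    offset = mean(raw)  # centre the seasonal component so it sums to zero
    cycle = [r - offset for r in raw]

    seasonal = [cycle[i % period] for i in range(n)]
    residual = [
        None if t is None else series[i] - t - seasonal[i]
        for i, t in enumerate(trend)
    ]
    return trend, seasonal, residual


# Synthetic monthly data (invented): a slow linear decline plus a yearly cycle.
pattern = [3, 1, -2, -4, -1, 0, 2, 5, 1, -3, -1, -1]
series = [100 - 0.5 * i + pattern[i % 12] for i in range(48)]
trend, seasonal, residual = decompose_additive(series, 12)
```

With the seasonal component removed, the remaining trend is what reveals a gradual long-term change such as the ones the abstract describes; the raw series alone would mix that change with term-time and holiday fluctuations.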

Novice Programmers' Unproductive Persistence: Using Learning Analytics to Interrogate Learning Theories

Smith

The purpose of this study is to analyze which behaviors are or are not helpful for debugging when a novice is in a state of unproductive persistence. Further, this project explores a variety of analytical techniques, including association rule mining, process mining, frequent sequence mining, and machine learning, in order to determine which approaches are useful for data analysis. For the study, programming process data from hundreds of novice programmers were analyzed to determine which behaviors were more or less likely to be correlated with escaping a state of unproductive persistence. Of the event types analyzed, only three had a statistically significant difference in their rates of occurrence and large effect sizes: file, edit, and compile events. While the data set cannot reveal a user's motivation for a file event, the most logical explanation of these events is that the user is tracing the code. Thus, a higher rate of file events suggests that code tracing (with the goal of code comprehension) is a key behavior correlated with a student's ability to escape a state of unproductive persistence. On the other hand, editing events are far more common in unproductive states that are not escaped. A content analysis suggests that users in an unescaped state of unproductive persistence make more trivial edits. An important finding of this study is that unproductive persistence is not just a phenomenon of the worst-performing students; rather, a third of users who completed the assignment had at least one unproductive state. This study also lends support to the idea that tinkering combined with code tracing is correlated with positive outcomes, but that less systematic tinkering is not effective behavior. Further, association rule mining and frequent sequence mining were effective tools for data analysis in this study.
The findings from this study have two main practical implications for curriculum designers and instructors: (1) the need to normalize struggle and (2) possibilities for curriculum and tool development. This work is particularly important given that debugging is not normally a process evident to instructors, curriculum designers, tool developers, and computer science education researchers, because it happens outside of class time, because these stakeholders usually see only the end result rather than the process, or both; this project attempts to make the process of debugging more transparent.
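To make the association rule mining mentioned above concrete, here is a minimal sketch of the three standard measures for a single rule: support, confidence, and lift. The function name `rule_metrics`, the event labels, and the six toy sessions are all invented for illustration; they are not the study's data, and the abstract does not describe how its rules were actually mined or scored.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Each transaction is a collection of item labels. Hypothetical helper for
    illustration; not taken from the study.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= set(t))
    n_c = sum(1 for t in transactions if consequent <= set(t))
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))

    support = n_both / n                        # share of sessions containing both sides
    confidence = n_both / n_a if n_a else 0.0   # P(consequent | antecedent)
    lift = confidence / (n_c / n) if n_c else 0.0  # confidence relative to base rate
    return support, confidence, lift


# Toy sessions (invented): which event types occurred, and whether the
# unproductive state was escaped.
sessions = [
    {"file", "edit", "compile", "escaped"},
    {"file", "compile", "escaped"},
    {"edit", "compile"},
    {"edit"},
    {"file", "edit", "escaped"},
    {"edit", "compile"},
]
support, confidence, lift = rule_metrics(sessions, {"file"}, {"escaped"})
```

A lift above 1 means the consequent appears more often alongside the antecedent than its base rate would predict. Like the correlations reported in the abstract, such a rule indicates association, not causation.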

Usage of the Java Language by Novices over Time: Implications for Tool and Language Design

Cited by 3 publications

References 18 publications

Computing Education Research in Schools

Novice Use of the Java Programming Language

Novice Programmers' Unproductive Persistence: Using Learning Analytics to Interrogate Learning Theories
