John Peter Campora scite author profile

Erwig

et al. 2017

Gradual typing allows programs to enjoy the benefits of both static typing and dynamic typing. While it is often desirable to migrate a program from more dynamically-typed to more statically-typed or vice versa, gradual typing itself does not provide a way to facilitate this migration. This places the burden on programmers who have to manually add or remove type annotations. Besides the general challenge of adding type annotations to dynamically typed code, there are subtle interactions between these annotations in gradually typed code that exacerbate the situation. For example, to migrate a program to be as static as possible, in general, all possible combinations of adding or removing type annotations from parameters must be tried out and compared.In this paper, we address this problem by developing migrational typing, which efficiently types all possible ways of adding or removing type annotations from a gradually typed program. The typing result supports automatically migrating a program to be as static as possible, or introducing the least number of dynamic types necessary to remove a type error. The approach can be extended to support user-defined criteria about which annotations to modify. We have implemented migrational typing and evaluated it on large programs. The results show that migrational typing scales linearly with the size of the program and takes only 2ś4 times longer than plain gradual typing.

Casts and costs: harmonizing safety and performance in gradual typing

Walkingshaw

2018

Gradual typing allows programmers to use both static and dynamic typing in a single program. However, a well-known problem with sound gradual typing is that the interactions between static and dynamic code can cause significant performance degradation. These performance pitfalls are hard to predict and resolve, and discourage users from using gradual typing features. For example, when migrating to a more statically typed program, often adding a type annotation will trigger a slowdown that can be resolved by adding more annotations elsewhere, but since it is not clear where the additional annotations must be added, the easier solution is to simply remove the annotation. To address these problems, we develop: (1) a static cost semantics that accurately predicts the overhead of static-dynamic interactions in a gradually typed program, (2) a technique for efficiently inferring such costs for all combinations of inferrable type assignments in a program, and (3) a method for translating the results of this analysis into specific recommendations and explanations that can help programmers understand, debug, and optimize the performance of gradually typed programs. We have implemented our approach in Herder, a tool for statically analyzing the performance of different typing configurations for Reticulated Python programs. An evaluation on 15 Python programs shows that Herder can use this analysis to accurately and efficiently recommend type assignments that optimize the performance of these programs without sacrificing the safety guarantees provided by static typing.

Learning user friendly type-error messages

2017

Type inference is convenient by allowing programmers to elide type annotations, but this comes at the cost of often generating very confusing and opaque type error messages that are of little help to fix type errors. Though there have been many successful attempts at making type error messages better in the past thirty years, many classes of errors are still difficult to fix. In particular, current approaches still generate imprecise and uninformative error messages for type errors arising from errors in grouping constructs like parentheses and brackets. Worse, a recent study shows that these errors usually take more than 10 steps to fix and occur quite frequently (around 45% to 60% of all type errors) in programs written by students learning functional programming. We call this class of errors, nonstructural errors. We solve this problem by developing Learnskell, a type error debugger that uses machine learning to help diagnose and deliver high quality error messages, for programs that contain nonstructural errors. While previous approaches usually report type errors on typing constraints or on the type level, Learnskell generates suggestions on the expression level. We have performed an evaluation on more than 1,500 type errors, and the result shows that Learnskell is quite precise. It can correctly capture 86% of all nonstructural errors and locate the error cause with a precision of 63%/87% with the first 1/3 messages, respectively. This is several times more than the precision of state-of-the-art compilers and debuggers. We have also studied the performance of Learnskell and found out that it scales to large programs. CCS Concepts: • Software and its engineering → Functional languages; Data types and structures;

Taming type annotations in gradual typing

2020

Gradual typing provides a methodology to integrate static and dynamic typing, harmonizing their often conflicting advantages in a single language. When a user wants to enjoy the advantages of static typing, most gradual languages require that they add type annotations. Many nontrivial tasks must be undertaken while adding type annotations, including understanding program behaviors and invariants. Unfortunately, if this is done incorrectly then the added type annotations can be wrong–leading to inconsistencies between the program and the type annotations. Gradual typing implementations detect such inconsistencies at runtime, raise cast errors, and generate messages. However, solely relying on such error messages for understanding and fixing inconsistencies and their resulting cast errors is often insufficient for multiple reasons. One reason is that while such messages cover inconsistencies in one execution path, fixing them often requires reconciling information from multiple paths. Another is that users may add many wrong type annotations that they later find difficult to identify and fix, when considering all added annotations. Recent studies provide evidence that type annotations added during program migration are often wrong and that many programmers prefer compile-time warnings about wrong annotations. Motivated by these results, we develop exploratory typing to help with the static detection, understanding, and fixing of inconsistencies. The key idea of exploratory typing is that it systematically removes dynamic types and explores alternative types for static type annotations that can remedy inconsistencies. To demonstrate the feasibility of exploratory typing, we have implemented it in PyHound, which targets programs written in Reticulated Python, a gradual variant of Python. We have evaluated PyHound on a set of Python programs, and the evaluation results demonstrate that our idea can effectively detect inconsistencies in 98% of the tested programs and fix 93% of inconsistencies, significantly outperforming pytype, a widely used Python tool for enforcing type annotations.

Generating precise error specifications for C: a zero shot learning approach

et al. 2019

In C programs, error specifications, which specify the value range that each function returns to indicate failures, are widely used to check and propagate errors for the sake of reliability and security. Various kinds of C analyzers employ error specifications for different purposes, e.g., to detect error handling bugs, yet a general approach for generating precise specifications is still missing. This limits the applicability of those tools. In this paper, we solve this problem by developing a machine learning-based approach named MLPEx. It generates error specifications by analyzing only the source code, and is thus general. We propose a novel machine learning paradigm based on transfer learning, enabling MLPEx to require only one-time minimal data labeling from us (as the tool developers) and zero manual labeling efforts from users. To improve the accuracy of generated error specifications, MLPEx extracts and exploits project-specific information. We evaluate MLPEx on 10 projects, including 6 libraries and 4 applications. An investigation of 3,443 functions and 17,750 paths reveals that MLPEx generates error specifications with a precision of 91% and a recall of 94%, significantly higher than those of state-of-the-art approaches. To further demonstrate the usefulness of the generated error specifications, we use them to detect 57 bugs in 5 tested projects. CCS Concepts: • Software and its engineering → Error handling and recovery.