Abstract. Our critical computing systems are coded in low-level, type-unsafe languages such as C, and it is unlikely that they will be re-coded in a high-level, type-safe language such as Java. This invited talk discusses some approaches that show promise in achieving type safety for legacy C code.
Motivation

Our society is increasingly dependent upon its computing and communications infrastructure. That infrastructure includes the operating systems, device drivers, libraries, and applications that we use on our desktops, as well as the file servers, databases, web servers, and switches that we use to store and communicate data. Today, that infrastructure is built using unsafe, error-prone languages such as C or C++, where buffer overruns, format string errors, and space leaks are not only possible, but frighteningly common.

In contrast, type-safe languages, such as Java, Scheme, and ML, ensure that such errors either cannot happen (through static type-checking and automatic memory management) or are at least caught at the point of failure (through dynamic type and bound checks). This fail-stop guarantee is not a total solution, but it does isolate the effects of failures, facilitates testing and determination of the true source of failures, and enables tools and methodologies for achieving greater levels of assurance. Therefore, the obvious question is:

Why don't we re-code our infrastructure using type-safe languages?

Though such a technical solution looks good on paper and is ultimately the "right thing", there are a number of economic and practical issues that prevent it from happening. First, our infrastructure is large. Today's operating systems consist of tens of millions of lines of code. Throwing away all of that C code and reimplementing it in, say, Java is simply too expensive, just as throwing out old Cobol code was too difficult for Year 2000 bugs.

Second, though C and C++ have many faults, they also have some virtues, especially when it comes to building the low-level pieces of infrastructure.
In particular, C provides a great deal of transparency and control over data representations, which is precisely what is needed to build services such as memory-mapped device drivers, page-table routines, communication buffer management, real-time schedulers, and garbage collectors. It is difficult if not impossible to realize these services in today's type-safe languages simply because they force one