Various vulnerabilities have been found in message parsers of protocol implementations in the past. Even highly sensitive software components like TLS libraries are affected regularly. Resulting issues range from denial-of-service attacks to the extraction of sensitive information. The complexity of protocols and imprecise specifications in natural language are the core reasons for subtle bugs in implementations, which are hard to find. The lack of precise specifications impedes formal verification. In this paper, we propose a model and a corresponding domain-specific language to formally specify message formats of existing real-world binary protocols. A unique feature of the model is the capability to define invariants, which specify relations and dependencies between message fields. Furthermore, the model allows defining the relation of messages between different protocol layers and thus ensures correct interpretation of payload data. We present a technique to derive verifiable parsers based on the model, generate efficient code for their implementation, and automatically prove the absence of runtime errors. Examples of parser specifications for Ethernet and TLS demonstrate the applicability of our approach.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.