Bitcoin has been a popular research topic recently. Ethereum (ETH), a second generation of cryptocurrency, extends Bitcoin's design by offering a Turing-complete programming language called Solidity to develop smart contracts. Smart contracts allow creditable execution of contracts on EVM (Ethereum Virtual Machine) without third parties. Developing correct and secure smart contracts is challenging due to the decentralized computation nature of the blockchain. Buggy smart contracts may lead to huge financial loss. Furthermore, smart contracts are very hard, if not impossible, to patch once they are deployed. Thus, there is a recent surge of interest in analyzing and verifying smart contracts. While most of the existing works either focus on EVM bytecode or translate Solidity smart contracts into programs in intermediate languages, we argue that it is important and necessary to understand and formally define the semantics of Solidity since programmers write and reason about smart contracts at the level of source code. In this work, we develop a formal semantics for Solidity which provides a formal specification of smart contracts to define semantic-level security properties for the high-level verification. Furthermore, the proposed semantics defines correct and secure high-level execution behaviours of smart contracts to reason about compiler bugs and assist developers in writing secure smart contracts.
Regular expressions are a classical concept in formal language theory. Regular expressions in programming languages (RegEx) such as JavaScript, feature non-standard semantics of operators (e.g. greedy/lazy Kleene star), as well as additional features such as capturing groups and references. While symbolic execution of programs containing RegExes appeals to string solvers natively supporting important features of RegEx, such a string solver is hitherto missing. In this paper, we propose the first string theory and string solver that natively provides such support. The key idea of our string solver is to introduce a new automata model, called prioritized streaming string transducers (PSST), to formalize the semantics of RegEx-dependent string functions. PSSTs combine priorities, which have previously been introduced in prioritized finite-state automata to capture greedy/lazy semantics, with string variables as in streaming string transducers to model capturing groups. We validate the consistency of the formal semantics with the actual JavaScript semantics by extensive experiments. Furthermore, to solve the string constraints, we show that PSSTs enjoy nice closure and algorithmic properties, in particular, the regularity-preserving property (i.e., pre-images of regular constraints under PSSTs are regular), and introduce a sound sequent calculus that exploits these properties and performs propagation of regular constraints by means of taking post-images or pre-images. Although the satisfiability of the string constraint language is generally undecidable, we show that our approach is complete for the so-called straight-line fragment. We evaluate the performance of our string solver on over 195000 string constraints generated from an open-source RegEx library. The experimental results show the efficacy of our approach, drastically improving the existing methods (via symbolic execution) in both precision and efficiency.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.