Bor-Yeh Shen scite author profile

You

Yang

et al. 2012

Static binary translation (SBT) systems and dynamic binary translation (DBT) systems have their own merits and disadvantages. SBT can perform whole-program optimizations and do not incur run-time overheads. However, the code discovery and the code location problems caused by indirect branches make SBT systems hard to develop. On the other hand, DBT can perform optimizations based on program's runtime behaviors and can handle indirect branches easily. However, because the translation time accounts for a part of the execution time, DBT systems cannot perform aggressive optimizations. Therefore, quality of the code generated by DBT is not as good as that by SBT. In this paper, we present a hybrid binary translation (HBT) system which combines the merits of both SBT and DBT. It leverages the LLVM infrastructure to translate source binary code, optimize, and generate target binary code. It first translates binary statically. If a run-time exception happens, the HBT system switches to dynamic translation. On the EEMBC benchmark suite, our experimental result shows that the HBT system can run about 4 to 20 times faster than a LLVM-based DBT system.

A Retargetable Static Binary Translator for the ARM Architecture

ACM Trans. Archit. Code Optim.

Hsu

Yang

2014

Machines designed with new but incompatible Instruction Set Architecture (ISA) may lack proper applications. Binary translation can address this incompatibility by migrating applications from one legacy ISA to a new one, although binary translation has problems such as code discovery for variable-length ISA and code location issues for handling indirect branches. Dynamic Binary Translation (DBT) has been widely adopted for migrating applications since it avoids those problems. Static Binary Translation (SBT) is a less general solution and has not been actively researched. However, SBT performs more aggressive optimizations, which could yield more compact code and better code quality. Applications translated by SBT can consume less memory, processor cycles, and power than DBT and can be started more quickly. These advantages are even more critical for embedded systems than for general systems. In this article, we designed and implemented a new SBT tool, called LLBT, which translates ARM instructions into LLVM IRs and then retargets the LLVM IRs to various ISAs, including ×86, ×86-64, ARM, and MIPS. LLBT leverages two important functionalities from LLVM: comprehensive optimizations and retargetability. More importantly, LLBT solves the code discovery problem for ARM/Thumb binaries without resorting to interpretation. LLBT also effectively reduced the size of the address mapping table, making SBT a viable solution for embedded systems. Our experiments based on the EEMBC benchmark suite show that the LLBT-generated code can run more than 6× and 2.3× faster on average than emulation with QEMU and HQEMU, respectively.

Effective code discovery for ARM/Thumb mixed ISA binaries in a static binary translator

Chen

et al. 2013

Code discovery has been a main challenge for static binary translation, especially when the source ISA (Instruction Set Architecture) has variable-length instructions, such as the X86 architectures. Due to embedded data such as PC-relative data, jump tables, or paddings in the code section, a binary translator may be misled to translate data as instructions. With variable length instructions, once data is mis-translated as instructions, subsequent decoding of instructions could be wrong. This paper concerns static binary translation for the ARM architectures, which dominate the embedded-system market. Although ARM is considered RISC (Reduced Instruction Set Computing) in many aspects of processors, it does allow the mix of 32-bit instructions (ARM) with 16-bit instructions (Thumb) in the ARM/Thumb mixed executables. Since the instruction lengths of ARM and Thumb are not equal, the locations of the instructions could be 4-byte or 2-byte aligned addresses, respectively. Furthermore, because ARM and Thumb instructions share encoding space, a 4-byte word could be decoded as one ARM instruction or two Thumb instructions. The correct decoding of this 4-byte word is actually determined at run time by the least significant bit of the program counter.For unstripped binaries, mapping symbols can be used to identify ARM code regions and Thumb code regions. However, for stripped binaries, such mapping symbols are not available to assist translation. We have proposed a novel solution to statically translate the stripped executables for the ARM/Thumb mixed ISA. Our static binary translator includes a translation pass which guarantees the correctness of the translated executable by generating multiple versions of translated code for runtime selection. The binary translator also includes a series of optimization analyses which discover and remove most of the code generated in the baseline translation. Based on the SPEC2006 benchmark suite, stripped ARM/Thumb mixed binaries translated by our static binary translator achieve good performance with only 25% of code size increase.

Automatic validation for binary translation

Chen

Yang

Computer Languages, Systems & Structures

et al. 2015

LLBT

Chen

Hsu

et al. 2012

Lack of applications has always been a serious concern for designing machines with a new but incompatible ISA. To address this concern, binary translation is one common technique to migrate applications from one legacy ISA to new ones. In the past, dynamic binary translation (DBT) has been more widely adopted for migrating applications since it avoids some challenging problems for binary translation such as code discovery for variable length ISA and code location issues for handling indirect branches. Static binary translation (SBT) is usually regarded as a less general solution and has not been actively researched on. However, SBT has advantages of performing more aggressive optimizations, which could yield more compact code and greater code quality. In general, SBT translated applications are likely to consume less memory, processor cycles and power, and can be started more quickly. All the above advantages are more critical for embedded systems than for general systems. Therefore, we believe that even though SBT is not as general as DBT, it has a unique role to play for migrating applications in embedded systems.In this paper, we designed and implemented a new portable SBT tool, called LLBT, which translates source binary into LLVM IR and then retargets the LLVM IR to various ISAs by using the LLVM compiler infrastructure. Using the LLVM compiler infrastructure, LLBT successfully leverages two important functionalities from LLVM: the comprehensive optimizations and the retargetability. For example, most DBTs map guest architecture states into the host registers to minimize accessing guest architecture states with memory operations, but must deal with guest architecture state saving/reloading at trace/block entry/exit points. LLBT can treat the complete application binary as a single function and uses the global register allocation optimization in LLVM to consistently map guest architecture states in host registers so as to avoid the costly state saving and reloading at trace/block exits.In this paper, we have shown our ARM-based LLBT can effectively migrate EEMBC benchmark Suite from ARMv5 to Intel IA32, Intel x64, MIPS, and other ARMs such as ARMv7. On the Intel i7 based host systems, the LLBT generated code can run 3 to 64 times faster than emulating with QEMU, which uses the DBT technique.