Static binary translation (SBT) systems and dynamic binary translation (DBT) systems have their own merits and disadvantages. SBT can perform whole-program optimizations and do not incur run-time overheads. However, the code discovery and the code location problems caused by indirect branches make SBT systems hard to develop. On the other hand, DBT can perform optimizations based on program's runtime behaviors and can handle indirect branches easily. However, because the translation time accounts for a part of the execution time, DBT systems cannot perform aggressive optimizations. Therefore, quality of the code generated by DBT is not as good as that by SBT. In this paper, we present a hybrid binary translation (HBT) system which combines the merits of both SBT and DBT. It leverages the LLVM infrastructure to translate source binary code, optimize, and generate target binary code. It first translates binary statically. If a run-time exception happens, the HBT system switches to dynamic translation. On the EEMBC benchmark suite, our experimental result shows that the HBT system can run about 4 to 20 times faster than a LLVM-based DBT system.
Machines designed with new but incompatible Instruction Set Architecture (ISA) may lack proper applications. Binary translation can address this incompatibility by migrating applications from one legacy ISA to a new one, although binary translation has problems such as code discovery for variable-length ISA and code location issues for handling indirect branches. Dynamic Binary Translation (DBT) has been widely adopted for migrating applications since it avoids those problems. Static Binary Translation (SBT) is a less general solution and has not been actively researched. However, SBT performs more aggressive optimizations, which could yield more compact code and better code quality. Applications translated by SBT can consume less memory, processor cycles, and power than DBT and can be started more quickly. These advantages are even more critical for embedded systems than for general systems. In this article, we designed and implemented a new SBT tool, called LLBT, which translates ARM instructions into LLVM IRs and then retargets the LLVM IRs to various ISAs, including ×86, ×86-64, ARM, and MIPS. LLBT leverages two important functionalities from LLVM: comprehensive optimizations and retargetability. More importantly, LLBT solves the code discovery problem for ARM/Thumb binaries without resorting to interpretation. LLBT also effectively reduced the size of the address mapping table, making SBT a viable solution for embedded systems. Our experiments based on the EEMBC benchmark suite show that the LLBT-generated code can run more than 6× and 2.3× faster on average than emulation with QEMU and HQEMU, respectively.
Code discovery has been a main challenge for static binary translation, especially when the source ISA (Instruction Set Architecture) has variable-length instructions, such as the X86 architectures. Due to embedded data such as PC-relative data, jump tables, or paddings in the code section, a binary translator may be misled to translate data as instructions. With variable length instructions, once data is mis-translated as instructions, subsequent decoding of instructions could be wrong. This paper concerns static binary translation for the ARM architectures, which dominate the embedded-system market. Although ARM is considered RISC (Reduced Instruction Set Computing) in many aspects of processors, it does allow the mix of 32-bit instructions (ARM) with 16-bit instructions (Thumb) in the ARM/Thumb mixed executables. Since the instruction lengths of ARM and Thumb are not equal, the locations of the instructions could be 4-byte or 2-byte aligned addresses, respectively. Furthermore, because ARM and Thumb instructions share encoding space, a 4-byte word could be decoded as one ARM instruction or two Thumb instructions. The correct decoding of this 4-byte word is actually determined at run time by the least significant bit of the program counter.For unstripped binaries, mapping symbols can be used to identify ARM code regions and Thumb code regions. However, for stripped binaries, such mapping symbols are not available to assist translation. We have proposed a novel solution to statically translate the stripped executables for the ARM/Thumb mixed ISA. Our static binary translator includes a translation pass which guarantees the correctness of the translated executable by generating multiple versions of translated code for runtime selection. The binary translator also includes a series of optimization analyses which discover and remove most of the code generated in the baseline translation. Based on the SPEC2006 benchmark suite, stripped ARM/Thumb mixed binaries translated by our static binary translator achieve good performance with only 25% of code size increase.
Lack of applications has always been a serious concern for designing machines with a new but incompatible ISA. To address this concern, binary translation is one common technique to migrate applications from one legacy ISA to new ones. In the past, dynamic binary translation (DBT) has been more widely adopted for migrating applications since it avoids some challenging problems for binary translation such as code discovery for variable length ISA and code location issues for handling indirect branches. Static binary translation (SBT) is usually regarded as a less general solution and has not been actively researched on. However, SBT has advantages of performing more aggressive optimizations, which could yield more compact code and greater code quality. In general, SBT translated applications are likely to consume less memory, processor cycles and power, and can be started more quickly. All the above advantages are more critical for embedded systems than for general systems. Therefore, we believe that even though SBT is not as general as DBT, it has a unique role to play for migrating applications in embedded systems.In this paper, we designed and implemented a new portable SBT tool, called LLBT, which translates source binary into LLVM IR and then retargets the LLVM IR to various ISAs by using the LLVM compiler infrastructure. Using the LLVM compiler infrastructure, LLBT successfully leverages two important functionalities from LLVM: the comprehensive optimizations and the retargetability. For example, most DBTs map guest architecture states into the host registers to minimize accessing guest architecture states with memory operations, but must deal with guest architecture state saving/reloading at trace/block entry/exit points. LLBT can treat the complete application binary as a single function and uses the global register allocation optimization in LLVM to consistently map guest architecture states in host registers so as to avoid the costly state saving and reloading at trace/block exits.In this paper, we have shown our ARM-based LLBT can effectively migrate EEMBC benchmark Suite from ARMv5 to Intel IA32, Intel x64, MIPS, and other ARMs such as ARMv7. On the Intel i7 based host systems, the LLBT generated code can run 3 to 64 times faster than emulating with QEMU, which uses the DBT technique.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.