Today, embedded processors are expected to run complex, algorithm-heavy applications that were originally designed and coded for general-purpose processors. As a result, traditional methods of addressing performance and determinism become inadequate. This paper explores a new data cache design for modern high-performance embedded processors that dynamically improves execution time, power efficiency, and determinism within the system. The simulation results show improvements in cache miss ratio and power consumption of approximately 30% and 15%, respectively.
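The miss-ratio metric reported above can be made concrete with a toy direct-mapped cache model; the parameters here (64 lines of 32 bytes, byte-granularity accesses) are hypothetical and are not the paper's actual design:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy direct-mapped cache: illustrative parameters only.
struct ToyCache {
    static constexpr int kLines = 64;      // hypothetical: 64 cache lines
    static constexpr int kLineBytes = 32;  // hypothetical: 32-byte lines
    std::vector<uint64_t> tags = std::vector<uint64_t>(kLines, UINT64_MAX);
    uint64_t hits = 0, misses = 0;

    void access(uint64_t addr) {
        uint64_t line = addr / kLineBytes;            // which memory line
        int index = static_cast<int>(line % kLines);  // cache set index
        uint64_t tag = line / kLines;                 // remaining tag bits
        if (tags[index] == tag) {
            ++hits;
        } else {
            tags[index] = tag;  // fill the line on a miss
            ++misses;
        }
    }
    double missRatio() const { return double(misses) / double(hits + misses); }
};
```

A sequential byte stream over 8 KiB misses once per 32-byte line, so the miss ratio is 256/8192, or about 3.1%; a design that improves this ratio directly cuts both stall time and off-chip access energy.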
Abstract—Smartphones provide applications that are increasingly similar to interactive desktop programs, offering rich graphics and animations. To simplify the creation of these interactive applications, mobile operating systems employ high-level object-oriented programming languages and shared libraries to manipulate the device's peripherals and provide common user-interface frameworks. The presence of dynamic dispatch and polymorphism allows for robust and extensible application code. Unfortunately, dynamic dispatch also introduces significant overhead during method calls, which directly impacts execution time. Furthermore, since these applications rely heavily on shared libraries and helper routines, the quantity of these method calls is higher than in typical desktop programs. Optimizing these method calls centrally, before consumers download the application onto a given phone, is complicated by the large diversity of hardware and operating system versions that the application could run on. This paper proposes a methodology to tailor a given Objective-C application and its associated device-specific shared library codebase using on-device post-compilation code optimization and transformation. In doing so, many polymorphic sites can be resolved statically, improving overall application performance.

I. INTRODUCTION

The prevalence of mobile processors has grown significantly over the last few years. At the current rate, mobile processors are becoming increasingly ubiquitous throughout our society, resulting in a diverse range of applications that will be expected to run on these devices. State-of-the-art smartphones have evolved to the point of offering feature-rich applications comparable to interactive desktop programs, providing high-quality visual and auditory experiences.
These mobile processors are becoming increasingly complex in order to respond to this more diverse and demanding application base.

From the application perspective, complexity is also growing within the mobile domain. A decade ago, mobile phone applications consisted of a few pre-bundled, hand-optimized programs specifically designed for a given device. Today, there is a vast quantity of applications available within mobile marketplaces; Apple's App Store, for example, had over 775,000 applications available for download at the start of 2013 [1]. Furthermore, mobile applications are now written in high-level, object-oriented programming languages such as Objective-C or Java. These applications tend to be highly interactive, providing visual, auditory, and haptic feedback, as well as leveraging inputs from numerous sensors for touch, location, near-field communication, and even eye tracking. To simplify the creation of these interactive applications, mobile operating systems provide foundation libraries to manipulate the device's peripherals and provide common user-interface frameworks.
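The dispatch overhead and the static resolution described above can be sketched by analogy in C++; the paper operates on Objective-C message sends, so this vtable-based example is only illustrative, not the proposed transformation:

```cpp
#include <cassert>

// A polymorphic call site that an optimizer could devirtualize once the
// receiver's concrete type is known (e.g., after on-device analysis).
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};
struct Triangle : Shape {
    int sides() const override { return 3; }
};

// Dynamic dispatch: the target is looked up through the vtable at every call.
int sidesDynamic(const Shape& s) { return s.sides(); }

// After devirtualization: the target is bound statically via a qualified
// call, so the lookup disappears and the callee can be inlined.
int sidesStatic(const Triangle& t) { return t.Triangle::sides(); }
```

Once the receiver's concrete type is proven, the qualified call bypasses the indirect lookup entirely, which is the payoff of resolving a polymorphic site statically.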
Today, mobile smartphones are expected to run the same complex, memory-intensive applications that were originally designed and coded for general-purpose processors. However, these mobile processors are also expected to be compact and ultra-portable and to provide an always-on, continuous data access paradigm, necessitating a low-power design. As mobile processors increasingly leverage multicore functionality, the power consumed in maintaining coherence between local caches through bus snooping becomes more significant. This paper explores a novel approach to mitigating multi-core processor power consumption in mobile smartphones. By observing dynamic application memory behavior, one can intelligently target adjustments in the cache coherency protocol to reduce the overhead of maintaining consistency when the benefits of multi-core shared cache coherence are muted. At the same time, by utilizing a fine-grained approach, the proposed architecture can still respond to and enable the benefits of hardware cache coherence in situations where the performance improvements greatly outweigh the associated energy costs. The simulation results show appreciable reductions in overall cache power consumption, with negligible impact on overall execution time.
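A first-order sketch of why snoop traffic costs power: in a snooping protocol, every coherent bus transaction triggers a tag lookup in each remote cache. The bookkeeping model below, including the idea of skipping snoops for data known to be private, is a hypothetical illustration, not the paper's actual protocol:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative snoop-energy accounting. Each coherent bus transaction
// costs one tag lookup per remote cache; transactions on data known to
// be private (hypothetical classification) need no coherence traffic.
struct SnoopModel {
    int remoteCaches;            // caches that must snoop the bus
    uint64_t snoopLookups = 0;   // proxy for coherence energy spent

    void busTransaction(bool dataShared) {
        if (dataShared) {
            snoopLookups += remoteCaches;  // every remote cache checks tags
        }
        // private data: no snoop lookups, so no coherence energy
    }
};
```

With three remote caches, 100 shared and 900 private transactions cost 300 lookups instead of the 3000 a snoop-everything policy would spend, which is the intuition behind selectively relaxing coherence when sharing is absent.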
As microprocessors continue to evolve and grow in functionality, ever-smaller process technologies coupled with high clock frequencies and exponentially increasing transistor counts dramatically increase susceptibility to transient faults. However, the correct and reliable operation of these processors is often compulsory, both for the consumer experience and for high-risk embedded domains such as medical and transportation systems. Thus, economical fault detection and recovery become essential to meet market requirements. This paper explores efficiently leveraging superscalar, out-of-order architectures to enable multi-cycle transient fault tolerance throughout the datapath in a novel manner. By using dynamic instruction execution redundancy, soft errors within the datapath are both detected and recovered from. The proposed microarchitecture selectively re-evaluates corrupted instructions, reducing the recovery impact by preserving completed instructions unaffected by the fault. The additional computational workload is dynamically staggered to leverage the out-of-order nature of the architecture and minimize resource conflicts and delays.
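The detect-and-selectively-recover idea can be illustrated with a minimal sketch: evaluate an operation redundantly, compare the results, and re-evaluate only on mismatch. The function below is a hypothetical stand-in for one instruction's datapath operation, not the proposed microarchitecture:

```cpp
#include <cassert>
#include <functional>

// Execution redundancy sketch: run the operation twice and compare.
// On mismatch (a transient fault), re-evaluate only this instruction
// rather than flushing completed, unaffected work.
int executeWithRedundancy(const std::function<int()>& op, int& reexecutions) {
    int first = op();
    int second = op();        // redundant evaluation for fault detection
    if (first == second) {
        return first;         // results agree: no fault observed
    }
    ++reexecutions;           // fault detected: selective re-evaluation
    return op();              // transient faults do not persist on retry
}
```

In hardware the comparison and staggered re-issue would reuse idle out-of-order resources; the sketch only conveys the control flow of detection followed by targeted recovery.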
Today, mobile smartphones are expected to run the same complex, algorithm-heavy, memory-intensive applications that were originally designed and coded for general-purpose processors. All the while, these mobile processors are expected to be power-conscious and to have minimal area impact. These devices pose unique usage demands of ultra-portability while also demanding an always-on, continuous data access paradigm. As a result, this dichotomy of continuous execution versus long battery life poses a difficult challenge. This article explores a novel approach to mitigating mobile processor power consumption while avoiding any significant degradation in execution speed. The approach efficiently leverages both compile-time and runtime application memory behavior to intelligently target adjustments in the cache, significantly reducing overall processor power while accounting for both the dynamic and leakage power footprint of the cache subsystem. The simulation results show a significant reduction in power consumption of approximately 13% to 29%, while incurring only a nominal increase in execution time and area.
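A minimal sketch of the dynamic-versus-leakage trade-off the abstract describes, with hypothetical coefficients rather than simulation data: leakage scales with the portion of the cache kept active, while shrinking the active portion can add misses whose extra dynamic energy eats into the savings.

```cpp
#include <cassert>

// First-order cache power model (illustrative coefficients only).
// Total cost = leakage of the active ways + dynamic energy of extra
// misses caused by shrinking the cache. A tuner would pick the
// configuration minimizing this total for the observed workload.
struct CachePower {
    double leakagePerWay;   // static power per active way (arbitrary units)
    double energyPerMiss;   // dynamic energy per additional miss

    double total(int activeWays, double extraMisses) const {
        return activeWays * leakagePerWay + extraMisses * energyPerMiss;
    }
};
```

For a workload whose footprint fits in two ways, halving a four-way cache trades a small dynamic-energy penalty for a large leakage saving; for a cache-hungry workload the same move backfires, which is why the adjustment must track application memory behavior.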