Javier Alejandro Varela scite author profile

In recent years, optical character recognition (OCR) systems have been used to digitally preserve historical archives. To transcribe historical archives into a machine-readable form, first, the documents are scanned, then an OCR is applied. In order to digitize documents without the need to remove them from where they are archived, it is valuable to have a portable device that combines scanning and OCR capabilities. Nowadays, there exist many commercial and open-source document digitization techniques, which are optimized for contemporary documents. However, they fail to give sufficient text recognition accuracy for transcribing historical documents due to the severe quality degradation of such documents. On the contrary, the anyOCR system, which is designed to mainly digitize historical documents, provides high accuracy. However, this comes at a cost of high computational complexity resulting in long runtime and high power consumption. To tackle these challenges, we propose a low-power, energy-efficient accelerator with real-time capabilities called iDocChip, which is a configurable hybrid hardware-software programmable System-on-Chip (SoC) based on anyOCR for digitizing historical documents. In this paper, we focus on one of the most crucial processing steps in the anyOCR system: Text and Image Segmentation, which makes use of a multi-resolution morphology-based algorithm. Moreover, an optimized FPGA-based hybrid architecture of this anyOCR step along with its optimized software implementations are presented. We demonstrate our results on multiple embedded and general-purpose platforms with respect to runtime and power consumption. The resulting hardware accelerator outperforms the existing anyOCR by 6.2×, while achieving 207× higher energy-efficiency and maintaining its high accuracy.
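
The abstract names a multi-resolution morphology-based algorithm for separating text from images but does not describe it. The following is only a generic sketch of that idea, not the anyOCR or iDocChip implementation: a binary page is reduced to half resolution so that thin text strokes vanish, a morphological opening keeps only large solid regions, and the result is expanded back to full resolution as an image mask. All function names, the 2×2 reduction threshold, and the 3×3 structuring element are assumptions chosen for illustration.

```python
def reduce2x(img, threshold):
    """Halve the resolution: a 2x2 block maps to 1 if it holds >= threshold set pixels."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[int(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
                 + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1] >= threshold)
             for x in range(w)] for y in range(h)]


def _morph(img, radius, combine):
    """Apply a square structuring element; pixels outside the image count as 0."""
    h, w = len(img), len(img[0])
    offsets = range(-radius, radius + 1)
    return [[int(combine(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx] == 1
                for dy in offsets for dx in offsets))
             for x in range(w)] for y in range(h)]


def erode(img, radius=1):
    return _morph(img, radius, all)   # keep a pixel only if its whole neighbourhood is set


def dilate(img, radius=1):
    return _morph(img, radius, any)   # set a pixel if any neighbour is set


def expand2x(img):
    """Replicate every pixel into a 2x2 block to return to the original resolution."""
    return [[img[y // 2][x // 2] for x in range(2 * len(img[0]))]
            for y in range(2 * len(img))]


def image_mask(page):
    """Hypothetical pipeline: reduce, open (erode then dilate), expand."""
    low_res = reduce2x(page, threshold=4)   # thin strokes do not fill a 2x2 block
    opened = dilate(erode(low_res))         # opening removes small leftovers
    return expand2x(opened)


# Toy 16x16 page: one solid 8x8 "image" block and one row of dotted "text".
page = [[0] * 16 for _ in range(16)]
for y in range(2, 10):
    for x in range(2, 10):
        page[y][x] = 1
for x in range(0, 16, 2):
    page[13][x] = 1

mask = image_mask(page)
```

In this toy case the mask covers the solid block and leaves the dotted row out. A real document would need more reduction levels and tuned thresholds.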

Optimization strategies for portable code for Monte Carlo-based value-at-risk systems

Varela

Kestel

Schryver

et al. 2015

Value-at-risk (VaR) computations are one important basic element of risk analysis and management applications. On the one hand, risk management systems need to be flexible and maintainable, but on the other hand they require very high computational power. In general, accelerators provide high speedups, but come with limited flexibility. In this work, we investigate two approaches towards portable and fast code for VaR computations on heterogeneous platforms: operator tuning and the use of OpenCL. We show that operator tuning can save up to one third of run time on CPU-based systems in the calibration step. For OpenCL, we present a detailed analysis of run time on CPU, GPU, and Xeon Phi, and evaluate its portability. We also find that the same code runs up to 12x faster in a VaR setting with an accelerator card being present, without any code changes required.

Reverse Longstaff-Schwartz American Option Pricing on hybrid CPU/FPGA Systems

Brugger

Varela

Wehn

et al. 2015

Nested MC-Based Risk Measurement of Complex Portfolios: Acceleration and Energy Efficiency

Desmettre

Korn

Varela

et al. 2016

Risks

Risk analysis and management currently have a strong presence in financial institutions, where high performance and energy efficiency are key requirements for acceleration systems, especially when it comes to intraday analysis. In this regard, we approach the estimation of the widely-employed portfolio risk metrics value-at-risk (VaR) and conditional value-at-risk (cVaR) by means of nested Monte Carlo (MC) simulations. We do so by combining theory and software/hardware implementation. This allows us for the first time to investigate their performance on heterogeneous compute systems and across different compute platforms, namely central processing unit (CPU), many integrated core (MIC) architecture XeonPhi, graphics processing unit (GPU), and field-programmable gate array (FPGA). To this end, the OpenCL framework is employed to generate portable code, and the size of the simulations is scaled in order to evaluate variations in performance. Furthermore, we assess different parallelization schemes, and the targeted platforms are evaluated and compared in terms of runtime and energy efficiency. Our implementation also allowed us to derive a new algorithmic optimization regarding the generation of the required random number sequences. Moreover, we provide specific guidelines on how to properly handle these sequences in portable code, and on how to efficiently implement nested MC-based VaR and cVaR simulations on heterogeneous compute systems.
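
As a rough illustration of what a nested MC estimate of VaR and cVaR involves, the sketch below uses an outer loop to draw market scenarios at the risk horizon and an inner loop to reprice the position in each scenario. It is not the authors' implementation: the single call option, the geometric Brownian motion model, the market parameters, and the function name `nested_mc_var_cvar` are all assumptions, and for simplicity the outer scenarios use the same drift as the pricing simulation.

```python
import math
import random


def nested_mc_var_cvar(n_outer=500, n_inner=100, alpha=0.99, seed=0):
    """Toy nested Monte Carlo estimate of VaR and cVaR for one European call."""
    rng = random.Random(seed)                        # one seeded stream for reproducibility
    s0, strike, rate, vol = 100.0, 100.0, 0.01, 0.2  # hypothetical market data
    horizon, maturity = 10 / 250, 1.0                # risk horizon and maturity in years

    def gbm_step(s, dt):
        # One exact geometric Brownian motion step of length dt.
        z = rng.gauss(0.0, 1.0)
        return s * math.exp((rate - 0.5 * vol ** 2) * dt + vol * math.sqrt(dt) * z)

    def inner_price(s, tau, n):
        # Inner simulation: discounted mean payoff of the call, starting from spot s.
        total = sum(max(gbm_step(s, tau) - strike, 0.0) for _ in range(n))
        return math.exp(-rate * tau) * total / n

    value_today = inner_price(s0, maturity, 10 * n_inner)

    losses = []
    for _ in range(n_outer):
        s_h = gbm_step(s0, horizon)                  # outer scenario at the risk horizon
        value_h = inner_price(s_h, maturity - horizon, n_inner)
        losses.append(value_today - value_h)         # positive number means a loss

    losses.sort()
    k = min(math.ceil(alpha * n_outer) - 1, n_outer - 1)
    var = losses[k]                                  # empirical alpha-quantile of the loss
    tail = losses[k:]
    cvar = sum(tail) / len(tail)                     # mean loss at or beyond the VaR
    return var, cvar


var_99, cvar_99 = nested_mc_var_cvar()
```

The cost grows with the product of outer and inner path counts. The outer loop iterations are independent of one another, so a parallel version would need to give each of them its own random number stream, which this single-stream sketch does not do.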
