OCRdroid: A Framework to Digitize Text Using Mobile Phones

Zhang, Mi; Joshi, Anant; Kadmawala, Ritesh; Dantu, Karthik; Poduri, Sameera; Sukhatme, Gaurav S.

doi:10.1007/978-3-642-12607-9_18

Cited by 14 publications

(7 citation statements)

References 13 publications


“…Several OCR applications are available for Android and iOS smartphones, e.g. scanner-type applications or narrowly specialized applications, such as those that only allow business card processing [22], receipt management [23], or translation of visual text [24]. A considerable amount of research has focused on mobile OCR for blind users, e.g.…”

Section: B. Mobile OCR (mentioning)

confidence: 99%

CameraKeyboard: A Novel Interaction Technique for Text Entry Through Smartphone Cameras

Bellino

Herskovic

2019

IEEE Access


We present CameraKeyboard, a text entry technique that uses smartphone cameras to extract and digitalise text from physical sources such as business cards and identification documents. After taking a picture of the text of interest, the smartphone recognises the text through OCR technologies (specifically, Google Cloud Vision API) and organises it in the same way in which it is displayed in the original source. Next, users can select the required text, which is split into lines and further split into words, by tapping on it. CameraKeyboard is designed to easily complement the standard keyboard when users need to digitalise text. In fact, the camera is integrated directly in the standard mobile QWERTY keyboard and can be accessed by tapping a button. CameraKeyboard is general-purpose like any other smartphone keyboard: it works with any application (e.g., Gmail, WhatsApp) and is able to extract text from any kind of source. We evaluated CameraKeyboard through a user study with 18 participants who carried out four different text entry tasks, and compared it to a standard smartphone keyboard (Google Keyboard) and a standard (physical) desktop keyboard. Results show that CameraKeyboard is the fastest in most cases.

“…OCRdroid [94] is a framework for developing OCR-based applications on mobile phones, which provides image processing based solutions for OCR problems in mobile phone captured images. Bad orientation, text misalignment, text skew and insufficient lighting are some of these OCR problems.…”

Section: OCR Related Studies (mentioning)

confidence: 99%

MobileCDP: A mobile framework for the consumer decision process

Özarslan

Eren

2015

Inf Syst Front


“…Present-day sensing applications cover a broad range of areas including human interactions [27,59], context sensing [6,34,39,46,54], crowd sensing [55,58], object detection and tracking [7,29,42,53]. The recent commercial interest in IoT technologies promises a proliferation of smart objects in human spaces at a much broader scale.…”

Section: Introduction (mentioning)

confidence: 99%

DeepIoT

Yao

Zhao

Zhang

et al. 2017

Proceedings of the 15th ACM Conference on Embedded Network Sensor Systems


Recent advances in deep learning motivate the use of deep neural networks in sensing applications, but their excessive resource needs on constrained embedded devices remain an important impediment. A recently explored solution space lies in compressing (approximating or simplifying) deep neural networks in some manner before use on the device. We propose a new compression solution, called DeepIoT, that makes two key contributions in that space. First, unlike current solutions geared for compressing specific types of neural networks, DeepIoT presents a unified approach that compresses all commonly used deep learning structures for sensing applications, including fully-connected, convolutional, and recurrent neural networks, as well as their combinations. Second, unlike solutions that either sparsify weight matrices or assume linear structure within weight matrices, DeepIoT compresses neural network structures into smaller dense matrices by finding the minimum number of non-redundant hidden elements, such as filters and dimensions required by each layer, while keeping the performance of sensing applications the same. Importantly, it does so using an approach that obtains a global view of parameter redundancies, which is shown to produce superior compression. The compressed model generated by DeepIoT can directly use existing deep learning libraries that run on embedded and mobile systems without further modifications. We conduct experiments with five different sensing-related tasks on Intel Edison devices. DeepIoT outperforms all compared baseline algorithms with respect to execution time and energy consumption by a significant margin. It reduces the size of deep neural networks by 90% to 98.9%. It is thus able to shorten execution time by 71.4% to 94.5%, and decrease energy consumption by 72.2% to 95.7%. These improvements are achieved without loss of accuracy. The results underscore the potential of DeepIoT for advancing the exploitation of deep neural networks on resource-constrained embedded devices.
