Documents are now being digitized in all fields to reduce paper usage. The availability of modern technology in the form of scanners and cameras supports the growth of multimedia data, especially documents stored as image files. Searching for a particular text in large-scale scanned document images is difficult when the text has not been extracted from the images. In this research, a text extraction method for large-scale scanned document images using Google Vision OCR on the Hadoop architecture is proposed. The objects of research are student thesis documents, comprising the cover page, the approval page, and the abstract. All documents are stored in the university's digital library. The extraction process begins with preparing an input folder that contains the image documents (in JPEG format) in the Hadoop Distributed File System (HDFS) of Apache Hadoop, followed by reading each image document. The image document is then processed using Google Vision OCR to obtain a text document (in TXT format), and the result is saved to an output folder in HDFS. The same process is repeated for all documents in the folder. Test results show that the proposed method was able to extract all test documents successfully. The recognition process achieved 100% accuracy, and extraction was twice as fast as manual extraction. Google Vision OCR also showed better extraction performance than other OCR tools. The proposed automated extraction system can recognize text in large-scale image documents accurately and can be operated in a real-time environment.
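The extraction loop described in this abstract (read each JPEG from an input folder, run OCR, write a TXT file to an output folder) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: a local directory stands in for the HDFS folders, and the OCR step is passed in as a function so that a real Google Vision OCR call could be substituted. The names `extract_folder` and `fake_ocr` are hypothetical.

```python
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import Callable


def extract_folder(input_dir: Path, output_dir: Path,
                   ocr: Callable[[bytes], str]) -> list[Path]:
    """Run `ocr` on every JPEG in input_dir and save one TXT file per image.

    Assumption: local folders stand in for the HDFS input and output folders.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # Sorted so the processing order is deterministic.
    for image_path in sorted(input_dir.iterdir()):
        if image_path.suffix.lower() not in {".jpg", ".jpeg"}:
            continue  # skip anything that is not a JPEG image
        text = ocr(image_path.read_bytes())
        text_path = output_dir / (image_path.stem + ".txt")
        text_path.write_text(text, encoding="utf-8")
        written.append(text_path)
    return written


def fake_ocr(image_bytes: bytes) -> str:
    """Hypothetical placeholder for the OCR service; only reports input size."""
    return f"recognized {len(image_bytes)} bytes"


if __name__ == "__main__":
    with TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "input").mkdir()
        (root / "input" / "cover.jpg").write_bytes(b"not a real image")
        for path in extract_folder(root / "input", root / "output", fake_ocr):
            print(path.name, "->", path.read_text(encoding="utf-8"))
```

Passing the OCR step as a parameter keeps the folder traversal independent of the recognition service, which also makes the loop easy to exercise without network access.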
The development of information and communication technology has a significant influence on the development of learning media and methods. E-learning is a new way of teaching and learning that uses information and communication technology as a learning system. Interactive multimedia, more popularly known as multimedia instructional, is one kind of learning media that can be developed and implemented through an e-learning system. This study aims to analyze the strategies for developing e-learning-based multimedia instructional at the Islamic boarding university. The study used a qualitative method for data collection and analysis. Data were collected through observation and interviews with several stakeholders and lecturers, who were asked about strategies for developing e-learning-based multimedia instructional at the Islamic boarding university. The results showed that the strategy for developing e-learning-based multimedia instructional at the Islamic boarding university was carried out through five processes: analysis, design, development, implementation, and evaluation. The contribution of this study is a strategy for developing e-learning-based multimedia instructional creatively and innovatively at the Islamic boarding university.