Background: CAPTCHA is a mechanism for distinguishing humans from bots. It has become a standard means of protecting resources on the World Wide Web from misuse. Several types of CAPTCHA have been implemented, but text-based schemes are the most widely used because of their ease of use and robustness. The user is asked to type the text shown in an image, which is intentionally distorted to defeat bots. Recognizing the text is easy for humans but very hard for computers. Method/Findings: In this work, a text-based CAPTCHA scheme with background clutter and partially connected characters is decoded. The main steps are preprocessing, segmentation, and recognition. Several digital image processing techniques were applied during the preprocessing and segmentation steps, and a convolutional neural network (CNN) was used for recognition. Because a CNN requires a large amount of training data, the data was generated synthetically. A complex text-based CAPTCHA scheme with a varying number of letters (3, 4, and 5) is decoded with an overall precision of 77.5%, 64.2%, and 51.9%, respectively.
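The abstract does not name the image processing techniques used for segmentation. As a rough illustration only, the sketch below assumes one common choice, a vertical projection profile, applied to an already binarized image. The function names, the `min_ink` threshold, and the toy image are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Assumption: the image is already binarized, with 1 = ink and 0 = background.
Image = List[List[int]]


def column_profile(image: Image) -> List[int]:
    """Count the ink pixels in every column of the image."""
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def segment_columns(image: Image, min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, whose columns
    each hold at least `min_ink` ink pixels."""
    segments: List[Tuple[int, int]] = []
    start = None
    profile = column_profile(image)
    for x, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = x                      # a character candidate begins
        elif ink < min_ink and start is not None:
            segments.append((start, x))    # the candidate ends before x
            start = None
    if start is not None:                  # candidate touching the right edge
        segments.append((start, len(profile)))
    return segments


def crop(image: Image, start: int, end: int) -> Image:
    """Cut one segment out of the image, ready to pass to a classifier."""
    return [row[start:end] for row in image]


# Toy example: two blobs joined by a one-pixel bridge in the middle column.
toy = [
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
]
print(segment_columns(toy, min_ink=1))  # the bridge keeps the blobs together
print(segment_columns(toy, min_ink=2))  # a higher threshold splits them
```

With `min_ink=1` the bridge column counts as ink and the two blobs stay in one segment; raising the threshold treats thin bridges as gaps, which is one simple way to separate partially connected characters. Each cropped segment would then be passed to the recognition step.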
Bots are created to use resources on the World Wide Web maliciously. Such misuse can be prevented by employing CAPTCHAs. Several types of CAPTCHA are used against bot (robot) attacks, but the text-based type is the most popular because it is secure and easy to use. Latin-script text CAPTCHAs are ubiquitous on the Internet, but English text-based CAPTCHAs have already been decoded by many researchers. Thus, a novel Sindhi-language text CAPTCHA, written in an Arabic-style script, was proposed for regional websites. This scheme offers a twofold benefit: first, it can easily be understood by a person of average literacy; second, it paves the way for developers of Arabic-style OCR to understand Sindhi-specific features and facilitates Sindhi text recognition in the future. A survey was also conducted to analyze the usability and strength of the proposed CAPTCHA.