Research on Deep Learning Techniques in Breaking Text-Based Captchas and Designing Image-Based Captcha

Tang, Mengyun; Gao, Haichang; Zhang, Yang; Liu, Yi; Zhang, Ping; Wang, Ping

doi:10.1109/tifs.2018.2821096

Cited by 85 publications

(51 citation statements)

References 38 publications

“…The results of our approach compared to those of the other three works [3], [17], [18] are similar: most of their success rates are notably lower than ours when attacking the same scheme. The comparisons are acceptable because all these works use the image-processing based method to attack the targeted schemes.…”

Section: A Comparison With Prior Work (supporting)

confidence: 62%

“…However, their attack required clean individual characters, which also increased the difficulty. In 2018, [3] proposed a deep-learning-based three-step attack. To obtain an efficient model, they used 2,400 manually labeled samples from each scheme for training.…”

Section: Background a Prior Attacks (mentioning)

confidence: 99%

“…Moreover, most Chinese characters consist of integrated Chinese radicals, which also makes them more difficult to recognize than Roman characters [5]. In fact, some prior works have analyzed the significance of and worked partly with Chinese schemes [3], [5], [22]. To investigate the potential of our attack, we also evaluated our transfer-learning-based attack on five Chinese CAPTCHA schemes deployed by five well-known Chinese commercial websites.…”

Section: ) Chinese Schemes (mentioning)

confidence: 99%

“…In contrast, our approach uses only irrelevant TABLE 5. Comparison between our approach and five prior works [3], [9], [17], [18], [24]. RI = Results by imitated data; RR = Results by randomly generated data.…”

Section: A Comparison With Prior Work (mentioning)

confidence: 99%

Simple and Easy: Transfer Learning-Based Attacks to Text CAPTCHA

et al. 2020

Self Cite

CAPTCHA, or Completely Automated Public Turing Test to Tell Computers and Humans Apart, is a common mechanism used to protect commercial accounts from malicious computer bots, and the most widely used scheme is text-based CAPTCHA. In recent years, newly emerged deep learning techniques have achieved high accuracy and speed in attacking text-based CAPTCHAs. However, most existing attacks have disadvantages: either the attack process is highly complex, or manually collecting and labeling a large number of samples to train a deep learning recognition model is time-consuming and expensive. In this paper, we propose a transfer learning-based approach that greatly reduces the attack complexity and the cost of labeling samples, specifically by pre-training the model with randomly generated samples and fine-tuning the pre-trained model with a small number of real-world samples. To evaluate our attack, we tested 25 online CAPTCHAs, achieving success rates ranging from 36.3% to 96.9%. To further explore the effect of the training sample characteristics on the attack accuracy, we carefully imitate some samples and apply a generative adversarial network to refine them; we then use each of these two kinds of generated samples to pre-train a model. The experimental results demonstrate that the similarity between randomly generated samples and elaborately imitated samples has a negligible impact on the attack accuracy. Instead, transfer learning is the key factor; it reduces the cost of data preparation while preserving the model's attack accuracy. Index terms: CAPTCHA, security, deep learning, transfer learning.
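The pre-train then fine-tune idea in this abstract can be illustrated with a toy sketch. This is not the authors' model: a logistic regression trained by gradient descent stands in for the deep recognition network, and `make_samples`, the two-feature data, and the `shift` parameter are hypothetical placeholders for generated and real CAPTCHA samples.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(w, X, y):
    """Mean logistic loss of weights w on samples X with 0/1 labels y."""
    p = np.clip(sigmoid(X @ w), 1e-12, 1 - 1e-12)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def train(w, X, y, lr=0.1, steps=200):
    """Full-batch gradient descent on the logistic loss, starting from w."""
    w = w.copy()
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

def make_samples(rng, n, shift):
    """Hypothetical stand-in for CAPTCHA data: two features plus a bias column.
    The label is 1 when x0 + x1 > shift, so `shift` models the gap between
    randomly generated samples and real-world ones."""
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] + X[:, 1] > shift).astype(float)
    return np.hstack([X, np.ones((n, 1))]), y

rng = np.random.default_rng(0)
X_syn, y_syn = make_samples(rng, 2000, shift=0.0)  # cheap generated samples
X_real, y_real = make_samples(rng, 20, shift=0.5)  # few labeled real samples

w_pre = train(np.zeros(3), X_syn, y_syn)           # pre-training
w_fine = train(w_pre, X_real, y_real, steps=50)    # fine-tuning from w_pre

print("loss on real samples before fine-tuning:", log_loss(w_pre, X_real, y_real))
print("loss on real samples after fine-tuning: ", log_loss(w_fine, X_real, y_real))
```

The point of the sketch is the second `train` call: it starts from the pre-trained weights rather than from zeros, so only a small labeled set is needed to adapt the model.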

“…A large number of deployed short-text CAPTCHA schemes can be broken, such as those of Google, Yahoo!, and Microsoft. CAPTCHA designers usually learn from previous failures in order to design CAPTCHAs with improved security and usability [4].…”

Section: Introduction (unclassified)

Analisis Robustness Teks Captcha Paypal HIP Menggunakan Template Matching

Humaira, Musri, Sarimuddin et al. 2018

JAIC Polibatam

CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. CAPTCHAs are used to ensure that the operator is a human, not a robot. The basic idea behind processing a CAPTCHA is segmentation and recognition. Random characters, graphic images, or audio CAPTCHAs are possible solutions for improving the security and resilience of protection systems. This paper uses random-character CAPTCHAs. However, text CAPTCHAs need to be re-examined to determine whether a computer can still solve them, so that they can be analyzed, improved, and developed to resist automated attacks. The data set consists of PayPal text CAPTCHAs, also called PayPal HIP, with 20 training images used to obtain up to 36 template images, covering the digits 0-9 and the letters A-Z. This particular PayPal HIP data is limited in that it does not use the digits 0 and 1 or the letters O and Q, because of the similarity between them. The method consists of pre-processing, segmentation, and classification. Pre-processing removes noise by thresholding and cleaning techniques. Segmentation uses bounding boxes and padding. Classification uses pixel counting, vertical projections, horizontal projections, and template correlation. Comparing these methods shows which one recognizes the CAPTCHA text most accurately, which in turn indicates the robustness of the CAPTCHA text.
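The stages named in this abstract (thresholding, bounding box and padding, template correlation) can be sketched roughly as follows. This is a minimal illustration under assumptions, not the paper's implementation: all function names, the threshold of 128, and the fixed 5x5 glyph size are hypothetical, and column projection is used here to separate characters, whereas the abstract lists projections among the classification features.

```python
import numpy as np

def binarize(gray, threshold=128):
    """Thresholding: pixels darker than the threshold become ink (1)."""
    return (np.asarray(gray) < threshold).astype(np.uint8)

def segment_by_vertical_projection(binary):
    """Return (start, end) column ranges whose column sums contain ink."""
    has_ink = binary.sum(axis=0) > 0
    segments, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            segments.append((start, x))
            start = None
    if start is not None:
        segments.append((start, len(has_ink)))
    return segments

def crop_to_bounding_box(binary):
    """Crop a glyph to the smallest box containing all of its ink."""
    rows = np.flatnonzero(binary.sum(axis=1))
    cols = np.flatnonzero(binary.sum(axis=0))
    return binary[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

def pad_to(glyph, shape):
    """Place the glyph in the top-left corner of a zero canvas of `shape`."""
    out = np.zeros(shape, dtype=np.uint8)
    h, w = min(glyph.shape[0], shape[0]), min(glyph.shape[1], shape[1])
    out[:h, :w] = glyph[:h, :w]
    return out

def correlation(a, b):
    """Normalized (Pearson) correlation between two same-sized glyphs."""
    a = a.astype(float) - a.mean()
    b = b.astype(float) - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom else 0.0

def recognize(gray, templates, shape=(5, 5)):
    """Segment the image and label each glyph with its best-matching template."""
    binary = binarize(gray)
    text = []
    for x0, x1 in segment_by_vertical_projection(binary):
        glyph = pad_to(crop_to_bounding_box(binary[:, x0:x1]), shape)
        text.append(max(templates, key=lambda ch: correlation(glyph, templates[ch])))
    return "".join(text)
```

To use it, build `templates` as a dictionary mapping each character to a binary array already padded to `shape` with `pad_to`, then call `recognize` on a grayscale image array. Real CAPTCHAs with touching or overlapping characters would defeat the simple column-gap segmentation used here.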

An ensemble method for feature selection and an integrated approach for mitigation of distributed denial of service attacks

Chanu, Singh, Chanu 2022

Concurrency and Computation

Distributed denial of service (DDoS) attacks penetrate numerous computer systems and implant malicious code, thereby readying them to launch a collaborative attack. These attacks paralyze the target system, mainly the web server, by exhausting its network resources. The threats posed by DDoS attacks on the Internet demand effective methods for detecting and mitigating these attacks. In this paper, we propose an integrated method for detection and mitigation of DDoS attacks using machine learning and lines of defense, respectively. The detection phase consists of feature selection through an ensemble feature selection algorithm and classification using a machine learning algorithm. Feature selection algorithms are important because they reduce the dimension of the dataset. Selecting an efficient classification model improves the detection rate of the proposed system. In the mitigation phase, we introduce two lines of defense to minimize the exhaustion of the victim server's resources. Using an existing dataset, we show experimentally that it is possible to detect the presence of attacks and mitigate them to a minimum level. The proposed integrated method yields an accuracy of 97.8% in detecting the attacks and is able to reduce the utilization of processors up to an average of 25.95%.
