Demand for recognizing irregular text in natural scenes is growing, driven by applications such as license plate recognition, image search, handwriting recognition, and autonomous driving. Recent studies show that recognizing curved and perspective-distorted text remains an important challenge in text recognition, and that correcting curved text is a key step toward accurate recognition. However, current methods rely on constrained text image correction, which leads to poor accuracy on curved text. We therefore propose an end-to-end framework, Scene Text Recognizer with Appearance-Flow rectification (SterAF), which consists of a correction network and a recognition network. First, an appearance-flow-based correction network adaptively warps the input text image while avoiding irregular and unnatural deformations. Second, a sequence-to-sequence recognition network predicts the character sequence in the corrected image. SterAF performs well in both qualitative and quantitative experiments.
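
To make the idea of appearance-flow warping concrete, the following is a minimal sketch and not the SterAF implementation. It assumes the flow is a per-pixel offset field `(dx, dy)` that tells each output pixel where to sample in the input image, with bilinear interpolation and border clamping. In the framework described above a network would predict this flow; here it is hand-crafted for a toy "curved stroke". The names `bilinear_sample`, `warp_with_flow`, `curve`, `curved`, and `flow` are hypothetical and chosen for illustration only.

```python
import math


def bilinear_sample(image, x, y):
    """Sample a 2D image (list of rows) at real-valued (x, y), clamped to the border."""
    h, w = len(image), len(image[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * image[y0][x0] + ax * image[y0][x1]
    bottom = (1 - ax) * image[y1][x0] + ax * image[y1][x1]
    return (1 - ay) * top + ay * bottom


def warp_with_flow(image, flow):
    """Warp by sampling: output[y][x] = image at (x + dx, y + dy), with (dx, dy) = flow[y][x]."""
    h, w = len(image), len(image[0])
    return [
        [bilinear_sample(image, x + flow[y][x][0], y + flow[y][x][1]) for x in range(w)]
        for y in range(h)
    ]


# Toy input (assumption): a stroke whose row position bends across the columns.
curve = [1, 0, -1, 0, 1]  # vertical displacement of the stroke in each column
height, width, center = 5, 5, 2
curved = [[0.0] * width for _ in range(height)]
for x in range(width):
    curved[center + curve[x]][x] = 1.0

# Hand-crafted flow (assumption): each output pixel looks up the displaced source row.
flow = [[(0.0, float(curve[x])) for x in range(width)] for _ in range(height)]

rectified = warp_with_flow(curved, flow)
for row in rectified:
    print(row)
```

In this toy case the warped output contains the stroke on a single straight row. Because the offsets are defined independently per pixel, the same sampling step can express deformations that a single global transformation cannot, which is the flexibility the appearance-flow formulation is meant to provide.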