Vocabulary learning has long been taught as a foundation of L2 English learning, and the use of visual cues is a widely recognized method for it. Several studies have applied automatic image caption generation, which combines computer vision and natural language processing, to English vocabulary learning. However, these vocabulary learning systems mainly use correct answers and their corresponding images. Meanwhile, English vocabulary learning suffers from fossilization, a problem in which errors that learners repeat become established and difficult to correct. Although images have been effective in traditional English vocabulary learning, little research has focused on learners' incorrect answers. In this research, we intentionally created situations in which learners are likely to give incorrect answers and constructed L-VEIGe (Learning-Vocabulary Error Image Generation), an English vocabulary learning support system that generates images in response to learners' incorrect answers to promote effective introspection and eliminate repeated errors. An evaluation experiment with graduate students who are second-language English learners showed that the proposed method prevents repeated errors more effectively than a method without image generation.