In this paper, we present a content-aware method for generating word paintings. A word painting is a composite artwork assembled from words extracted from a given text, and it carries semantics and visual features similar to those of a given source image. Word paintings are usually created by skilled artists through tedious manual processes, especially when generating streamlines and laying out text. We therefore provide an easy way for users to create them. The key challenge in generating a visually pleasing word painting is to design a text layout that simultaneously conveys the input image and gives easy access to the semantic theme. To address this challenge, given an image and its content-related text, we first decompose the input image into several regions and approximate each region with a smooth vector field. At the same time, we analyze the input text and extract weighted keywords to serve as the graphic elements. Then, to measure how likely each position in the input image is to attract observers’ attention, we generate a saliency map with our trained visual attention model. Finally, jointly considering visual attention and aesthetic rules, we propose an energy-based optimization framework that arranges the extracted keywords into the decomposed regions and synthesizes a word painting. Experimental results and user studies show that the method generates fashionable and appealing word paintings.
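To make the final arrangement step concrete, the following minimal sketch illustrates the general idea of assigning weighted keywords to regions by minimizing an energy. It is not the optimization framework of this paper: the energy term (squared mismatch between a keyword's weight and a region's mean saliency), the function names `layout_energy` and `arrange_keywords`, the one-keyword-per-region constraint, and the toy values are all assumptions made for illustration, and the aesthetic rules and vector fields are omitted.

```python
from itertools import permutations


def layout_energy(assignment, weights, saliency):
    """Hypothetical energy: squared mismatch between each keyword's
    weight and the mean saliency of the region it is assigned to."""
    return sum((weights[k] - saliency[r]) ** 2 for k, r in assignment.items())


def arrange_keywords(weights, saliency):
    """Exhaustively search one-to-one keyword-to-region assignments and
    return the lowest-energy one with its energy.

    Assumes there are at least as many regions as keywords; otherwise
    no assignment exists and (None, inf) is returned.
    """
    keywords = list(weights)
    regions = list(saliency)
    best, best_energy = None, float("inf")
    for perm in permutations(regions, len(keywords)):
        candidate = dict(zip(keywords, perm))
        energy = layout_energy(candidate, weights, saliency)
        if energy < best_energy:
            best, best_energy = candidate, energy
    return best, best_energy


# Toy inputs (assumed values): keyword weights and per-region mean saliency.
weights = {"sea": 0.9, "boat": 0.5, "sky": 0.2}
saliency = {"r0": 0.1, "r1": 0.8, "r2": 0.6}
best, energy = arrange_keywords(weights, saliency)
print(best, energy)
```

With these toy inputs, the heaviest keyword lands in the most salient region. Exhaustive search is only practical for a handful of keywords; it stands in here for whatever solver an actual energy-based framework would use.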