Proceedings of the 25th International Academic Mindtrek Conference 2022
DOI: 10.1145/3569219.3569352

The Creativity of Text-to-Image Generation

Abstract: Humankind is entering a novel creative era in which anybody can synthesize digital information using generative artificial intelligence (AI). Text-to-image generation, in particular, has become vastly popular and millions of practitioners produce AI-generated images and AI art online. This chapter first gives an overview of the key developments that enabled a healthy co-creative online ecosystem around text-to-image generation to rapidly emerge, followed by a high-level description of key elements in this ecos…

Cited by 133 publications
(49 citation statements)
References 60 publications
“…In some respect, prompt engineering can be seen as a way to program a generative language model through natural language [53]. A popular way of guiding model output through prompt engineering is for text-to-image generative models [37,48]. This 'zero-shot' or 'few-shot' approach to tasks with generative language models has also achieved state-of-the-art results on several natural language tasks [8].…”
Section: Prompt Engineering For Transformer-based Generative Language... (mentioning)
confidence: 99%
“…We contend that affect conditioning is a valuable direction for future generative models, particularly in human-AI co-creative contexts. Prompt engineering is hard and time consuming, because many creators do not know exactly what they are looking for [9], and even if they did the model may not interpret their words as they intend [13]. We propose, however, that many artists and designers have a sense of the "vibe" they desire in their finished product, and affect conditioning gives a way for them to directly target that.…”
Section: Discussion (mentioning)
confidence: 97%
“…In creativity support applications such as art (where it may be desirable that an image evoke a particular mood or emotion [11]), or design (where diverse solutions must be explored [12]), this significantly damages the utility of existing generative models. An ethnographic study of an online community that sprung up around a popular text-to-image generator supports this: the intent-output mismatch was found to be a significant practical challenge, so much so that the time-intensive process of collaborative prompt engineering became a major focus for the community [13].…”
Section: Introduction (mentioning)
confidence: 90%
“…Concretely, we split each prompt by commas. The first part is regarded as the subject according to [27,33], while the rest are treated as prompt modifiers. We standardize the format of some modifiers, e.g., "8 k" to "8k," "3 d" to "3d."…”
Section: Data Collection (mentioning)
confidence: 99%
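
The preprocessing described in the citation statement above (split on commas, treat the first part as the subject and the rest as prompt modifiers, then standardize forms such as "8 k" to "8k") can be sketched roughly as follows. This is a minimal illustration, not the citing authors' code: the function names `split_prompt` and `normalize_modifier` are hypothetical, and the regular expression covers only the two patterns quoted in the statement.

```python
import re
from typing import List, Tuple

# Assumption: only the "<digits> k" and "<digits> d" patterns quoted in the
# statement are standardized; the citing paper may normalize more forms.
_DIGIT_GAP = re.compile(r"\b(\d+)\s+([kd])\b", re.IGNORECASE)


def normalize_modifier(modifier: str) -> str:
    """Collapse forms such as '8 k' or '3 d' into '8k' or '3d'."""
    return _DIGIT_GAP.sub(lambda m: m.group(1) + m.group(2).lower(), modifier)


def split_prompt(prompt: str) -> Tuple[str, List[str]]:
    """Split a prompt by commas into (subject, prompt modifiers)."""
    # Strip whitespace around each part and drop empty parts.
    parts = [p.strip() for p in prompt.split(",")]
    parts = [p for p in parts if p]
    if not parts:
        return "", []
    # The first part is taken as the subject, the rest as modifiers.
    subject, modifiers = parts[0], parts[1:]
    return subject, [normalize_modifier(m) for m in modifiers]


if __name__ == "__main__":
    subject, modifiers = split_prompt(
        "a castle on a hill, 8 k, 3 d render, trending on artstation"
    )
    print(subject)
    print(modifiers)
```

Only the modifiers are normalized here; whether the subject should be standardized in the same way is not stated in the excerpt.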
“…Rather than relying on professional artists, text-to-image generation models empower anyone to produce digital images, such as photorealistic images and commercial drawings, by entering text descriptions called prompts. According to [18,27,33], a high-quality prompt that leads to a high-quality image should consist of a subject and several prompt modifiers. The subject is a natural language description of the image; the prompt modifiers are keywords or key phrases that are related to specific elements or styles of the image.…”
Section: Introduction (mentioning)
confidence: 99%