Designing interfaces for text-to-image prompt engineering using stable diffusion models: a human-AI interaction approach
Seonuk Kim
Abstract
The use of generative artificial intelligence (AI) is more vital than ever for creating new content, especially images. Recent breakthroughs in text-to-image diffusion models have shown the potential to drastically change how we approach image content creation. However, artists still face challenges when attempting to create images that reflect their specific themes and formats, as current generative systems, such as Stable Diffusion models, require the right prompts to achieve the desired artistic outputs. In this paper, we propose future design considerations for developing more intuitive and effective interfaces for text-to-image prompt engineering, taking a human-AI interaction perspective and a data-driven approach. We collected 78,911 posts from an online community and analyzed them through thematic analysis. Our proposed directions for interface design can help improve user experience and usability, ultimately leading to a more effective image generation process that better matches creators' intentions.