AI falls short of human creativity in visual imagination tasks, study finds

A University of Barcelona study found generative AI cannot produce original visual ideas without human input. Left to work alone, it ranked last, well below untrained humans.

Categorized in: AI News, Science and Research
Published on: Mar 29, 2026

A new study from the University of Barcelona found that generative AI systems struggle to produce original visual ideas on their own, and that performance improves only when humans provide concrete creative direction. Researchers published their findings in the journal Advanced Science.

The team tested whether AI could generate genuinely original visual concepts, not just convincing imitations of existing styles. The answer was no.

How the Research Worked

Scientists presented abstract shapes to two groups of humans: 27 visual artists and 26 non-artists. Both groups were asked to imagine an image based on the shape, describe it, and draw it. A Stable Diffusion image-generation model completed the same task under two conditions: one with a real human idea embedded in the prompt ("human-guided"), and one with only a basic prompt ("self-guided").
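The two prompting conditions can be sketched as a simple prompt builder. The function, the prompt wording, and the example shape below are illustrative assumptions for this article, not the study's actual prompts.

```python
from typing import Optional

def build_prompt(shape_description: str, human_idea: Optional[str] = None) -> str:
    """Compose a text-to-image prompt for an abstract shape.

    human_idea: a participant's concrete interpretation of the shape
    (the "human-guided" condition); None yields the minimal
    "self-guided" prompt. Wording here is hypothetical.
    """
    base = f"An image inspired by an abstract shape: {shape_description}."
    if human_idea is None:
        # Self-guided: the model receives only the shape and a generic instruction.
        return base + " Create an original, creative image."
    # Human-guided: a real human idea is embedded in the prompt.
    return base + f" Depict this idea: {human_idea}."

# Hypothetical usage for one shape under both conditions:
self_guided = build_prompt("a single wavy line")
human_guided = build_prompt(
    "a single wavy line",
    human_idea="a coastline seen from a plane at dusk",
)
```

The only difference between conditions is whether a concrete human interpretation is appended, which is what lets the study isolate the contribution of the human idea.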

A panel of 255 human raters evaluated 1,000 images across five dimensions: liking, vividness, originality, aesthetics, and curiosity.
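With 255 raters scoring 1,000 images on five dimensions, the group comparison reduces to averaging scores per creator group within each dimension and ranking the groups. A minimal sketch with invented scores (the real data are not reproduced here; group labels are assumptions):

```python
from collections import defaultdict

# Hypothetical rating records: (creator_group, dimension, score).
# All numbers below are invented for illustration only.
ratings = [
    ("artist", "originality", 4.2),
    ("artist", "originality", 4.5),
    ("non_artist", "originality", 3.1),
    ("human_guided_ai", "originality", 3.0),
    ("self_guided_ai", "originality", 1.8),
    ("self_guided_ai", "originality", 2.0),
]

def mean_by_group(records, dimension):
    """Average rating per creator group for one dimension."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for group, dim, score in records:
        if dim == dimension:
            sums[group] += score
            counts[group] += 1
    return {g: sums[g] / counts[g] for g in sums}

means = mean_by_group(ratings, "originality")
# Rank groups by mean score, highest first:
ranking = sorted(means, key=means.get, reverse=True)
```

Repeating this per dimension yields the hierarchy the study reports: artists first, self-guided AI last.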

The hierarchy was consistent. Visual artists ranked highest. Non-artists came second. Human-guided AI scored roughly level with non-expert humans. Self-guided AI came last by a significant margin.

Xim Cerdá-Company, a researcher at IDIBELL and the CVC-UAB and co-leader of the study, said: "Although the AI model was trained with the creative productions of human participants, it showed a poor performance in the production of creative images. In fact, it did even worse when it was deprived of human assistance."

Why Visual Tasks Reveal What Verbal Tests Miss

Most prior research on AI creativity has focused on language tasks: asking systems to generate unusual uses for objects or complete open-ended prompts. Those tests reward speed, volume, and the ability to combine distant concepts, areas where AI has a computational edge. The conclusion from that body of work has been that AI is already creative.

Visual imagination tasks strip those advantages away. Abstract shapes carry no semantic content and suggest no obvious interpretation. A human seeing a wavy line might connect it to a coastline, a memory, a feeling, or a half-remembered dream. The AI, given the same shape and minimal instruction, had almost nothing to work with.

Antoni Rodríguez-Fornells, co-leader of the study and head of the Cognition and Brain Plasticity research group at the University of Barcelona, said: "Current generative AI models are still far from replicating independent creative processes."

The improvement from adding human input was substantial. When researchers embedded a single concrete idea from a human participant into the prompt, the model's output jumped to the level of an untrained person. But that improvement came from the human idea, not from the model's own capability.

Can AI Judge Creativity?

The study also tested how well AI could evaluate creativity. Researchers had GPT-4o rate the same images under two conditions: one mirroring the human task, and one that included reference examples of human ratings.

Without reference points, GPT-4o rated AI-generated images at roughly the same level as human images, whereas human raters distinguished sharply between the categories. When reference examples were added, GPT-4o's scores shifted closer to human patterns, but with notably wider variance.

The strongest agreement between human and AI raters occurred on perceptually direct measures: vividness, liking, and aesthetics. The weakest agreement emerged around originality and curiosity, qualities that depend on context, expectation, and cultural knowledge.
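Agreement between two sets of raters is commonly quantified as a correlation between their per-image scores. A minimal Pearson-correlation sketch with invented scores (the paper's exact agreement metric is not specified here):

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented per-image scores for one dimension, human panel vs. AI rater:
human_scores = [4.0, 3.5, 2.0, 4.5, 1.5]
ai_scores = [3.8, 3.6, 2.2, 4.4, 1.9]
agreement = pearson(human_scores, ai_scores)  # near 1.0 means strong agreement
```

Computed per dimension, a high value on vividness and a low value on originality would reproduce the pattern the study describes.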

What This Means for Practitioners

The common assumption that AI can serve as an autonomous creative partner may be premature. The study demonstrates that AI performs well when given structured human input and poorly when that input is removed.

For designers, artists, marketers, or educators using these tools, the distinction matters. The quality of what AI produces is not a fixed property of the model. It scales directly with the quality and specificity of human involvement.

The model is not an engine of ideas. It is closer to a sophisticated executor of them.

Researchers acknowledged limitations. The study tested one class of image-generation model-Stable Diffusion-and not the newer multimodal systems currently generating the most public attention. Those models could not be tested under the same controlled conditions.
