AI can beat average creativity - but the best ideas are still human
A massive study of more than 100,000 people compared today's top language models with human creativity. On several tests, models like GPT-4 outperformed the average participant. That's a shift worth paying attention to.
The ceiling is clear though. The most creative humans - especially the top 10% - outpaced every AI on richer tasks like poetry, stories, and plot development. Baseline ideation is getting automated. Taste, voice, and meaning remain your edge.
What the researchers tested
The team, led from Université de Montréal with contributors from Mila, Concordia, University of Toronto Mississauga, and Google DeepMind, used the Divergent Association Task (DAT), a common measure in creativity research. It asks participants to name ten words that are as unrelated to each other as possible - a fast way to gauge divergent thinking. Example: "galaxy, fork, freedom, algae, harmonica, quantum, nostalgia, velvet, hurricane, photosynthesis."
They didn't stop at word lists. The study also scored haiku, movie plot summaries, and short stories. The pattern held: some models beat the average participant, but skilled human writers were consistently stronger and more original.
Why this matters for creatives
AI is now competent at first-pass ideation. You can get broader option sets, faster. That means you can spend more time on selection, synthesis, and concept - where the work actually stands out.
The upside: less staring at a blank page. The challenge: differentiating your taste from the same models everyone else uses.
Practical ways to use this today
- Kick off briefs with a DAT-style word list to widen your concept space. Then combine, remix, or discard.
- Use AI for volume: 20 hooks, 15 plot angles, 10 metaphors - then curate hard.
- Switch perspectives: ask for ideas from contrasting archetypes or opposing values to avoid sameness.
- Iterate constraints: shorter, weirder, stricter rules often produce fresher angles.
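If you want to sanity-check how "wide" a word list actually is, the DAT's own metric is simple: average pairwise semantic distance. Here is a minimal sketch of that scoring idea. Note the tiny hand-made vectors are stand-ins for illustration only; the published DAT uses pretrained GloVe embeddings, which this toy version only approximates.

```python
from itertools import combinations
import math

# Toy word vectors for illustration only; a real DAT scorer would load
# pretrained embeddings (the published task uses GloVe).
TOY_VECTORS = {
    "galaxy": [0.9, 0.1, 0.0],
    "fork":   [0.0, 0.8, 0.3],
    "velvet": [0.2, 0.3, 0.9],
    "algae":  [0.5, 0.7, 0.1],
}

def cosine_distance(a, b):
    """1 - cosine similarity: 0 for identical directions, larger = more unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norm

def dat_score(words, vectors):
    """Average pairwise cosine distance over all word pairs, scaled to ~0-100
    in the spirit of the published DAT score."""
    pairs = list(combinations(words, 2))
    avg = sum(cosine_distance(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)
    return avg * 100

print(round(dat_score(list(TOY_VECTORS), TOY_VECTORS), 1))
```

A higher score means the words sit further apart in meaning - useful as a quick check on whether an AI-generated list is genuinely widening your concept space or just reshuffling near-synonyms.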
Dial AI creativity with settings and prompts
- Temperature: lower (~0.2-0.4) for safer drafts; higher (~0.9-1.2) for more surprising options.
- Etymology trick: ask the model to consider word origins and structure before generating options. This boosts unexpected associations.
- Prompt frame: "List 10 unrelated words, then explain the distance between each pair." The explanation step reduces shallow randomness.
- Multi-pass loop: generate → shortlist → combine → escalate constraints → refine voice.
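The multi-pass loop above can be sketched as a small pipeline. The `generate` function here is a placeholder for whatever model call you actually use (where the `temperature` setting would apply); the structure - wide generation, ruthless shortlisting, recombination, escalating constraints, low-temperature refinement - is the point.

```python
def generate(prompt, temperature, n):
    """Placeholder for a real model call. It fabricates numbered stand-in
    ideas so the loop structure is runnable end to end."""
    return [f"{prompt} (idea {i}, temp {temperature})" for i in range(n)]

def multi_pass(brief, score):
    # Pass 1: generate wide at high temperature for surprising options.
    ideas = generate(brief, temperature=1.1, n=20)
    # Pass 2: shortlist hard against your own scoring function.
    shortlist = sorted(ideas, key=score, reverse=True)[:5]
    # Pass 3: combine adjacent survivors into hybrid prompts.
    combos = [f"Combine: {a} + {b}" for a, b in zip(shortlist, shortlist[1:])]
    # Pass 4: escalate constraints, then refine at low temperature.
    constrained = [f"{c} -- shorter, stranger, stricter" for c in combos]
    return generate(constrained[0], temperature=0.3, n=3)

# `score` is whatever curation heuristic you trust; len() is a dummy here.
final = multi_pass("hooks for a product launch", score=len)
```

Swap in a real model call and a real scoring function; the loop itself is what keeps you from settling for pass-one output.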
Where humans still win
- Voice that feels lived-in, not generic.
- Conceptual leaps across domains that models rarely connect with intent.
- Emotional accuracy: what to leave unsaid, how to pace, when to break a rule.
Quality checks for better outputs
- Define "original" up front: novelty, usefulness, and fit with the brief.
- Score variations against your criteria; don't rely on vibes alone.
- Mix AI-generated sparks with your own references and observations.
- Run a final pass for clichés, tone drift, and over-explanation.
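Scoring against criteria instead of vibes can be as simple as a weighted rubric. The weights and the 1-5 ratings below are hypothetical - define your own to match the brief, per the first check above.

```python
# Hypothetical rubric: the criteria and weights are yours to define up front.
WEIGHTS = {"novelty": 0.4, "usefulness": 0.3, "fit": 0.3}

def rubric_score(ratings):
    """Weighted sum of 1-5 ratings, one per criterion. Refuses to score
    a candidate that skipped a criterion."""
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"unrated criteria: {missing}")
    return sum(WEIGHTS[c] * ratings[c] for c in WEIGHTS)

candidates = {
    "idea A": {"novelty": 5, "usefulness": 2, "fit": 3},
    "idea B": {"novelty": 3, "usefulness": 4, "fit": 4},
}
ranked = sorted(candidates, key=lambda k: rubric_score(candidates[k]), reverse=True)
```

Even a crude rubric like this forces the "define original up front" step and makes your curation repeatable across batches.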
Key details from the study
- Models tested included GPT-4, Claude, and Gemini, among others.
- More than 100,000 human participants were compared on the same tasks.
- Top-half humans beat every model on average; the top 10% widened the gap further, especially on writing.
- Creativity can be adjusted via technical settings and instruction design, not just model choice.
Learn more
Read the open-access paper in Scientific Reports: Divergent creativity in humans and large language models. Explore the DAT here: datcreativity.org.
Want to train prompts that produce fresher ideas?
Sharpen your workflow with practical prompt frameworks and creative drills. Start here: Prompt Engineering resources.
The takeaway: let AI handle breadth; you own depth. Use models to stretch the option space, then apply your judgment to make the work hit.