AI Creativity

Oh, this is funny. First, the (AI-curated) report:

AI Surpasses Average Human Scores on Creativity Tests

Researchers found that advanced AI models now outperform the average human on standardized creativity assessments. While impressive, humans at the highest levels of creativity still exceed AI performance.

Key Points:

  • AI models scored higher than human averages on the Divergent Association Task, which measures originality through semantic distance
  • Tests with 100,000 participants confirmed AI’s ability to generate novel word associations, though top human performers remained ahead
  • Additional evaluations with haikus, movie plots, and flash fiction showed AI producing creative work, but human samples were rated more original overall

Why It Matters: This milestone suggests AI is expanding beyond analytical tasks into creative domains, challenging assumptions about human exclusivity. At the same time, the findings highlight AI’s role as a potential collaborator rather than a replacement for human creativity.

Know why that is funny?

Well, per this site, the Divergent Association Task is basically “come up with 10 nouns that aren’t related in any way.”
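
For reference, the published scoring (as I understand it, so treat the details as approximate) looks up a word embedding $\mathbf{v}_i$ for each of the first seven valid nouns (GloVe vectors, in the original paper) and reports the average pairwise cosine distance, scaled by 100:

$$\mathrm{DAT} = \frac{100}{\binom{7}{2}} \sum_{1 \le i < j \le 7} \left(1 - \frac{\mathbf{v}_i \cdot \mathbf{v}_j}{\lVert \mathbf{v}_i \rVert \, \lVert \mathbf{v}_j \rVert}\right)$$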

Ignore for now that it’s a little funny that reverse word association is considered a slam-dunk measure of creativity.

Instead, recall that LLMs are trained billions of times over… and over… to “know” the associations between words. The matrices of weights that make up their fundamental “programming” encode exactly the measures of “distance between words” (as well as between groups of words, and even pieces of words) that this task grades on. With those weights (and an understanding of the tokens they link together), a human programmer could, after selecting a random starting word, maximize that semantic distance and ace this task every time.
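
Here’s a minimal sketch of that idea in Python. Everything in it is illustrative: the vocabulary, the random stand-in vectors, and the `embed` table are hypothetical, not any model’s real weights. Swap in a genuine pretrained embedding lookup (GloVe, or an LLM’s own token embeddings) and the same greedy loop would pick genuinely unrelated nouns.

```python
# Sketch: "ace the DAT" by greedy farthest-point word selection.
# The vocabulary and random vectors are illustrative stand-ins only.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["arm", "hand", "head", "leg", "lamp", "bicycle", "cheese",
         "glacier", "tuba", "umbrella", "justice", "yeast", "comet"]
# Hypothetical embedding table; replace with real pretrained vectors.
embed = {w: rng.normal(size=50) for w in vocab}

def cosine_distance(u, v):
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def pick_distant_words(seed, k=10):
    """Greedily add whichever word is farthest (measured by its nearest
    already-chosen neighbor) from everything selected so far."""
    chosen = [seed]
    while len(chosen) < k:
        best = max(
            (w for w in vocab if w not in chosen),
            key=lambda w: min(cosine_distance(embed[w], embed[c]) for c in chosen),
        )
        chosen.append(best)
    return chosen

print(pick_distant_words("cheese"))
```

The greedy “farthest-point” rule is a standard heuristic for maximizing average pairwise distance, which is exactly the quantity the DAT rewards. The point is only this: the distance metric the task grades on is sitting right there in the model’s own parameters.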

Therefore, every LLM, even the most basic, should absolutely blow away all but the most creative (or at least highest-vocabulary) English-speaking humans on the planet.

How do they do? Slightly better than most people. (My guess is that, compared with people whose vocabularies come close to the breadth of their training data, they underperform pretty significantly.)

They have all of the exact data they would need to ace this exact task RIGHT THERE, ready to go; it’s practically part of their DNA. But they lack the ability to perform the single piece of inverse logic that would let them, with no additional training or information, ace the task at a level most humans couldn’t hope to match.

Because they can’t actually understand.