May 25, 2026

Random? More like drama by numbers

GPT Guesses Between 1 and 100

AI tried to be random and the comments instantly turned it into a roast

TLDR: GPT-4.1 was asked 10,000 times to pick a number from 1 to 100, and it kept favoring human-style favorites like 37 and 42 instead of spreading its choices evenly. Commenters split three ways: "well, obviously," jokes about 37, and a surprisingly serious idea that these quirks could expose which AI model is being used.

A simple little experiment asked GPT-4.1 10,000 times to pick a number between 1 and 100 and, shocker, the bot did not behave like a fair lottery machine. Instead, it acted a lot like us: obsessed with "random-feeling" numbers, weirdly drawn to 37, 42, and 73, and apparently allergic to neat round numbers. In the funniest stat of the bunch, every multiple of 10 except 10 itself got picked zero times, and 10 showed up exactly once. So yes, the machine may be artificial, but the vibes are deeply human.

And the real fireworks? The comment section. One camp basically shrugged and said: of course a text-trained bot copies human quirks. The driest mic-drop came from one commenter who boiled the whole thing down to, "breaking: a language model trained on human-written stuff is not uniform." Ouch. Another reader didn’t even make it to the results, claiming the write-up itself felt "obviously AI generated" — which is exactly the kind of meta chaos this story was destined to attract.

Still, the thread wasn’t all eye-rolls. People came armed with jokes, memes, and side quests. There was a cheeky 37signals gag, a detour into Benford’s law, and one genuinely spicy idea: maybe these number-picking quirks could become a fingerprint for identifying what model is hiding behind a chatbot. So the headline may be "AI can’t do random," but the real plot twist is the crowd splitting between "duh," "lol," and "wait, this could actually matter."

Key Points

  • The project tested gpt-4.1 by asking it 10,000 times to output one integer between 1 and 100 and comparing the distribution with a uniform baseline.
  • The experiment used the OpenAI Responses API with temperature set to 1.0 and a fixed prompt, then validated outputs through a collect-clean-transform-stats pipeline.
  • The article reports that gpt-4.1 strongly deviated from a uniform distribution, with a chi-square result of χ² = 15,604 for N = 10,000 and df = 99, and p approximately 0.
  • The model reproduced several well-known human-favored numbers, including 37 and 42 at 4.0 times the uniform rate and 73 at 3.4 times the uniform rate.
  • The article says the model avoided round numbers even more strongly than humans: all multiples of 10 except 10 were picked zero times, and 10 appeared once.
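
For the curious, the uniformity check described in the Key Points can be sketched in a few lines of Python. This is a minimal sketch, not the project's actual pipeline: the function names (`clean`, `chi_square_uniform`, `rate_vs_uniform`) and the toy sample are assumptions made up for illustration. With 10,000 picks spread over 100 numbers, a fair picker would land on each number about 100 times, so "4.0 times the uniform rate" works out to roughly 400 picks.

```python
import re
from collections import Counter


def clean(raw_outputs, low=1, high=100):
    """Keep only replies that are a single integer in [low, high]."""
    values = []
    for text in raw_outputs:
        token = text.strip()
        # Accept plain ASCII digits only; anything else is discarded.
        if re.fullmatch(r"[0-9]+", token):
            value = int(token)
            if low <= value <= high:
                values.append(value)
    return values


def chi_square_uniform(values, low=1, high=100):
    """Chi-square goodness-of-fit statistic against a uniform baseline.

    Returns (statistic, degrees_of_freedom).
    """
    counts = Counter(values)
    k = high - low + 1              # number of possible outcomes
    expected = len(values) / k      # same expected count for every number
    stat = sum(
        (counts.get(n, 0) - expected) ** 2 / expected
        for n in range(low, high + 1)
    )
    return stat, k - 1


def rate_vs_uniform(values, number, low=1, high=100):
    """How many times more often `number` appears than a uniform picker would pick it."""
    expected = len(values) / (high - low + 1)
    return Counter(values)[number] / expected


if __name__ == "__main__":
    # Toy sample, NOT the experiment's data: 100 valid picks plus 3 junk replies.
    raw = ["37"] * 40 + ["42"] * 40 + [" 73\n"] * 20 + ["ten", "101", "0"]
    picks = clean(raw)
    stat, df = chi_square_uniform(picks)
    print(f"N={len(picks)} chi2={stat:.1f} df={df}")
    print(f"37 is picked at {rate_vs_uniform(picks, 37):.1f}x the uniform rate")
```

A large statistic relative to the degrees of freedom means the picks are far from uniform. Turning the statistic into a p-value would need a chi-square distribution function from a statistics library, which this sketch leaves out.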

Hottest takes

"37signals was pretty random" — alentodorov
"obviously AI generated" — gruez
"does not follow a uniform distribution" — penr0se