May 18, 2026

Pelicans, pets, and pure AI chaos

The last six months in LLMs in five minutes

The top AI model crown changed hands five times, and the comments are already demanding receipts

TLDR: In just six months, the top AI chatbot title changed hands repeatedly and coding helpers became useful enough for everyday work. Commenters were equal parts excited, skeptical, and hilarious — demanding proof, joking about pelican training data, and treating model updates like a dramatic breakup saga.

The big headline from Simon Willison’s whirlwind PyCon recap is simple: the “best” chatbot crown kept bouncing around like reality TV drama, switching between major AI companies five times in just one month. But the part that really got people buzzing wasn’t just who won the beauty pageant for smartest bot; it was the feeling that something suddenly got real. One commenter basically slammed the table and said enough with the hype, “show me the money baby”: if these tools are really improving, people want clear dates, clear proof, and actual gains in daily life.

And then there’s the community’s favorite running joke: the now-famous pelican-on-a-bicycle test. Simon used it to compare image-making skills, saying no sane AI company would ever train for something that weird. The comments immediately turned that into a meme. One person joked that some poor human artist is probably drawing bicycle-riding pelicans right now for a giant AI lab. Another fired back that Simon’s blog is popular enough that the labs probably would train for it now. In other words: the community is half impressed, half suspicious, and fully ready to clown on the whole industry.

Then came the emotional rollercoaster review of the last few months: one commenter described model updates like a messy romance — “January Claude was euphoric,” “February Gemini cooked,” then “April the big bad nerf.” That mood says everything. People aren’t just using these tools anymore; they’re developing opinions, grudges, favorites, and betrayal arcs. The tech story is fast, but the comment-section soap opera is even faster.

Key Points

  • The article is based on Simon Willison’s annotated slides for a five-minute lightning talk at PyCon US 2026 covering the prior six months in LLMs.
  • It identifies November 2025 as an inflection point, with the perceived top model changing repeatedly among Claude Sonnet 4.5, GPT-5.1, Gemini 3, GPT-5.1 Codex Max, and Claude Opus 4.5.
  • The article says the most important November development was that coding agents became reliable enough for daily use.
  • OpenAI and Anthropic are described as having improved coding performance through Reinforcement Learning from Verifiable Rewards combined with Codex and Claude Code.
  • A repository first committed in late November as Warelay evolved into OpenClaw by February, helping popularize a category of personal AI assistants referred to as 'Claws.'

Hottest takes

“show me the money baby” — iekekke
“some human artist is being tasked with drawing illustrations of pelicans riding bicycles” — zarzavat
“April the big bad nerf” — throwaway2027