Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7

Pelican draw‑off: laptop underdog outsketches AI giant—and the comments squawk

TL;DR: A laptop‑run Qwen 3.6 drew a better “pelican on a bicycle” SVG than Claude Opus 4.7, sparking a hilarious but pointed debate: local‑model fans cheer open models winning quirky tasks, while skeptics say Opus still dominates coding and realism. It matters because it shows benchmarks are messy, and open, local AI is catching up in surprising ways.

A goofy “pelican riding a bicycle” art test just turned into a full‑blown drama: a locally run, roughly 21GB “quantized” (compressed) version of Alibaba’s Qwen 3.6 model spat out a cleaner SVG pelican, plus a slick flamingo on a unicycle, than Anthropic’s brand‑new Claude Opus 4.7. The author swears this is a joke benchmark, but even they admit the sunglasses comment in Qwen’s SVG made them suspect the model may have seen the benchmark during training, and the internet did the rest.

Cue the squawking. Team Opus rushed in: ericpauley insisted the Opus flamingo “actually sits on the pedals and seat” with realistic bike parts, saying Qwen’s art breaks physics and may be overfit to pelicans. Team Qwen flexed the home‑rig cred: comandillos said Qwen 35B has been “impressively good” on their Mac, especially for tool use and agents (think: the model running little tasks on its own). Meanwhile, the pragmatists showed up with receipts. jbellis dropped coding stats showing Opus crushed difficult programming problems while Qwen barely improved, arguing a cute bird doesn’t make a smarter bot. And the comic relief? 19qUq begged for a new benchmark: “MechaStalin on a tricycle,” please.

Bottom line: one silly drawing test just exposed a real split—local, open models are getting shockingly capable for niche tasks, but big, expensive models still rule in heavy lifting. Also, the pelican meme may need new wheels.

Key Points

  • Two newly released models—Qwen3.6-35B-A3B and Claude Opus 4.7—were compared on SVG prompts for a pelican on a bicycle and a flamingo on a unicycle.
  • A 20.9GB Unsloth‑quantized GGUF of Qwen3.6-35B-A3B ran locally on a MacBook Pro M5 via LM Studio and, per the author, outperformed Opus on these tasks.
  • Claude Opus 4.7, including a run with thinking_level: max, did not match Qwen’s SVG outputs in this informal test.
  • The author notes the “pelican benchmark” is humorous and non-rigorous, though it had loosely correlated with general model usefulness in the past.
  • The result is not taken as evidence of overall superiority; it shows Qwen’s local, quantized model can be better for these specific SVG prompts than Opus 4.7.

Hottest takes

"Opus flamingo is actually on the pedals and seat with functional spokes and beak." — ericpauley
"not at all in the class of qwen 3.5 27b dense (26 solved) let alone opus (95/98 solved, for 4.6)" — jbellis
"How about switching to MechaStalin on a tricycle?" — 19qUq
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.