February 16, 2026

A million tokens, a million takes

Qwen3.5: Towards Native Multimodal Agents

Big AI drops: locals cheer ‘laptop-ready’, others cry paywall and eye strain

TLDR: Qwen3.5 opens a huge model with image+text skills and an efficient design, sparking excitement about fast local use. Commenters split over a hosted-only 1M memory window, question the “15k environments” claim, and roast the light-grey website, while GGUF packs show up within hours to make it laptop-friendly.

Qwen3.5 just landed with open weights for a mega model (397 billion total parameters, only 17 billion “awake” per token), promising native image+text skills and agent smarts across 201 languages. The crowd? Loud. One camp is buzzing that this thing might actually run fast on a MacBook. Cue the hype line: “Sonnet-level, local, and fast.” Another camp is side-eyeing the fine print: the 1M context (how much the AI can remember in one go) appears reserved for the hosted Qwen3.5-Plus, while the open model lists a smaller 262,144-token window. Paywall vibes, anyone?

Instant tinkering erupted: a community dev dropped GGUF packs (smaller files to run locally) and a how-to guide within hours—see this GGUF pack and the guide. Meanwhile, skeptics poked at the blog’s claim of training on “15k RL environments” (RL = reinforcement learning, think digital practice worlds): “Name them!” became the refrain. And the biggest villain of the launch? Not GPUs—the website’s light-grey text. Users joked they needed sunglasses to read the benchmarks.
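
For the tinkerers: loading one of those GGUF packs is close to a one-liner. Here is a minimal sketch using llama-cpp-python; the filename and settings are placeholders we made up for illustration, so check the community guide above for the actual repo and quant names:

```python
# Minimal local-inference sketch with llama-cpp-python (pip install llama-cpp-python).
# The model filename below is hypothetical, not the real community release.
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen3.5-397B-A17B-Q4_K_M.gguf",  # placeholder quant filename
    n_ctx=32768,       # context tokens to allocate; more context, more RAM
    n_gpu_layers=-1,   # offload everything to GPU/Metal when available
)

resp = llm.create_chat_completion(
    messages=[{"role": "user", "content": "One-line summary of the Qwen3.5 release?"}],
    max_tokens=64,
)
print(resp["choices"][0]["message"]["content"])
```

One caveat for the “laptop-ready” cheer: even quantized, a 397B-parameter file is enormous on disk; the speed excitement rests on only ~17B parameters being active at once, which helps per-token compute more than storage.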

So yes, impressive charts and a clever “many experts, one brain” design. But the drama is the real show: locals racing to make it run on laptops, doubters questioning the 1M memory promise, and everyone roasting the UI while refreshing for more benchmarks.

Key Points

  • Qwen released the Qwen3.5 series and open weights for Qwen3.5-397B-A17B, a native vision-language model.
  • The model uses a hybrid architecture combining linear attention (via Gated Delta Networks) with sparse mixture-of-experts.
  • Despite 397B total parameters, only 17B are activated per forward pass, improving inference efficiency (see the toy sketch after this list).
  • Language and dialect support expanded from 119 to 201.
  • Qwen3.5-Plus, hosted on Alibaba Cloud Model Studio, offers a default 1M context window, built-in tools, and adaptive tool use.
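
To make the “397B total, 17B active” idea concrete, here is a toy sparse mixture-of-experts layer in PyTorch. This is a generic illustration of top-k expert routing, not Qwen’s actual architecture or code; the dimensions, expert count, and top_k are invented for the example:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoE(nn.Module):
    """Toy sparse MoE: all experts exist in memory, only top-k run per token."""
    def __init__(self, d_model=64, n_experts=16, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                        # x: (n_tokens, d_model)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # renormalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):           # dispatch: only chosen experts fire
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

moe = ToyMoE()
tokens = torch.randn(8, 64)
print(moe(tokens).shape)   # torch.Size([8, 64]): same shape, 2 of 16 experts used
```

The ratio is the whole trick: roughly 17/397 ≈ 4% of the weights do work on any given token, which is why a model this large can still be cheap per token at inference.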

Hottest takes

"The OSS version seems to have has 262144 context len, I guess for the 1M they'll ask u to use ya..." — ggcr
"They mention they used 15k environments." — mynti
"a Sonnet 4.5 level model that runs local and fast" — bertili
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.