Google releases Gemma 4 open models

Tiny, open, and offline—devs cheer, nitpick, and race to benchmark

TLDR: Google’s Gemma 4 is open-source, runs offline on phones and small devices, and even listens to voice—drawing cheers for its permissive license and lean speed. The community is hyped but competitive, with benchmarkers sharpening knives and debates brewing over how it stacks up to past Gemmas and rivals like Qwen.

Google just dropped Gemma 4, a family of open AI models that can run offline on phones and tiny computers with voice and vision built in, and the comment section went feral. The vibes: half victory parade, half drag race. One camp is screaming “finally!” over the Apache 2.0 license and the return of base (non-instruction-tuned) models for custom fine-tuning. Another camp is already timing laps: devs like danielhanchen rolled out quantized versions and swear they “work really well,” while minimaxir claims the small E4B flavor beats the old 27B model in every benchmark at a fraction of the parameter count.

The plot twist? Benchmark beef. Veteran tester jwr is itching to throw Gemma 4 at a spam filter and reminds everyone that while Gemma 3 was strong, it got eclipsed—and Qwen (another popular model family) “always had more variance,” stirring a friendly rivalry. Privacy hawks love the local voice input for translation apps, and tinkerers are giddy about turning a gaming PC into a “local-first AI server.” There’s nitpicking about model sizes (E2B/E4B for mobile, a 31B beast for “agent” tasks, a 26B mixture-in-between), and yes, jokes that your Raspberry Pi just became a mini coworker. Google’s “enterprise-grade security” line drew nods, but the crowd’s real heartbeat is simple: fast, open, and pocket-sized—now prove it on the leaderboard.

Key Points

  • Gemma 4 supports autonomous agents with native function calling.
  • Models provide audio and visual understanding for multimodal applications.
  • They can run offline with near-zero latency on edge devices like phones, Raspberry Pi, and Jetson Nano.
  • Optimized for consumer GPUs to enable local-first AI servers and advanced reasoning for IDEs and coding assistants.
  • Google says Gemma 4 follows its infrastructure security protocols, which it pitches as enterprise-grade security and reliability.

Hottest takes

"beats the old 27B in every benchmark at a fraction of parameters." — minimaxir
"Best thing is that this is Apache 2.0" — NitpickLawyer
"qwen models always had more variance." — jwr