Show HN: Gemma Gem – AI model embedded in a browser – no API keys, no cloud

Your browser gets an AI roommate — fans cheer, security folks sweat

TLDR: An on-device browser extension puts an AI helper in your tabs to read pages and even click buttons, no cloud needed. Commenters are split between loving the privacy and warning about security risks, with many pushing for a safer SDK or a native integration along the lines of Chrome’s Prompt API.

Gemma Gem wants to move into your browser and start clicking your buttons, literally. It’s an on-device AI assistant that reads pages, fills forms, runs page scripts, and answers questions, all without sending data to the cloud.

Privacy lovers swooned, but the comments quickly turned into a tug-of-war between “this rules” and “this terrifies me.” One camp cheered the no-keys, no-cloud setup and begged for an SDK so apps can tap a local, private helper. Another camp slammed the idea of letting a tiny model control real pages, calling it a security minefield. The loudest eyebrow raise: giving an AI “hands” (page clicks and JavaScript) feels like inviting a helpful gremlin into your tabs.

Enter the plot twist: devs pointed out Chrome’s own Prompt API trial, hinting at a future where this is built-in, not bolted on. That sparked a side feud, extension agent vs. background daemon: do you want your AI tied to a flaky tab, or running safely in the background? Meanwhile, spectators laughed that they’re “impressed but also hiding the ‘Buy Now’ button,” and joked about toggling the extension’s “Thinking” switch like it’s giving the bot coffee. The vibe: equal parts wow, whoa, and where’s the off switch.

Key Points

  • Gemma Gem is a Chrome extension that runs Google’s Gemma 4 model entirely on-device via WebGPU, requiring no API keys or cloud.
  • The assistant can read webpages, click elements, fill forms, run JavaScript, and answer questions about any visited site.
  • Setup uses pnpm to install and build, then the extension is loaded from chrome://extensions in developer mode.
  • The architecture splits responsibilities among an offscreen document (model + agent), a service worker (routing, screenshots, JavaScript execution), and a content script (UI + DOM tools).
  • Debugging is supported with detailed logs accessible via Chrome’s inspect tools, with offscreen logs showing model and tool execution details.
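
To make the three-way split above concrete, here is a minimal sketch of how a tool call from the agent might be routed to the context that owns it. This is an illustration under stated assumptions, not the extension's actual code: the tool names, the `TOOL_OWNER` table, and `routeToolCall` are all hypothetical, and plain function stubs stand in for real extension message passing.

```typescript
// Hypothetical tool names; the real extension's identifiers may differ.
type ToolName =
  | "read_page"
  | "click_element"
  | "fill_form"
  | "take_screenshot"
  | "run_javascript";

type Context = "content-script" | "service-worker";

interface ToolCall {
  tool: ToolName;
  args: Record<string, unknown>;
}

// Which context executes each tool, following the split in the key points:
// DOM tools in the content script, screenshots and JS in the service worker.
const TOOL_OWNER: Record<ToolName, Context> = {
  read_page: "content-script",
  click_element: "content-script",
  fill_form: "content-script",
  take_screenshot: "service-worker",
  run_javascript: "service-worker",
};

type Handler = (args: Record<string, unknown>) => string;

// Look up the owning context and dispatch the call to its handler.
function routeToolCall(
  call: ToolCall,
  handlers: Record<Context, Handler>,
): { handledBy: Context; result: string } {
  const handledBy = TOOL_OWNER[call.tool];
  return { handledBy, result: handlers[handledBy](call.args) };
}

// Stub handlers standing in for the real contexts (assumption for the demo).
const stubs: Record<Context, Handler> = {
  "content-script": (args) => `dom:${JSON.stringify(args)}`,
  "service-worker": (args) => `sw:${JSON.stringify(args)}`,
};

const clickResult = routeToolCall(
  { tool: "click_element", args: { selector: "#buy" } },
  stubs,
);
const shotResult = routeToolCall({ tool: "take_screenshot", args: {} }, stubs);
```

Keeping the ownership table in one place makes it easy to see, and to audit, which tools touch the live page, which is the part commenters were most nervous about.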

Hottest takes

"Not sure if I actually want this" — montroser
"full JS execution privileges on a live page is a bit sketchy" — veunes
"would love to see someone build it as some kind of an SDK" — emregucerr