March 12, 2026
Intern or impostor?
Show HN: Understudy – Teach a desktop agent by demonstrating a task once
Your computer’s new intern watches once, then works—Linux fans are salty
TLDR: Understudy is a Mac desktop “intern” that learns a task from one demo and runs it across apps. The crowd split: Linux users grumble about Mac-only, skeptics doubt LLM smarts, and one asks if replies are AI-made. If it works, boring chores could vanish.
Understudy dropped as the “desktop intern” that watches you do a task once and then does it across your apps, browser, and command line. No fancy integrations: it just learns from your moves. It’s currently strongest on macOS, with Layers 1–2 of its five-layer roadmap live, and aims to grow into a proactive helper. The crowd instantly split: some cheered the show-and-do demo, others rolled their eyes. One skeptic snapped, “2026 and we still pretend not to understand how LLMs work” (LLM = large language model), calling the whole premise wishful thinking. The Linux camp stormed in with the classic chant: Linux is underserved, macOS is overserved, why is everything Mac-only?
Cue the maker jumping onstage: “It’s fully agentic, not a dumb replay,” explained bayes-song, saying the bot chooses different routes (clicks, browser, shell) and replans when things fail—less brute force, more brains. Fans like jedreckoning cheered the demo: “cool idea.” But the hottest twist? A commenter asked if the dev’s replies were written by a human or secretly posted by an agent. Yes, the “desktop intern” might already be doing PR. Memes flew—“the intern’s about to unionize”—and the debate raged on Hacker News. Bold promise, spicy thread, popcorn secured. For now, it’s Mac-first; cross-platform hopes linger.
Key Points
- Understudy is a teachable desktop agent that learns tasks from a single user demonstration and operates across GUI, browser, shell, and file system in a local runtime.
- The product follows a five-layer progression from native operation to proactive autonomy; Layers 1–2 are implemented, Layers 3–4 are partial, and Layer 5 is a long-term goal.
- Layer 1 capabilities on macOS include GUI control, browser automation via Playwright and a Chrome extension, shell access via bash, web retrieval, persistent semantic memory, messaging across eight platforms, scheduling with cron/timers, and subagents.
- A planner selects the best execution route per step, allowing a single task to switch between routes within one session.
- The demo runs on macOS using GPT-5.4 via Codex; the video is sped up, with the full version on Google Drive, and GUI grounding uses a dual-model, HiDPI-aware approach.
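
For the curious, here is a rough idea of what "picks a route per step and replans when things fail" could look like. This is a toy sketch in Python, not Understudy's actual code: the `Step` class, the `run_task` function, and the route names ("gui", "browser", "shell") are all made up for illustration, and the real planner presumably leans on a model rather than a fixed preference list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One step of a task (hypothetical structure, not Understudy's)."""
    description: str
    # Routes this step could use, in the planner's preferred order.
    candidate_routes: List[str]


def run_task(steps: List[Step],
             executors: Dict[str, Callable[[str], bool]]) -> List[str]:
    """Run each step on the first route that succeeds.

    Returns the route used for every step, so a single task can mix
    routes within one session.
    """
    used = []
    for step in steps:
        for route in step.candidate_routes:
            executor = executors.get(route)
            if executor is None:
                continue  # route not available on this machine
            try:
                ok = executor(step.description)
            except Exception:
                ok = False  # treat a crash as a failed attempt
            if ok:
                used.append(route)
                break  # step done, move on to the next one
        else:
            # Every candidate route failed for this step.
            raise RuntimeError(f"no route succeeded for step: {step.description}")
    return used


def demo() -> List[str]:
    # Toy executors: each one "succeeds" only on steps it can handle.
    executors = {
        "gui": lambda d: "click" in d,
        "browser": lambda d: "page" in d,
        "shell": lambda d: True,
    }
    steps = [
        Step("open the report page", ["browser", "gui"]),
        Step("rename the downloaded file", ["gui", "shell"]),
    ]
    return run_task(steps, executors)
```

In this toy run, the first step goes through the browser and the second falls back from the GUI to the shell, so `demo()` returns `["browser", "shell"]`. That fallback is the part the maker is pointing at when he says it is not a dumb replay.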