November 29, 2025
Bananas for PDFs
Show HN: Nano PDF – A CLI Tool to Edit PDFs with Gemini's Nano Banana
Talk to your PDFs; devs cheer while wallets wince
TLDR: Nano PDF edits presentation PDFs via simple prompts using Google’s Gemini “Nano Banana,” then restores searchable text. Commenters are thrilled but demand video proof and worry about paid API costs (~$0.15 per image), debating whether the time saved outweighs the bill.
Hacker News just peeled a banana-shaped surprise: Nano PDF, a command-line tool that lets you talk to your slides. Say “change the chart to a bar graph,” and Google’s Gemini 3 Pro Image model (nicknamed “Nano Banana”) edits the page, then the tool stitches it back into your file. Because each page is turned into a picture for editing, the tool re-adds searchable text with OCR (optical character recognition) so your words don’t vanish. The crowd went bananas: lxe shouted, “This is nuts and I absolutely love this,” while others marveled at the wizardry of turning a PDF into a picture, editing it, then turning it back. Fans also loved the parallel multi-page editing and new slides that auto-match your style.
But then the wallet wars started. sultson warned the magic isn’t free: Gemini’s paid tier means “roughly $0.15 per image,” so batch edits could add up fast. The demo police swooped in: treetalker begged for “clearer examples,” and mentalgear demanded an animated screengrab like oryx. Meanwhile, itsmevictor pitched a red‑pen future—LLMs (large language models) auto-finding typos and literally underlining them in red. The vibe? Half giddy, half cautious. Nano PDF looks like the banana‑split for slide pain, but the community wants receipts, pricing sanity, and proof it won’t mush your PDFs.
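To put the "adds up fast" worry into numbers, here is a minimal back-of-the-envelope sketch. It assumes sultson's rough figure of $0.15 per generated image, which is a commenter's estimate rather than official pricing, and it assumes one image per edited page per attempt. The function name `estimate_cost` and the `attempts_per_page` parameter are hypothetical and are not part of Nano PDF.

```python
from decimal import Decimal

# Assumption: the commenter's rough estimate, not an official Gemini price.
COST_PER_IMAGE_USD = Decimal("0.15")


def estimate_cost(pages: int, attempts_per_page: int = 1,
                  cost_per_image: Decimal = COST_PER_IMAGE_USD) -> Decimal:
    """Estimate the bill, assuming one generated image per page per attempt."""
    if pages < 0 or attempts_per_page < 1:
        raise ValueError("pages must be >= 0 and attempts_per_page >= 1")
    return cost_per_image * pages * attempts_per_page


if __name__ == "__main__":
    # A single slide, a short deck, and a long deck, each edited once.
    for pages in (1, 10, 40):
        print(f"{pages:>3} pages: ${estimate_cost(pages)}")
    # Retrying a prompt multiplies the cost: 10 pages, 3 attempts each.
    print(f"10 pages x 3 attempts: ${estimate_cost(10, attempts_per_page=3)}")
```

Under these assumptions a one-off edit costs pocket change, while a 40-page deck comes to about $6 per pass, and every retry of a prompt multiplies that.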
Key Points
- Nano PDF is a CLI tool that edits and adds slides to PDFs using natural-language prompts with Google’s Gemini 3 Pro Image model.
- The workflow renders PDF pages to images (Poppler), applies AI edits, restores searchable text via Tesseract OCR, and stitches pages back into the PDF.
- It supports parallel processing of multiple pages and configurable image resolution (4K/2K/1K) to balance quality and cost.
- Installation is via pip; running it requires a paid Google Gemini API key with billing enabled via Google Cloud, because free-tier keys don’t support image generation.
- Options include using full document context, selecting style reference pages, setting output names, and enabling/disabling Google Search.
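The render, edit, OCR, and stitch workflow from the key points can be sketched roughly as follows. This is an illustrative skeleton, not Nano PDF's actual code: the function `edit_pdf_pages` and the three stage callables are hypothetical, pages are represented as plain strings, and the stand-in stages only tag those strings where a real tool would call Poppler, the Gemini API, and Tesseract.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def edit_pdf_pages(
    pages: List[str],
    prompts: Dict[int, str],
    render: Callable[[str], str],
    ai_edit: Callable[[str, str], str],
    add_text_layer: Callable[[str], str],
    max_workers: int = 4,
) -> List[str]:
    """Return the page list with every prompted page replaced by its edited version."""

    def process(index: int) -> str:
        prompt = prompts.get(index)
        if prompt is None:
            return pages[index]          # pages without a prompt pass through unchanged
        image = render(pages[index])     # step 1: PDF page -> image
        edited = ai_edit(image, prompt)  # step 2: apply the natural-language edit
        return add_text_layer(edited)    # step 3: OCR to restore searchable text

    # Pages are independent, so they can be processed in parallel;
    # pool.map returns results in page order, which is the "stitch" step here.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process, range(len(pages))))


# Hypothetical stand-in stages that only label their input.
def fake_render(page: str) -> str:
    return f"image({page})"


def fake_ai_edit(image: str, prompt: str) -> str:
    return f"edited({image}, {prompt!r})"


def fake_ocr(image: str) -> str:
    return f"searchable({image})"


result = edit_pdf_pages(
    ["p1", "p2", "p3"],
    {1: "change the chart to a bar graph"},  # zero-based index: edits the second page
    fake_render, fake_ai_edit, fake_ocr,
)
print(result)
```

Passing the stages in as callables keeps the skeleton runnable without any API key, and it makes the cost structure visible: only pages with a prompt reach the paid `ai_edit` step.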