May 9, 2026
Search got eyes, comments got claws
Gemini API File Search is now multimodal
Google says its search can now “see” images, but commenters want basic search fixed first
TLDR: Google’s Gemini search tool can now understand both images and text, sort files better, and show the exact page an answer came from. Commenters were less dazzled than irritated, with one demanding basic search fixes first and another using the moment to push a privacy-friendly rival.
Google just gave its Gemini file search a glow-up: apps can now look through pictures and documents together, sort them with custom labels, and even point to the exact page where an answer came from. In plain English, that means a business could finally find that one image with the right vibe or prove exactly where a quote came from in a giant report. It’s a very practical update, and yet the real fireworks were in the comments, where the community instantly turned this product launch into a therapy session.
The loudest reaction was basically: cool feature, but can we fix the basics first? One commenter was openly annoyed that Google’s own AI Studio search still only checks conversation titles, not what’s inside them, and complained that even browser search shortcuts are acting flaky. Ouch. That gave the whole announcement a familiar tech-launch energy: shiny new superpowers on stage, frustrated users in the back yelling, “Can it just do normal search?”
Then came the classic comment-section plot twist: the rival pitch. Another user swooped in to say this is exactly why people want options beyond big cloud services, plugging a local alternative that promises privacy, speed, and freedom from subscriptions. The vibe was part product feedback, part startup infomercial, and part “we did it before Google” chest-thumping. So while Google was selling a smarter digital filing cabinet, the crowd was busy debating trust, privacy, and whether the real innovation is simply making search not feel broken. Peak internet.
Key Points
- Google expanded the Gemini API File Search tool to support multimodal retrieval over text and images.
- The tool now supports custom metadata so developers can filter unstructured data with key-value labels at query time.
- Google introduced page-level citations that connect model responses to the source page in indexed documents.
- The multimodal search capability is powered by the Gemini Embedding 2 model.
- The article includes a Python example showing how to create a multimodal file store with the Gemini API client.
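To make the "custom labels" and "exact page" ideas concrete, here is a toy, in-memory sketch in plain Python. It is not the Gemini API and not Google's example: the `Chunk` class, the `search` and `cite` helpers, and the sample file names are all hypothetical, and a naive word-overlap score stands in for the embedding model that a real system would use. It only illustrates the general pattern of filtering by key-value metadata at query time and carrying a page number along with each result.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One indexed piece of a document (hypothetical structure)."""
    doc: str                 # source file name
    page: int                # page the text came from
    text: str                # extracted text
    metadata: dict = field(default_factory=dict)  # custom key-value labels


def search(chunks, query, metadata_filter=None):
    """Return chunks that pass the metadata filter, ranked by word overlap.

    Word overlap is a stand-in for embedding similarity (an assumption
    made only to keep this sketch self-contained).
    """
    metadata_filter = metadata_filter or {}
    query_words = set(query.lower().split())
    scored = []
    for chunk in chunks:
        # Skip chunks whose labels do not match every requested key-value pair.
        if any(chunk.metadata.get(k) != v for k, v in metadata_filter.items()):
            continue
        score = len(query_words & set(chunk.text.lower().split()))
        if score > 0:
            scored.append((score, chunk))
    # Highest score first; the sort is stable, so ties keep index order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored]


def cite(chunk):
    """Format a page-level citation for a result."""
    return f"{chunk.doc}, page {chunk.page}"


# Made-up sample index for illustration only.
INDEX = [
    Chunk("annual_report.pdf", 12, "Revenue grew in the cloud segment",
          {"department": "finance"}),
    Chunk("annual_report.pdf", 47, "Cloud outages were reduced this year",
          {"department": "engineering"}),
    Chunk("brand_guide.pdf", 3, "Logo usage and color palette",
          {"department": "design"}),
]

hits = search(INDEX, "cloud revenue", metadata_filter={"department": "finance"})
for hit in hits:
    print(cite(hit))
```

Dropping the `metadata_filter` argument would also surface the engineering chunk on page 47, which is the point of the labels: the filter narrows the candidate set before ranking, and the page number travels with each hit so an answer can point back to where it came from.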