November 2, 2025
MoE money, MoE problems
Tongyi DeepResearch – open-source 30B MoE model that rivals OpenAI DeepResearch
Open‑source ‘DeepResearch’ drops: old news, copycat, or DIY playground?
TLDR: Alibaba released Tongyi DeepResearch, an open-source agent claiming parity with OpenAI’s Deep Research, complete with code and weights. Comments split between “old news,” “just a Qwen MoE fine-tune,” and “Deep Research is a workflow,” while tinkerers asked about DIY hosting — signaling real interest in open alternatives.
Alibaba’s Tongyi DeepResearch landed with a big boast: an open‑source web agent that claims parity with OpenAI’s “Deep Research” on tough tests, posting strong scores on academic benchmarks like Humanity’s Last Exam and browsing challenges like BrowseComp. The links are live (GitHub, Hugging Face, and a slick showcase), and the methodology reads like a full DIY cookbook: synthetic training data, supervised fine‑tuning (teaching by examples), and reinforcement learning (trial‑and‑error practice). But the comment section immediately turned into the main event. “Old news,” shrugged one commenter, noting the weights had been out for weeks. Others fired back with semantics: is “Deep Research” a product, or just a workflow you can run on whatever model?
Then came the identity check: “It’s a Qwen 3 Mixture‑of‑Experts fine‑tune,” said a skeptic, calling it fancy dressing on known tech. Meanwhile, weekend engineers asked the real Saturday question: can you self‑host on a dusty 2080 Ti and still have fun? Cue memes about the “8x Blackwell Lamborghini” you don’t own and the joy of constraints. Big‑picture thinkers wondered whether this sparks an explosion of purpose‑built AI or just gets absorbed by the next mega model. Verdict from the crowd? Intrigued, confused, slightly petty. In other words: classic internet.
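For the 2080 Ti crowd, here is a hedged sketch of what self-hosting could look like using Hugging Face transformers with 4-bit quantization. The repo ID is an assumption based on the announcement’s Hugging Face link, and the rough math is unkind: ~30B parameters at 4 bits is about 15 GB of weights, more than an 11 GB card holds, so `device_map="auto"` spills layers into CPU RAM. Slow, but it runs.

```python
# Hedged self-hosting sketch: needs a recent `transformers` (with Qwen3-MoE
# support), plus `accelerate` and `bitsandbytes` installed. The model ID is
# an assumption from the announcement; check the Hugging Face page before use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B"  # assumed repo name

# Rough VRAM math: ~30B params * 0.5 bytes (4-bit) ~= 15 GB of weights alone,
# which exceeds a 2080 Ti's 11 GB, so device_map="auto" offloads the overflow
# layers to CPU RAM at the cost of throughput.
quant = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=quant,
    device_map="auto",  # splits layers across GPU and CPU as memory allows
)

prompt = "Plan a multi-step web research strategy for: history of MoE models."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

In practice, a GGUF quantization served via llama.cpp is probably the friendlier route on an older card, but the sketch above shows the moving parts.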
Key Points
- Tongyi DeepResearch is introduced as a fully open-source web agent claiming performance comparable to OpenAI’s DeepResearch.
- Reported benchmark scores: 32.9 on Humanity’s Last Exam (HLE), 43.4 on BrowseComp, 46.7 on BrowseComp-ZH, and 75 on xbench-DeepSearch.
- The training pipeline uses synthetic data across agentic continual pre-training (CPT), supervised fine-tuning (SFT), and a full-stack reinforcement-learning (RL) stage with algorithmic innovations and automated data curation.
- Inference supports a vanilla ReAct mode (no prompt engineering) and an advanced Heavy Mode that scales test-time compute to maximize reasoning and planning; hedged sketches of both follow this list.
- The data strategy includes AgentFounder, entity-anchored knowledge memory, multi-style Q&A generation, and action synthesis, enabling offline exploration of the reasoning-action space without costly human annotation.
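For anyone wondering what “vanilla ReAct mode (no prompt engineering)” actually means, it is the classic Reason-then-Act loop: the model thinks in text, emits an action, the harness executes it, and the observation goes back into the context. Below is a minimal, framework-free sketch; the `llm()` stub and the single `search` tool are placeholders for illustration, not Tongyi’s actual interface.

```python
# Minimal ReAct (Reason + Act) loop. Illustrative only: `llm()` is a stub you
# would wire to a real chat endpoint, and `search` stands in for a real tool.
import re

def llm(prompt: str) -> str:
    """Placeholder model call. The stub finishes immediately so the loop can
    be exercised end to end; replace with a real completion API."""
    return "Thought: I have enough to answer.\nAction: Finish[stub answer]"

def search(query: str) -> str:
    """Placeholder web-search tool; swap in a real search API."""
    return f"(stub results for: {query})"

TOOLS = {"search": search}

def react_agent(question: str, max_steps: int = 8) -> str:
    # The transcript accumulates Thought/Action/Observation turns. That is
    # the whole trick of ReAct: reasoning and tool use interleave as text.
    transcript = (
        "Answer the question. Use this format:\n"
        "Thought: <reasoning>\n"
        "Action: <tool>[<input>] or Finish[<answer>]\n\n"
        f"Question: {question}\n"
    )
    for _ in range(max_steps):
        step = llm(transcript)
        transcript += step + "\n"
        match = re.search(r"Action:\s*(\w+)\[(.*)\]", step)
        if not match:
            continue  # model produced no action; let it keep reasoning
        tool, arg = match.group(1), match.group(2)
        if tool == "Finish":
            return arg
        if tool in TOOLS:
            # Feed the tool result back as an Observation for the next turn.
            transcript += f"Observation: {TOOLS[tool](arg)}\n"
    return "(no answer within the step budget)"

if __name__ == "__main__":
    print(react_agent("Who proposed the ReAct prompting pattern?"))
```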
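Heavy Mode is described only as test-time scaling, but the generic recipe for that is spending more inference compute per question: run several independent rollouts, then synthesize one answer from the candidates. Here is a hedged sketch of that pattern, reusing `react_agent` and `llm` from above; it illustrates the general idea, not Tongyi’s exact implementation.

```python
# Generic test-time-scaling sketch: parallel rollouts plus a synthesis pass.
# This is the common pattern, not necessarily Tongyi's Heavy Mode internals.
from concurrent.futures import ThreadPoolExecutor

def heavy_mode(question: str, n_rollouts: int = 4) -> str:
    # Independent rollouts explore different reasoning/browsing paths; with a
    # real model and nonzero temperature, each candidate can differ.
    with ThreadPoolExecutor(max_workers=n_rollouts) as pool:
        candidates = list(pool.map(lambda _: react_agent(question),
                                   range(n_rollouts)))

    # One final model call compares candidates and writes a single answer.
    # The extra rollouts are where the "scaling" in test-time scaling happens.
    numbered = "\n".join(f"Candidate {i + 1}: {c}"
                         for i, c in enumerate(candidates))
    return llm(
        f"Question: {question}\n\n{numbered}\n\n"
        "Compare the candidates and return the single best-supported answer."
    )
```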