December 27, 2025
Loopholes, Llamas, and a $20B plot twist
Nvidia's $20B Antitrust Loophole (Not an Acquisition)
Nvidia buys brains, skips the company — commenters say it's a regulatory dodge
TLDR: Nvidia paid $20B for Groq’s tech and top execs without buying the company. Commenters say it skirts regulators, debate SRAM-only chips for fast AI replies, and meme the Grok/Groq confusion—arguing this could reshape cheap, speedy AI while dodging antitrust headaches.
Jensen Huang just dropped a $20B plot twist: Nvidia scooped up Groq’s tech and top execs but explicitly didn’t acquire the company. The comments lit up. The hottest take? User ossa-ma says this “not-an-acquisition” is a surgical workaround for CFIUS and antitrust — pointing to Groq’s Saudi contracts and calling the big price tag the cost of speed. Others cheered the business jujitsu: pay for brains, skip the baggage. Meanwhile, skeptics shouted “AI bubble!” and waved popcorn gifs.
Cue the confusion: half the thread devolved into a name war — Grok ≠ Groq — with users correcting each other like hall monitors. terabytest and danr4 became the unofficial fact-checkers of the day. Another chorus, led by LarsDu88, framed this as part of a growing trend of “non-acquisitions,” like Google rehiring Noam Shazeer or OpenAI scooping Windsurf’s talent.
For the non-nerds: Groq’s chips keep model data in fast on-chip memory (SRAM), skipping the slower external memory (DRAM/HBM) that GPUs lean on. That means snappy answers and lower energy, but the chips can’t train models and only fit smaller ones. Commenters argue that if inference (answering questions) keeps dominating and memory prices spike, this bet looks brilliant. The vibe: Nvidia just bought speed and certainty, and left the cloud baggage behind.
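To see why the SRAM bet matters, here’s a back-of-envelope sketch (toy Python with round illustrative numbers; none of these figures come from Nvidia or Groq). At batch size 1, generating each token means reading roughly the whole model’s weights, so tokens per second is capped near memory bandwidth divided by model size:

```python
# Toy bandwidth model: tokens/sec ~= memory bandwidth / model footprint.
# All numbers below are illustrative assumptions, not vendor specs.

def tokens_per_sec(params_billions: float, bytes_per_param: float, bandwidth_gb_s: float) -> float:
    """At batch size 1, each generated token requires reading every weight once,
    so throughput is roughly bandwidth divided by model size in bytes."""
    model_bytes = params_billions * 1e9 * bytes_per_param
    return (bandwidth_gb_s * 1e9) / model_bytes

# A 70B-parameter model, assumed quantized to 1 byte per weight.
hbm_gpu   = tokens_per_sec(70, 1, 3_350)   # ~3.35 TB/s, roughly HBM on a high-end GPU (assumed)
sram_rack = tokens_per_sec(70, 1, 80_000)  # tens of TB/s of aggregate on-chip SRAM (assumed)

print(f"HBM-fed GPU:   ~{hbm_gpu:.0f} tokens/sec for one user")
print(f"SRAM-fed rack: ~{sram_rack:.0f} tokens/sec for one user")
```

Real numbers land lower once compute, interconnect, and batching enter the picture, but the ratio is the point: keep the weights where the bandwidth is, and single-user latency drops.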
Key Points
- Nvidia is paying $20B for Groq’s IP and to hire its executive team but is not acquiring Groq as a company.
- The deal includes all Groq IP/patents and non-exclusive licensing of its inference technology; GroqCloud is excluded and remains independent under CFO Simon Edwards.
- Groq’s LPU architecture uses large on-chip SRAM to avoid off-chip DRAM/HBM, enabling deterministic, low-latency, energy-efficient inference.
- Reported LPU performance includes Llama 2 7B at ~750 tokens/sec, Llama 2 70B at ~300 tokens/sec, and Mixtral 8x7B at ~480 tokens/sec, with ~10x energy efficiency gains.
- Trade-offs include limited memory capacity (about 14GB of SRAM per rack) and no training capability; the target is 7B–70B inference workloads rather than very large models (rough capacity math below).
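One way to read those trade-offs is a quick capacity check. This is a hypothetical sketch using the ~14GB-per-rack figure above and an assumed 8-bit quantization; the 400B "frontier-scale" entry is our own stand-in, not a Groq target:

```python
import math

# Rough capacity check: racks needed just to hold model weights in on-chip SRAM.
SRAM_PER_RACK_GB = 14   # per-rack SRAM figure reported above
BYTES_PER_PARAM = 1     # assumes 8-bit quantized weights

def racks_for_weights(params_billions: float) -> int:
    """Racks required to fit the weights alone; ignores KV cache,
    activations, and any redundancy across chips."""
    weight_gb = params_billions * BYTES_PER_PARAM
    return math.ceil(weight_gb / SRAM_PER_RACK_GB)

for size in (7, 70, 400):
    print(f"{size}B params -> ~{racks_for_weights(size)} rack(s) just for weights")
```

Under those assumptions a 7B model fits in a single rack, a 70B model is already a multi-rack job, and a frontier-scale model would need a small data hall of SRAM before serving a single query, which is why the sweet spot above is 7B–70B inference.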