Nicholas Carlini – Black-hat LLMs [video]

AI gone rogue? Commenters blame RAM chaos and “good guys” gone bad

TLDR: Anthropic’s Nicholas Carlini warns AI tools can help bad actors automate attacks, raising serious safety concerns. In the comments, one buzzy post blames soaring RAM prices on secretive AI operators and sparks a fight over closed, monitored systems versus open models with fewer guardrails—aka who controls the future of AI.

Nicholas Carlini of Anthropic took the stage at [un]prompted 2026 to warn that “black-hat LLMs”—AI tools used by bad actors—can now automate attacks once only humans could pull off. But the real fireworks erupted in the comments, where one sharply worded post set the tone: it’s not just hackers we should worry about; it’s the stampede to run giant AIs everywhere. One user argued RAM prices are spiking because “not so known players” quietly built the muscle to run heavy models—and then hinted at a plot twist: “the good ones turned bad” when their bets didn’t pay off. Drama much?

That same take poked the bear on the hottest fault line: closed vs. open AI. The commenter claimed the big labs (Anthropic, Google, OpenAI) charge per use and can “inject censoring” or monitor suspicious activity, while open models lack those guardrails. Translation for non-nerds: some folks want safer, chaperoned AIs; others want the keys to the engine—no nanny, no meter running. The video says AI can supercharge attacks; the comments clap back with a broader conspiracy vibe about who is running these models, why our memory sticks cost a fortune, and whether the “good guys” stayed good. If Carlini brought the caution tape, the thread brought the popcorn.

Key Points

  • Nicholas Carlini, a Research Scientist at Anthropic, presents a talk titled “Black-hat LLMs.”
  • The talk is part of the [un]prompted 2026 event and is published on the unprompted channel.
  • The description asserts that LLMs can automate attacks that previously only human adversaries could carry out.
  • The video runtime is 40 minutes and 32 seconds.
  • The content centers on malicious or adversarial uses of large language models.

Hottest takes

“RAM prices skyrocketed because there is really a lot of not so known players with infrastructure to run heavy language models” — gmuslera
“the good ones turned bad because their speculation didn't worked out so well” — gmuslera
“Anthropic, Google, OpenAI, etc charge by their use, could have some censoring injected, can monitor suspicious activity” — gmuslera
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.