April 4, 2026

From lab bench to comment trench

Training mRNA Language Models Across 25 Species for $165

For $165, AI reads “genetic text” in 25 species—cue hype, doom, and confusion

TLDR: An open team trained a budget AI to read “genetic text” across 25 species and shared the code, claiming big performance for cheap. Comments split between hype about industry‑changing tools, doom jokes about runaway bio‑tech, and confused devs asking what real‑world use this actually unlocks—and why it matters now.

A tiny price tag, a big flex: an open team says they trained AI to “read and write” genetic-style text across 25 species for just $165 and a weekend of compute. Their top model beat a rival called ModernBERT, and they even teased a species-aware system you can actually run. The crowd reaction? Pure chaos. One camp is screaming sci‑fi, with a commenter deadpanning “gray goo of the future,” as if this is the trailer for a biotech apocalypse. Another camp is pure hype: someone spotted a teaser called “CodonJEPA” and declared it’s going to “break the whole industry,” like the next iPhone for biology. And then there’s the giant middle: curious devs asking, “Cool… but what would I even do with this?”

In plain English: this is like building a spell‑checker for the genetic code that cells read, which could help scientists design genetic messages that cells translate more efficiently. The team claims it’s cheaper and faster than you’d think, and has published a full write‑up with runnable code. But the comments quickly turned into a cage match—doom jokers vs. lab‑coat optimists vs. practical folks who want real use cases. One skeptic wondered why these specialized models work here when we still struggle to build good ones for healthcare and economics. Verdict: an impressive demo meets a community split between “game‑changer,” “god no,” and “tell me what button to press.”
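To make the “spell‑checker for genetic code” idea concrete, here is a toy sketch of the simplest form of codon optimization: back‑translating a protein by always picking the codon a host organism uses most often. This is a hypothetical illustration only; the codon table below is a made‑up fragment, not real usage data, and the actual project uses a learned language model rather than a lookup table.

```python
# Toy codon optimization: for each amino acid, emit the synonymous codon
# the host uses most often. The PREFERRED table is illustrative, not real
# codon-usage data for any organism.
PREFERRED = {
    "M": "ATG",  # methionine (start)
    "K": "AAA",  # lysine
    "L": "CTG",  # leucine
    "E": "GAA",  # glutamate
}

def optimize(protein: str) -> str:
    """Back-translate a protein string into a codon-'optimized' DNA sequence."""
    return "".join(PREFERRED[aa] for aa in protein)

print(optimize("MKLE"))  # ATGAAACTGGAA
```

A real optimizer weighs full codon-usage statistics (or, as here, a trained model’s predictions) instead of a single preferred codon per amino acid, but the input/output shape is the same: protein in, nucleotide sequence out.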

Key Points

  • An end-to-end protein AI pipeline was built for structure prediction, sequence design, and codon optimization.
  • CodonRoBERTa-large-v2 achieved perplexity 4.10 and Spearman CAI correlation 0.40, surpassing ModernBERT.
  • Models were scaled to 25 species and four production models were trained in 55 GPU-hours.
  • Total reported compute cost was $165 for training across species.
  • A species-conditioned system was developed and released with complete results, architecture details, and runnable code.
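For readers wondering what “perplexity 4.10” means: perplexity is the exponential of the average negative log‑likelihood the model assigns to each token, i.e. roughly how many equally likely choices the model is hedging between per position. A minimal sketch, assuming only a list of per‑token probabilities (nothing about the actual model):

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the mean negative log-likelihood per token."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# If a model assigns probability 0.25 to every token, it is as uncertain
# as a uniform guess over 4 options, so perplexity is exactly 4.
print(perplexity([0.25, 0.25, 0.25, 0.25]))  # 4.0
```

By that yardstick, a codon model at perplexity 4.10 is choosing among roughly four plausible codons per position on average (there are 61 sense codons in total), which is why the reported number reads as a meaningful, if not earth‑shattering, fit.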

Hottest takes

“gray goo of the future” — HocusLocus
“JEPA is going to break the whole industry :D” — khalic
“What makes these domain models work when we don’t have good ones for health care...?” — simianwords
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.