Vintage Large Language Models

Time-traveling chatbots spark cheers, historian side-eye, and a 'Judge Bot' controversy

TLDR: A researcher wants “vintage” AI trained only on old data to test forecasting and re‑invent known ideas. The crowd is hyped but divided: fans love the time‑capsule experiment, while skeptics warn about biased archives and a wild “Judge Bot” proposal, making this both exciting and controversial.

The talk proposes “vintage” AI—chatbots trained only on old data—so we can send them “back” to the Romans, the Tudors, or even 2019. Cue the comments going full time-machine. One camp is pure hype: mountainriver basically throws confetti, while others dream up experiments with a 2019 model forecasting the pandemic. The creator’s plan sounds wild yet practical: use backdated models to test predictions and even re‑invent modern breakthroughs. You can watch the video, skim the tweet thread, and peek at the slides.

But the historian squad rolled in with receipts. abeppu warns the “old” data we digitized might be biased—more famous letters, fewer everyday voices—so your Roman bot might sound like Caesar’s PR team. nxobject backs the idea but demands rigor: use social‑science methods so we don’t turn history into fan fiction. Meanwhile, ideashower wants to weaponize time‑boxed models to map how ugly biases (racism, sexism, imperialism) evolved—spicy, but useful.

Then came the courtroom chaos: digdugdirk pitches “LLMs as the Judge” over historical cases, making legal purists clutch their pearls. The memes? People joked about Tudor Twitter and a Gladiator bot that refuses to predict anything without a thumbs‑up. The crowd is split: genius time capsule vs. history cosplay with data leaks. Drama level: vintage and viral.

Key Points

  • A vintage LLM is trained solely on data up to a specified historical cutoff date.
  • Key challenges include sufficient pre-cutoff data and minimizing post-cutoff information leakage.
  • Multimodal inputs (e.g., images) can be included if they depict phenomena observable in the period without introducing present-day knowledge.
  • Scientific motivations include backtesting forecasting methods using an LLM trained to 2019 to evaluate predictions for 2020–2024.
  • Vintage LLMs could be scaffolded to attempt rediscovery of known post-cutoff innovations such as the web, quantum computing, blockchains, and transformers.
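The cutoff constraint behind the first two bullets can be sketched as a simple corpus filter. This is an illustrative toy, not the talk's actual pipeline; the corpus records and field names here are assumptions:

```python
from datetime import date

# Hypothetical corpus: each record carries text plus a publication date.
CORPUS = [
    {"text": "Dial-up modem review", "published": date(1998, 5, 1)},
    {"text": "Pandemic retrospective", "published": date(2021, 3, 15)},
    {"text": "Transformer architectures explained", "published": date(2018, 6, 12)},
]

def filter_to_cutoff(corpus, cutoff):
    """Keep only documents published on or before the cutoff date.

    This enforces the 'vintage' constraint: nothing the model trains on
    may postdate the chosen historical moment. Real leakage control is
    harder (reprints, later annotations, OCR metadata), but date
    filtering is the first gate.
    """
    return [doc for doc in corpus if doc["published"] <= cutoff]

# A "2019 model" would train only on what survives this filter.
vintage_2019 = filter_to_cutoff(CORPUS, date(2019, 12, 31))
```

In practice, as the key points note, the hard part isn't the filter itself but sourcing enough pre-cutoff data and catching post-cutoff information that hides inside nominally old documents.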

Hottest takes

"Very cool! I’ve been wanting to do this do a long time!" — mountainriver
"subject to strong selection bias" — abeppu
"using llms as the 'Judge'" — digdugdirk
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.