December 6, 2025
Preprints go pop
HTML as an Accessible Format for Papers
arXiv puts papers on the web—cheers, side‑eye, and AI conspiracy vibes
TLDR: arXiv is offering papers as HTML for better accessibility and mobile reading, and plans to expand it across millions of papers. Commenters applaud the move but argue it’s not new, while hot takes say the real motive is making research easier for AI tools—sparking a function‑over‑form debate.
The internet’s nerdiest library, arXiv, is serving research papers as HTML, not just dusty PDFs—and the comment section is a circus. Accessibility advocates are thrilled: one user echoed arXiv’s urgent message that getting readable, screen‑reader‑friendly pages out now matters more than perfect polish. arXiv says it’ll backfill over 2 million papers, accept that some won’t convert, and begs readers to report issues—just don’t nitpick if HTML doesn’t look exactly like the PDF. The mood: function over form, with a dash of “finally!”
Then the plot twist: a sharp‑eyed commenter pointed out this isn’t brand new, reminding everyone that arXiv’s HTML format launched back in 2023, receipts included. Cue debate over motivation. One hot take claims it’s partly for AI bots: PDFs are clunky for large language models, and the “paywalled Adobe ecosystem” got roasted for being worse than newer AI tools. Meanwhile, supportive voices called the TeX‑to‑HTML challenge “Good work!” and casual readers chimed in with “nice find.” Jokes flew about PDFs being “digital stone tablets” and HTML finally letting research breathe on phones. Verdict from the crowd: a big accessibility win, with side‑eye for timing and whispers of AI strategy.
Key Points
- arXiv launched an experimental HTML format for papers alongside existing PDFs.
- HTML links appear on abstract pages, and authors can preview HTML during submission.
- arXiv will gradually backfill HTML for over 2 million papers, with some conversions failing.
- Conversion from TeX/LaTeX to HTML is automated, rapid, and may produce rendering errors.
- Users can report issues via built-in tools and shortcuts; authors should follow LaTeX best practices.