November 6, 2025
Spreadsheet wars ignite
Show HN: TabPFN-2.5 – SOTA foundation model for tabular data
Spreadsheet AI just leveled up — fans cheer while skeptics ask if it beats AutoML
TLDR: TabPFN-2.5 claims a bigger, faster, and more accurate “spreadsheet AI” that matches long, tuned AutoML runs while offering a speedy distilled version for production. Commenters cheered the end of painful tweaking, debated the AutoML comparisons, and pressed on how text is handled locally versus via the API; the stakes are real-world usability.
TabPFN-2.5 landed and the crowd went loud. The devs claim this “spreadsheet AI” handles way bigger datasets (think up to 50k rows, 2k columns) and beats the usual tree-based tools while matching a heavyweight AutoML system that takes hours to tune. Translation: less fiddling, more accurate predictions, faster to deploy. One commenter summed up the vibe with a simple “Good stuff!” while another shouted what many feel: “tabular data is still underrated!”
But it wouldn’t be Hacker News without a scuffle. A veteran pointed out that with the old go-to, you “need to spend a lot of time feature engineering,” hinting that TabPFN’s promise is skipping that grind. Immediately, the comparison wars started: “how does it compare to automl tools?” chimed in one skeptic, as others poked at the fine print. The spiciest thread? Text handling. A reader flagged the FAQ: locally, text is treated like categories; via the API, it gets “semantic meaning.” Cue raised eyebrows about convenience vs. control. Meanwhile, memes flew: “RIP feature engineering,” “Tree bois vs Transformers,” and “Is this AutoML’s final boss?” Love it or side-eye it, the takeaways are clear: bigger, faster, and a new distillation trick that turns the model into a tiny, speedy version you can actually ship — and the peanut gallery is here for the runtime receipts.
Key Points
- TabPFN-2.5 scales to datasets of up to 50,000 samples and 2,000 features, handling 20× more data cells than TabPFNv2.
- On industry-standard benchmarks, it outperforms tuned tree-based models (XGBoost, CatBoost) and matches the accuracy of AutoGluon 1.4.
- TabPFN-2.5 introduces a distillation engine that converts the model into a compact MLP or tree ensemble with much lower latency (the second sketch after this list illustrates the idea).
- Tabular foundation models perform training-free inference via in-context learning and are meta-trained for strong calibration (the first sketch below shows what that looks like from the caller's side).
- The evolution from TabPFNv1 to TabPFNv2 expanded capabilities from small, numeric-only data to practical use with heterogeneous real-world data.
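To make the in-context-learning point concrete, here is a minimal sketch of training-free inference from the caller's side. It assumes the scikit-learn-style `TabPFNClassifier` interface that earlier `tabpfn` releases exposed; the exact TabPFN-2.5 API may differ, and the dataset choice is purely illustrative.

```python
# Minimal sketch of training-free inference with a TabPFN-style model.
# Assumes the sklearn-compatible interface of earlier `tabpfn` releases;
# the TabPFN-2.5 API may differ.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

from tabpfn import TabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "fit" performs no gradient updates: the training set is stored and fed
# to the transformer as context at prediction time (in-context learning),
# which is why there is no hyperparameter-tuning loop.
clf = TabPFNClassifier()
clf.fit(X_train, y_train)

proba = clf.predict_proba(X_test)[:, 1]
print(f"ROC AUC: {roc_auc_score(y_test, proba):.3f}")
```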
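The post names a distillation engine but not its API, so the following is only a generic knowledge-distillation sketch of the idea: train a small student (here an sklearn `MLPRegressor`, standing in for the compact MLP mentioned above) on the foundation model's soft targets, then serve the student alone for low latency. The `distill_to_mlp` helper is hypothetical.

```python
# Generic knowledge-distillation sketch; TabPFN-2.5's actual distillation
# engine is not documented in the post, so this only illustrates the idea.
from sklearn.neural_network import MLPRegressor

def distill_to_mlp(teacher, X_unlabeled):
    """Fit a small MLP student on the teacher's soft targets (hypothetical helper).

    The student regresses the teacher's class-1 probability, so at serving
    time only the tiny MLP runs: no transformer in the loop, low latency.
    """
    soft_targets = teacher.predict_proba(X_unlabeled)[:, 1]
    student = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500)
    student.fit(X_unlabeled, soft_targets)
    return student

# Usage, reusing `clf` from the previous sketch:
#   student = distill_to_mlp(clf, X_train)
#   hard_preds = (student.predict(X_test) > 0.5).astype(int)
```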