Statistical Process Control in Python

Old-school stats steal the show as hot springs get smart

TLDR: A Python workshop shows how simple stats can keep Japanese hot springs on-brand by tracking temperature, pH, and sulfur. Commenters turned it into a showdown, with claims that Big Tech swapped many AI detectors for classic methods small teams can manage, while others cheered practical guides and clean design.

A chill Python workshop about tracking Japanese hot spring quality with simple stats and tidy charts suddenly boiled over into a bigger fight: old-school statistics vs. flashy AI. The tutorial walks through using pandas, plotnine, and SciPy with a Kagoshima onsen dataset, and even shares helper tools via GitHub. But the comments turned up the heat like an “Extra Hot” soak.

One veteran claimed they “replaced thousands” of deep-learning (AI) detectors at a Big Tech giant with classic statistical process control—arguing fewer knobs to tweak, less babysitting, and a tiny team can run it all. Cue the steam: commenters rallied around the idea that simplicity wins, especially when you just need to know if the water’s too hot, too cool, or too eggy-sulfur. Another pro chimed in from the trenches of clinical research, saying traditional stats are still “bread and butter” when data is small and messy—rare disease studies aren’t exactly TikTok-scale, after all.

Not all was drama. One reader gushed, “love the look and feel of your page!!”, while another dropped a peace-offering link to a beginner-friendly practitioner’s guide for anyone new to the discipline. Meanwhile, the community had jokes: extra hot springs = extra hot takes, and the sulfur “rotten egg” smell got compared to an AI model meltdown. Verdict? The onsen got a glow-up, but the real heat was the comments insisting that simple, readable tools beat black-box hype—at least when your business is keeping bathwater on-brand.

Key Points

  • The workshop teaches Statistical Process Control (SPC) in Python using pandas, plotnine, and SciPy.
  • Custom functions are sourced from a GitHub repository and added to the Python path for import.
  • Onsen quality benchmarks include temperature categories, pH classifications, and sulfur thresholds cited from Serbulea and Payyappallimana (2012).
  • A Kagoshima Prefecture onsen dataset includes monthly samples over 15 months with 20 random observations per month for temperature, pH, and sulfur.
  • Descriptive statistics (mean and standard deviation) are introduced as foundational measures for process evaluation.
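To make the last two points concrete, here is a minimal sketch of per-month descriptive statistics on a dataset shaped like the one described (15 months, 20 observations per month). The data is synthetic, and the column names (`month`, `temp`), the helper `describe_by_subgroup`, and the simulated mean and spread are all assumptions for illustration; none of them come from the workshop or its GitHub helpers. The three-sigma limits at the end follow the usual Shewhart convention and may not match how the workshop computes them.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the onsen data: 15 months x 20 readings each.
# The values are simulated, NOT the Kagoshima dataset.
rng = np.random.default_rng(42)
months = np.repeat(np.arange(1, 16), 20)
water = pd.DataFrame({
    "month": months,
    "temp": rng.normal(loc=44.0, scale=2.0, size=months.size),
})


def describe_by_subgroup(df, group_col, value_col):
    """Return mean, sample SD, and count of value_col per subgroup (hypothetical helper)."""
    return (
        df.groupby(group_col)[value_col]
        .agg(xbar="mean", s="std", n="count")
        .reset_index()
    )


stats = describe_by_subgroup(water, "month", "temp")

# Grand mean across subgroups, and a pooled within-subgroup SD
# (square root of the average subgroup variance; valid for equal subgroup sizes).
grand_mean = stats["xbar"].mean()
sigma_within = np.sqrt((stats["s"] ** 2).mean())

# Conventional three-sigma limits for the subgroup means.
se = sigma_within / np.sqrt(stats["n"])
stats["lower"] = grand_mean - 3 * se
stats["upper"] = grand_mean + 3 * se

print(stats.head())
```

A month whose `xbar` lands outside `lower` and `upper` would be the kind of signal worth a closer look. The same helper could be pointed at pH or sulfur columns if the data has them.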

Hottest takes

We have successfully replaced thousands of complicated deep net time series based anomaly detectors at a FANG with statistical (nonparametric, semiparametric) process control ones. — srean
Classical stats is still bread and butter for lot of smallish dataset in clinical datasets. — kasperset
On a sidenote, love the look and feel of your page!! — tamagotchiguy