Postgres Postmaster does not scale

On-the-hour meeting rush jams logins; the crowd shouts: use a bouncer

TLDR: Recall.ai hit a quirky database bottleneck: Postgres’s single-threaded “doorman” slowed new logins during on-the-hour traffic waves. Commenters split between “throw a connection proxy at it,” “rethink the data layout with sharding,” and “add jitter so everything doesn’t start at :00,” with a few joking about replacing the doorman entirely.

Millions of meetings start on the dot, Recall.ai’s servers surge, and suddenly the Internet is arguing about the world’s grumpiest doorman: Postgres’s single-threaded “postmaster.” The company found its database’s gatekeeper choking during those on-the-hour stampedes, making new connections wait a painful 10–15 seconds. They even built a giant, synchronized test to prove it. Result: the doorman was maxing out a whole CPU core just spawning new connections. Ouch.

That’s when the comments lit up. Team Pragmatic showed up first: “This is why you put a gate in front of the gate,” said folks like vel0city, pointing to connection pools like pgbouncer and Amazon’s RDS Proxy—basically a velvet rope that stops the crush from hitting the doorman all at once. Team Big Architecture fired back with “why not split the crowd?” Atherton wondered if they’re writing to a single database and suggested sharding per customer. Meanwhile, Team Chaos Tamer dropped a simple life hack: don’t do stuff at round hours—add jitter! One commenter even linked a guide on avoiding round-hour traffic spikes.
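For the “gate in front of the gate” idea, a minimal pgbouncer setup in transaction-pooling mode might look like the sketch below. The database name, host, and pool sizes are placeholder assumptions, not Recall.ai’s actual configuration; the point is that thousands of client connections get funneled into a small, reusable pool of server connections, so the postmaster rarely has to fork a fresh backend.

```ini
[databases]
; hypothetical database name and host
appdb = host=db.internal port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; transaction pooling hands a server connection to a client
; only for the duration of a transaction, then reclaims it
pool_mode = transaction
; many clients, few actual Postgres backends
max_client_conn = 10000
default_pool_size = 50
```

Clients then connect to port 6432 instead of Postgres directly; the on-the-hour crush hits the bouncer, not the doorman.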

Then came the spice. One brave soul asked, can’t we just replace the doorman altogether? Cue veteran eye-rolls. Another chimed in with the meme-y promise that “PgDog” will fix it all, prompting equal parts curiosity and side-eye. The vibe: use a bouncer now, rethink the club layout tomorrow, and stop scheduling parties at midnight.

Key Points

  • Recall.ai experiences extreme synchronized load spikes as most meetings start on the hour, requiring immediate compute readiness.
  • They observed sporadic 10–15s delays in PostgreSQL connection setup despite normal resource metrics and successful TCP handshakes.
  • Investigation identified the PostgreSQL postmaster’s single-threaded main loop as a bottleneck under high worker churn.
  • The postmaster can saturate a CPU core, slowing backend forking, connection establishment, and parallel worker handling.
  • A production-like reproduction environment using Redis pub/sub and 3,000+ EC2 instances replicated the delay for instrumentation and analysis.
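The jitter remedy from the thread can be sketched in a few lines of Python. This is an illustrative sketch, not Recall.ai’s code; the 300-second window is an arbitrary assumption.

```python
import random

def jittered_start(scheduled_ts: float, max_jitter_s: float = 300.0) -> float:
    """Offset a scheduled timestamp by a random delay so that jobs
    all scheduled for the same instant (e.g. the top of the hour)
    do not hit the database simultaneously."""
    return scheduled_ts + random.uniform(0.0, max_jitter_s)

# Example: 1,000 jobs all scheduled for the same :00 timestamp
# end up spread across a 5-minute window instead of one spike.
top_of_hour = 1_700_000_000.0
starts = [jittered_start(top_of_hour) for _ in range(1000)]
```

The trade-off is latency: work starts up to five minutes “late,” which is why commenters pitched jitter for background load, and pooling for the meetings that genuinely must begin on the dot.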

Hottest takes

"tools like pgbouncer were designed to solve" — vel0city
"cant we replace postmaster with something better?" — vivzkestrel
"One of the many problems PgDog will solve" — levkk
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.