Show HN: Multimodal perception system for real-time conversation

It reads your vibe in real time — commenters are hyped and horrified

TL;DR: Tavus’s Raven-1 claims to read tone and facial cues in real time to “understand” what you mean. Commenters are split between excitement over the slick demo and worry that it becomes emotion-scoring for hiring, with a side of retro Mac jokes fueling the spectacle. Either way, it’s a big deal for how machines judge humans.

Tavus just dropped Raven-1, a demo of a “vibe-reading” AI that doesn’t just hear your words; it watches your face, clocking tone, hesitation, and expression in real time to guess what you actually mean. The Show HN crowd lit up instantly: “the demo is wild… kudos,” cheered one. Another simply gasped “Holy,” which kind of says it all. The early vibe? Jaw-dropped awe at how fast and fluid it looks.

Then came the HR panic. One top comment worried that this means companies won’t just outsource tasks to machines; they’ll outsource empathy too. Imagine job interviews where the webcam scores your nerves and “compassion.” The same commenter even conceded it might reduce bias, while feeling queasy about a model judging stress and awkwardness. The split in the thread: is this fairer hiring or a “vibe police” nightmare? Meanwhile, a curious “Wonder how it works?” set off questions about microphones, webcams, and privacy, as in: where do your feelings go?

Amid the drama, there were memes and nostalgia. One eagle-eyed commenter spotted old Macs in the backdrop and a soundtrack channeling “Chariots of Fire,” turning the demo into a retro-tech montage. It’s Black Mirror energy with ‘80s Apple vibes — and the comments are having a field day.

Key Points

  • Tavus launched Raven-1, a multimodal perception system.
  • Raven-1 interprets both spoken content and delivery (how it’s said).
  • The system analyzes visual cues such as facial expression and appearance while speaking.
  • It captures tone, expression, hesitation, and context in real time.
  • A demo shows Raven-1 inferring user intent during live conversation.
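Tavus hasn’t published Raven-1’s architecture, so as a purely hypothetical sketch, here is what the pipeline in the points above might look like at its simplest: per-frame verbal and nonverbal scores (all names, weights, and thresholds are invented for illustration) fused into a single inferred-intent label, where delivery can override the words themselves.

```python
from dataclasses import dataclass

# Hypothetical sketch only: Raven-1's real model is not public.
# Each Frame bundles one moment of a conversation: what was said
# (text) plus how it was delivered (tone, hesitation, expression),
# each pre-scored on a 0..1 scale by upstream perception models.
@dataclass
class Frame:
    text_positive: float        # positivity of the spoken words
    tone_positive: float        # positivity of vocal tone
    hesitation: float           # pauses / filler words detected
    expression_positive: float  # facial-expression positivity

def infer_intent(frame: Frame) -> str:
    """Fuse verbal and nonverbal cues into a coarse intent label."""
    # Weight delivery as heavily as the words: the demo's pitch is
    # that *how* you say it matters as much as *what* you say.
    delivery = (frame.tone_positive + frame.expression_positive) / 2
    confidence = 1.0 - frame.hesitation
    score = (0.5 * frame.text_positive + 0.5 * delivery) * confidence
    if score > 0.6:
        return "engaged"
    if score < 0.3:
        return "uncomfortable"
    return "uncertain"

# Positive words delivered with hesitation and a flat expression:
# the nonverbal channel drags the inferred intent down.
mixed = Frame(text_positive=0.9, tone_positive=0.3,
              hesitation=0.6, expression_positive=0.2)
print(infer_intent(mixed))  # → uncomfortable
```

The point of the toy example is the mismatch case: a words-only system would read the last frame as positive, while a multimodal one flags it as discomfort, which is exactly the capability (and the hiring worry) the thread is debating.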

Hottest takes

"the demo is wild... kudos" — jesserowe
"Holy" — Johnny_Bonk
"great, now not only will e.g., HR/screening/hiring hand-off ... they'll now outsource the things that require any sort of emotional understanding ... to a model too" — ycombiredd
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.