Please find a way to add speaker diarization, along with a way to remember speakers so they can be matched across recordings. You could do it with pyannote, which can produce a vector embedding of each speaker that can be compared between audio samples, but that suggestion is a year old now, so I’m sure there are better options!
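For reference, a minimal sketch of that pyannote approach (diarize, then extract a per-speaker embedding that can be compared across samples). The checkpoint names, the Hugging Face token, and the distance threshold below are assumptions to adapt, not a tested setup:

```python
# Sketch: diarize a file, then pull one embedding per detected speaker
# so voices can be matched across recordings.
from pyannote.audio import Inference, Model, Pipeline
from pyannote.core import Segment
from scipy.spatial.distance import cosine

HF_TOKEN = "hf_..."  # placeholder Hugging Face access token

# Step 1: diarization -- who spoke when.
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1", use_auth_token=HF_TOKEN
)
diarization = pipeline("call.wav")

# Step 2: an embedding model that maps an audio span to a fixed-size vector.
embedder = Inference(
    Model.from_pretrained("pyannote/embedding", use_auth_token=HF_TOKEN),
    window="whole",
)

# Keep the longest turn per speaker; very short turns embed poorly.
longest = {}
for turn, _, speaker in diarization.itertracks(yield_label=True):
    if speaker not in longest or turn.duration > longest[speaker].duration:
        longest[speaker] = turn

embeddings = {
    speaker: embedder.crop("call.wav", Segment(turn.start, turn.end))
    for speaker, turn in longest.items()
}

# Step 3: compare embeddings, here or against vectors stored from earlier
# recordings; a small cosine distance suggests the same voice.
# The 0.5 threshold is a guess to tune on your own data.
def same_speaker(vec_a, vec_b, threshold=0.5):
    return cosine(vec_a, vec_b) < threshold
```

Persisting those per-speaker vectors somewhere (keyed by however you name the speakers) is what would give you the "remember the speakers" part.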
yeah that is on the roadmap!
I’ve done something similar recently, using speaker diarization to handle recorded calls where two or more people share a single laptop.
Ultimately, I chose a cloud-based GPU setup, since the highest-performing diarization models need a GPU to run at a reasonable speed. Happy to share more if you’re going that route.
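If you’re going the GPU route, the framework-level part with a pyannote-style pipeline is just moving it onto the device; a hedged sketch, assuming pyannote.audio 3.x (the checkpoint name here is a placeholder, not a recommendation):

```python
import torch
from pyannote.audio import Pipeline

# Load the diarization pipeline, then move it to the GPU if one is present.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1", use_auth_token="hf_..."  # placeholder
)
pipeline.to(device)
diarization = pipeline("call.wav")  # runs on the GPU when available
```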
What model did you use for diarization?