I’ve done something similar recently, using speaker diarization to handle situations where two or more people share a laptop on a recorded call.
Ultimately, I chose a cloud-based GPU setup, as the highest-performing diarization models required a GPU to run properly. Happy to share more if you’re going that route.
What model did you use for diarization?