For instance, if I'm recording an interview between two people and have something like Whisper transcribing the discussion, can it split the transcript by speaker? Seems like this would be a fairly simple feature, but I'm not sure if it exists.
Doesn't have to be Whisper per se, but is there a known speech-to-text (S2T) model or solution that does this?