I have LOTS of recordings of vocalists from my music project and I’m interested in making voice models using these recordings to create harmonies and fix recording errors. What’s the best way I can go about this?

I really like the SpongeBob AI stuff made with RVC-2, but I've only used it with the funny voice models and haven't tried making my own. I want to experiment with this, but I haven't looked into it yet because I'm wondering if there's something better out there for what I'm trying to do.

I like RVC because I can sing my parts and convert them to any other voice, which is exactly what I'd like to do (no text-to-voice stuff).

Also, I know the training data for a lot of these voice models comes from the TV show and other clean recordings that are already compressed and equalized properly. However, I'd like to train on raw, uncompressed WAV files that generally have a lot of headroom and dynamic range (though the levels vary a lot from file to file). It's OK if the output ends up sounding similarly raw as a result, because I want to apply compression and EQ AFTER the fact anyway. But if this would hurt training, I'd be willing to go through all these voice recordings and process them for loudness and clarity beforehand so the model does better.
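
To make the "process them for loudness" part concrete, here is a minimal sketch of the kind of pass I mean, assuming the audio is already loaded as float samples between -1.0 and 1.0. It only does plain RMS gain matching with a peak safety check. The function names, the -20 dBFS target, and the 0.99 peak ceiling are placeholders I made up for illustration, not anything RVC requires or recommends.

```python
import math


def rms_dbfs(samples):
    """Return the RMS level of float samples (-1.0..1.0) in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # pure silence has no measurable level
    return 10.0 * math.log10(mean_square)


def normalize_rms(samples, target_dbfs=-20.0, peak_ceiling=0.99):
    """Scale samples so their RMS reaches target_dbfs without letting peaks clip.

    target_dbfs and peak_ceiling are hypothetical defaults, chosen only for this sketch.
    """
    level = rms_dbfs(samples)
    if level == float("-inf"):
        return list(samples)  # nothing to scale
    gain = 10.0 ** ((target_dbfs - level) / 20.0)
    peak = max(abs(s) for s in samples)
    # If the requested gain would push the loudest sample past the ceiling,
    # back the gain off so the peak lands exactly on the ceiling instead.
    if peak * gain > peak_ceiling:
        gain = peak_ceiling / peak
    return [s * gain for s in samples]


if __name__ == "__main__":
    # One second of a quiet 440 Hz sine at 44.1 kHz, standing in for a raw vocal take.
    quiet_take = [0.1 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
    matched = normalize_rms(quiet_take)
    print(f"before: {rms_dbfs(quiet_take):.2f} dBFS, after: {rms_dbfs(matched):.2f} dBFS")
```

This is just level matching, so it leaves the dynamic range inside each take alone, which is what I'd want if compression happens after conversion anyway.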

Anyway, any guidance would be greatly appreciated because I'm new to AI. I have basic dev experience (no AI stuff) and I'm mostly skilled in music production, but I would love to try to have a tool like this in my arsenal. If there's anywhere else I can post about this I'd like to know too. Thanks!

submitted by /u/Dr_lawlz