ChatGPT and Bard are cool, but I have to manually feed them transcripts generated by Whisper to get summaries.
Furthermore, since the transcripts are often longer than the maximum character limit, I have to split them into parts and add extra prompts in between while copying and pasting each part.
Since these recordings are 10–15 years old, the audio quality isn't the best, but I think it's sufficient to generate transcripts and detect speech. If not, I might need an additional "audio cleaning" step as well.
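For reference, the manual splitting could be automated with something like the rough sketch below. The function name `split_transcript` and the `max_chars` default of 8000 are placeholders chosen for illustration, not the actual limit of any particular chatbot.

```python
def split_transcript(text: str, max_chars: int = 8000) -> list[str]:
    """Split text into chunks of at most max_chars, breaking only on whitespace.

    A single word longer than max_chars is kept whole, so that one chunk
    may exceed the limit.
    """
    chunks: list[str] = []
    current: list[str] = []
    length = 0
    for word in text.split():
        # Count the joining space only when the chunk already has words.
        extra = len(word) + (1 if current else 0)
        if current and length + extra > max_chars:
            chunks.append(" ".join(current))
            current, length = [word], len(word)
        else:
            current.append(word)
            length += extra
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk could then be summarized on its own, and the partial summaries merged in a final pass.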
I don't mind paying, and I'm above average in technical ability, so if anyone has any suggestions, I'd love to hear them.
Here's what the workflow would look like:
INPUT:
I will upload a folder containing 100+ MP3 files of podcasts with below-average audio quality.
OUTPUT:
I would like to get a Google Doc or a Text file with 1-page summaries of the most important points in bullet-point format corresponding to each episode.
Each page should be separated by some sort of divider, and the header should contain the filename for reference.
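To make the INPUT/OUTPUT spec concrete, here is a minimal sketch of the assembly step only. It assumes a hypothetical `summarize` callable that takes an MP3 path and returns a list of bullet points (in practice it would wrap transcription plus summarization). The `DIVIDER` string and the name `build_document` are made up for illustration.

```python
from pathlib import Path
from typing import Callable

DIVIDER = "=" * 60  # arbitrary page separator; any recognizable marker works


def build_document(folder: str, summarize: Callable[[Path], list[str]]) -> str:
    """Return one text document: a filename header plus bullets per MP3."""
    pages = []
    # Sort so the episodes appear in a stable, filename-based order.
    for mp3 in sorted(Path(folder).glob("*.mp3")):
        bullets = "\n".join(f"- {point}" for point in summarize(mp3))
        pages.append(f"{mp3.name}\n\n{bullets}")
    return f"\n\n{DIVIDER}\n\n".join(pages) + "\n"
```

The returned string could be written to a text file with `Path("summaries.txt").write_text(...)` or pasted into a Google Doc.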
Ideally, there should be an existing Jupyter Notebook I could throw in Google Colab and do all of the above in a plug-and-play manner, but if not, I'd love to hear your thoughts.
Any tips?
Thanks!