I have a very large collection of PDFs that is practically impossible to catalog. I would like to set an AI loose on the entire collection and then query it about specific keywords, topics, numbers, and ideas, and have it find all the relevant documents for me to read myself.
Wouldn't mind training an AI specifically on that narrow dataset too, if that's possible.
The documents generally range from 1 to 40 pages long.
So much has changed recently that it's difficult for me to understand what the best tool for this would be, or whether I'm still stuck with uploading this massive collection online somewhere for GPT-4 or similar to analyze.
*I have a 16 GB RTX 3060, so I have some local capability, and I would pay for a better card if there is a good solution that requires it. Chat With RTX supposedly misses quite a lot of data in the middle of documents, so that is a last resort.